This update is a continuation of our previous coverage of the SolarWinds supply-chain attack that was discovered by FireEye in December 2020. As of Jan. 11, 2021, new research has been published that expands the security community’s understanding of the breadth and depth of the SolarWinds attack.
Two recent developments warrant your attention:
New in-depth research from CrowdStrike provides technical analysis of the malware—dubbed "SUNSPOT" (the industry is going to run out of stellar-themed names at this rate)—that was used to insert the SUNBURST backdoor into SolarWinds Orion software builds.
On Monday, Jan. 11, 2021, CrowdStrike’s intelligence team published technical analysis on SUNSPOT, a newly identified type of malware that appears to have been used as part of the SolarWinds supply chain attack. CrowdStrike describes SUNSPOT as “a malicious tool that was deployed into the build environment to inject [the SUNBURST] backdoor into the SolarWinds Orion platform.”
While SUNSPOT infection is part of the attack chain that allows for SUNBURST backdoor compromise, SUNSPOT has distinct host indicators of attack (including executables and related files), artifacts, and TTPs (tactics, techniques, and procedures).
CrowdStrike provides a thorough breakdown of how SUNSPOT operates, including numerous indicators of compromise. Here are the critical highlights:
SUNSPOT’s on-disk executable is named taskhostsvc.exe and has an initial, likely build date of Feb. 20, 2020. It maintains persistence through a scheduled task that executes on boot and has the SeDebugPrivilege grant, which is what enables it to read the memory of other processes.
It uses this privilege to watch for MsBuild.exe (a Visual Studio development component) execution and modifies the target source code before the compiler has a chance to read it. SUNSPOT then looks for a specific Orion software source code component and replaces it with one that will inject SUNBURST during the build process. SUNSPOT also has validation checks to ensure no build errors are triggered during the build process, which helps it escape developer and other detection.
The last half of the CrowdStrike analysis has details on tactics, techniques, and procedures, along with host indicators of attack, ATT&CK framework mappings, and YARA rules specific to SUNSPOT. Relevant indicators have been incorporated into Rapid7’s SIEM, InsightIDR, and Managed Detection and Response instances and workflows.
SolarWinds has updated their blog with a reference to this new information on SUNSPOT. Because SUNSPOT, SUNBURST, and related tooling have not been definitively mapped to a known adversary, CrowdStrike has christened the actors responsible for these intrusions “StellarParticle.”
SUNBURST’s Kazuar lineage
Separately, Kaspersky Labs also published technical analysis on Monday, Jan. 11, 2020 that builds a case for a connection between the SUNBURST backdoor and another backdoor called Kazuar. Kazuar, which Palo Alto Networks’ Unit42 team first described in May of 2017 as a “multiplatform espionage backdoor with API access,” is a .NET backdoor that Kaspersky says appears to share several “unusual features” with SUNBURST. (Palo Alto linked Kazuar to the Turla APT group back in 2017, which Kaspersky says their own observations support, too.)
Shared features Kaspersky has identified so far include the use of FNV-1a hashing throughout Kazua and SUNBURST code, a similar algorithm used to generate unique victim identifiers, and customized (thought not exactly the same) implementations of a sleeping algorithm that delays between connections to a C2 server and makes network activity less obvious. Kaspersky has a full, extremely detailed list of similar and different features across both backdoors in their post.
Kaspersky does not definitively state that the two backdoors are the work of the same actor. Instead, they offer five possible explanations for the similarities they’ve identified between Kazuar and SUNBURST. The potential explanations below have been taken directly from their post:
Sunburst was developed by the same group as Kazuar.
The Sunburst developers adopted some ideas or code from Kazuar, without having a direct connection (they used Kazuar as an inspiration point).
Both groups, DarkHalo/UNC2452 and the group using Kazuar, obtained their malware from the same source.
Some of the Kazuar developers moved to another team, taking knowledge and tools with them.
The Sunburst developers introduced these subtle links as a form of false flag, in order to shift blame to another group.
As Kaspersky notes, the knowledge of a potential lineage connection to Kazaur changes little for defenders, but is worth keeping an eye on, as a confirmed connection may help those in more highly targeted sectors use previous Kazuar detection and prevention methods to enhance their response to the SolarWinds compromise.
NEVER MISS A BLOG
Get the latest stories, expertise, and news about security today.
2020 was not an easy year, but it challenged and taught us a lot. Let’s sum up the past year’s results together and make some plans for the next one.
Despite the worldwide lockdown and the entire team’s inability to work from the office, Zabbix continued to evolve and get better. The Zabbix 5.0 release was the first release to be done remotely – without the usual cake and “family” celebration on release day in our cozy office kitchen. Version 5.2 also followed remotely, with the already familiar online celebration in Zoom.
In 2020, we were going to have the biggest Zabbix Summit ever – the 10th-anniversary event. But plans changed, and instead, we held our first online Zabbix Summit. Given the unusual format of a traditional Zabbix Summit – it turned out very successful and proved the online version also could be sustainable. We are grateful to all the speakers and guests who supported us and joined the event.
We can say that 2020 was all about online events. We start a good tradition to hold Zabbix meetups online and received positive feedback from the community. Thanks to our partners’ support, online meetups have been held not only in English and Russian but also in Czech, Italian, French, Portuguese, Spanish, German, and Polish. We will continue this tradition for sure, providing our users worldwide to learn Zabbix, share their experience, and meet co-thinkers. Remember that we always record the meetups for you to have access to presentations when it is required. Recordings are available on our website’s event section under each event.
Conquering the world
One of the most important events of 2020 for Zabbix was the opening of the office in Brazil. The software and professional services are now even more accessible to Latin American users thanks to language localization and the branch office’s closeness. We are happy to be part of one of Zabbix’s most active communities worldwide and being able to cooperate on a much greater level. We have expanded our Latin American branch with new Zabbix professionals, partners, and initiatives during last year.
We are also proud of making Zabbix more reachable for different language users – with our Latin American office, we opened Zabbix Spanish and Portuguese web page and thanks to our active partners – also German site. Now we are working on more languages to make our open source solution even more open.
As for the overall results, we can share with you some statistics
Our integration team has been very active over the past year and has released many useful templates. In total, their efforts resulted in 42 new templates and 20 new Webhook integrations.
Zabbix partnership network has also grown. Zabbix Partnership program creates a worldwide network of trustful and highly skilled companies ready to provide immediate technical assistance and become an accessible intermediary between you and the Zabbix team. This network has now grown by 40+ partners worldwide, resulting in around 200 partners for now.
In the past year, we rethought the professional training program and expanded it with extra training – one-day courses for in-depth study of one specific topic. These courses differ from the rest of the program by having no requirements. You can get extra knowledge about monitoring with Zabbix without Zabbix certification. Four additional training courses are available now, but we are working on the next topics to provide you.
Note that we have already published many training sessions for this year, so now is an excellent time to choose and book the courses and timing that suit you best. Don’t forget to use the filter – you can quickly sort courses by language, level, and location (online or on-site).
Speaking of statistics for last year, thanks to the Zabbix team and partners’ efforts, there were 203 training sessions in total. Altogether, 1,214 official Zabbix certificates were awarded to the training attendees who successfully mastered the material and passed the exam.
But education doesn’t end with training. Webinars are an excellent opportunity to learn different aspects of Zabbix for free. Note that we have also added the ability to watch recorded sessions at your convenience. And we also want to remind you not to forget to use the filter to navigate between sessions. We cover many topics in many different languages, and we want to make sure you don’t miss out on learning the material. Last year we hosted 245 webinar sessions (including 55 sessions held by Zabbix partners worldwide), and we hope it was beneficial for our community.
A sneak peek to 2021
So what can we expect from Zabbix in the new year of 2021? Lots of things! We are not going to stop and continue to work hard. The Zabbix software development roadmap is available on our website. Feel free to explore the features we are working on right now! We will continue to hold meetups, covering complex technical issues. As for the Zabbix Summit 2021, we are also planning to have it online in October 2021. The first experience was a success, so why not strengthen the achieved result. And as for other news, keep an eye on our updates here on the blog, social media, and newsletter. We have an exciting year ahead of us, so let’s make the most of it!
Today I am happy to announce AWS Transfer Family now also supports file transfers to Amazon Elastic File System (EFS) file systems as well as Amazon S3. This feature enables you to easily and securely provide your business partners access to files stored in Amazon EFS file systems. With this launch, you now have the option to store the transferred files in a fully managed file system and reduce your operational burden, while preserving your existing workflows that use SFTP, FTPS, or FTP protocols.
Using Amazon EFS – Getting Started To get started in your existing Amazon EFS file system, make sure the POSIX identities you assign for your SFTP/FTPS/FTP users are owners of the files and directories you want to provide access to. You will provide access to that Amazon EFS file system through a resource-based policy. Your role also needs to establish a trust relationship. This trust relationship allows AWS Transfer Family to assume the AWS Identity and Access Management (IAM) role to access your bucket so that it can service your users’ file transfer requests.
You will also need to make sure you have created a mount target for your file system. In the example below, the home directory is owned by userid 1234 and groupid 5678.
$ mkdir home/myname
$ chown 1234:5678 home/myname
When you create a server in the AWS Transfer Family console, select Amazon EFS as your storage service in the Step 4 section Choose a domain.
When the server is enabled and in an online state, you can add users to your server. On the Servers page, select the check box of the server that you want to add a user to and choose Add user.
In the User configuration section, you can specify the username, uid (e.g. 1234), gid (e.g 5678), IAM role, and Amazon EFS file system as user’s home directory. You can optionally specify a directory within the file system which will be the user’s landing directory. You use a service-managed identity type – SSH keys. If you want to use password type, you can use a custom option with AWS Secrets Manager.
Amazon EFS uses POSIX IDs which consist of an operating system user id, group id, and secondary group id to control access to a file system. When setting up your user, you can specify the username, user’s POSIX configuration, and an IAM role to access the EFS file system. To learn more about configuring ownership of sub-directories in EFS, visit the documentation.
Once the users have been configured, you can transfer files using the AWS Transfer Family service by specifying the transfer operation in a client. When your user authenticates successfully using their file transfer client, it will be placed directly within the specified home directory, or root of the specified EFS file system.
$ sftp [email protected]
sftp> cd /fs-23456789/home/myname
sftp> ls -l
-rw-r--r-- 1 3486 1234 5678 Jan 04 14:59 my-file.txt
sftp> put my-newfile.txt
sftp> ls -l
-rw-r--r-- 1 3486 1234 5678 Jan 04 14:59 my-file.txt
-rw-r--r-- 1 1002 1234 5678 Jan 04 15:22 my-newfile.txt
Most of SFTP/FTPS/FTP commands are supported in the new EFS file system. You can refer to a list of available commands for FTP and FTPS clients in the documentation.
Supported including resolving symlinks
Supported (only file)
Supported (file or folder)
Supported (root only)
Supported (root only)
Supported (root or owner only)
Supported (non-empty folders only)
You can use Amazon CloudWatch to track your users’ activity for file creation, update, delete, read operations, and metrics for data uploaded and downloaded using your server. To learn more on how to enable CloudWatch logging, visit the documentation.
Available Now AWS Transfer Family support for Amazon EFS file systems is available in all AWS Regions where AWS Transfer Family is available. There are no additional AWS Transfer Family charges for using Amazon EFS as the storage backend. With Amazon EFS storage, you pay only for what you use. There is no need to provision storage in advance and there are no minimum commitments or up-front fees.
To learn more, take a look at the FAQs and the documentation. Please send feedback to the AWS forum for AWS Transfer Family or through your usual AWS support contacts.
Learn all the details about AWS Transfer Family to access Amazon EFS file systems and get started today.
We want to make it easier and more cost-effective for you to add maps, location awareness, and other location-based features to your web and mobile applications. Until now, doing this has been somewhat complex and expensive, and also tied you to the business and programming models of a single provider.
Introducing Amazon Location Service Today we are making Amazon Location available in preview form and you can start using it today. Priced at a fraction of common alternatives, Amazon Location Service gives you access to maps and location-based services from multiple providers on an economical, pay-as-you-go basis.
You can use Amazon Location Service to build applications that know where they are and respond accordingly. You can display maps, validate addresses, perform geocoding (turn an address into a location), track the movement of packages and devices, and much more. You can easily set up geofences and receive notifications when tracked items enter or leave a geofenced area. You can even overlay your own data on the map while retaining full control.
All About Amazon Location Let’s take a look at the types of resources that Amazon Location Service makes available to you, and then talk about how you can use them in your applications.
Maps – Amazon Location Service lets you create maps that make use of data from our partners. You can choose between maps and map styles provided by Esri and by HERE Technologies, with the potential for more maps & more styles from these and other partners in the future. After you create a map, you can retrieve a tile (at one of up to 16 zoom levels) using the GetMapTile function. You won’t do this directly, but will use Mapbox GL, Tangram, or another library instead.
Place Indexes – You can choose between indexes provided by Esri and HERE. The indexes support the SearchPlaceIndexForPosition function which returns places, such as residential addresses or points of interest (often known as POI) that are closest to the position that you supply, while also performing reverse geocoding to turn the position (a pair of coordinates) into a legible address. Indexes also support the SearchPlaceIndexForText function, which searches for addresses, businesses, and points of interest using free-form text such as an address, a name, a city, or a region.
Trackers –Trackers receive location updates from one or more devices via the BatchUpdateDevicePosition function, and can be queried for the current position (GetDevicePosition) or location history (GetDevicePositionHistory) of a device. Trackers can also be linked to Geofence Collections to implement monitoring of devices as they move in and out of geofences.
Geofence Collections – Each collection contains a list of geofences that define geographic boundaries. Here’s a geofence (created with geojson.io) that outlines a park near me:
Amazon Location in Action I can use the AWS Management Console to get started with Amazon Location and then move on to the AWS Command Line Interface (CLI) or the APIs if necessary. I open the Amazon Location Service Console, and I can either click Try it! to create a set of starter resources, or I can open up the navigation on the left and create them one-by-one. I’ll go for one-by-one, and click Maps:
Then I click Create map to proceed:
I enter a Name and a Description:
Then I choose the desired map and click Create map:
The map is created and ready to be added to my application right away:
Next, I want to track the position of devices so that I can be notified when they enter or exit a given region. I use a GeoJSON editing tool such as geojson.io to create a geofence that is built from polygons, and save (download) the resulting file:
I click Create geofence collection in the left-side navigation, and in Step 1, I add my GeoJSON file, enter a Name and Description, and click Next:
Now I enter a Name and a Description for my tracker, and click Next. It will be linked to the geofence collection that I just created:
The next step is to arrange for the tracker to send events to Amazon EventBridge so that I can monitor them in CloudWatch Logs. I leave the settings as-is, and click Next to proceed:
I review all of my choices, and click Finalize to move ahead:
The resources are created, set up, and ready to go:
I can then write code or use the CLI to update the positions of my devices:
I can write Amazon EventBridge rules that watch for the events, and use them to perform any desired processing. Events are published when a device enters or leaves a geofenced area, and look like this:
Finally, I can create and use place indexes so that I can work with geographical objects. I’ll use the CLI for a change of pace. I create the index:
$ aws location create-place-index \
--index-name MyIndex1 --data-source Here
Then I query it to find the addresses and points of interest near the location:
$ aws location search-place-index-for-position --index-name MyIndex1 \
--position "[-122.33805,47.62748]" --output json \
| jq .Results.Place.Label
"Terry Ave N, Seattle, WA 98109, United States"
"900 Westlake Ave N, Seattle, WA 98109-3523, United States"
"851 Terry Ave N, Seattle, WA 98109-4348, United States"
"860 Terry Ave N, Seattle, WA 98109-4330, United States"
"Seattle Fireboat Duwamish, 860 Terry Ave N, Seattle, WA 98109-4330, United States"
"824 Terry Ave N, Seattle, WA 98109-4330, United States"
"9th Ave N, Seattle, WA 98109, United States"
I can also do a text-based search:
$ aws location search-place-index-for-text --index-name MyIndex1 \
--text Coffee --bias-position "[-122.33805,47.62748]" \
--output json | jq .Results.Place.Label
"Mohai Cafe, 860 Terry Ave N, Seattle, WA 98109, United States"
"Starbucks, 1200 Westlake Ave N, Seattle, WA 98109, United States"
"Metropolitan Deli and Cafe, 903 Dexter Ave N, Seattle, WA 98109, United States"
"Top Pot Doughnuts, 590 Terry Ave N, Seattle, WA 98109, United States"
"Caffe Umbria, 1201 Westlake Ave N, Seattle, WA 98109, United States"
"Starbucks, 515 Westlake Ave N, Seattle, WA 98109, United States"
"Cafe 815 Mercer, 815 9th Ave N, Seattle, WA 98109, United States"
"Victrola Coffee Roasters, 500 Boren Ave N, Seattle, WA 98109, United States"
"Specialty's, 520 Terry Ave N, Seattle, WA 98109, United States"
Both of the searches have other options; read the Geocoding, Reverse Geocoding, and Search to learn more.
Things to Know Amazon Location is launching today as a preview, and you can get started with it right away. During the preview we plan to add an API for routing, and will also do our best to respond to customer feedback and feature requests as they arrive.
Pricing is based on usage, with an initial evaluation period that lasts for three months and lets you make numerous calls to the Amazon Location APIs at no charge. After the evaluation period you pay the prices listed on the Amazon Location Pricing page.
Amazon Location is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo) Regions.
Today, I’m particularly happy to announce FreeRTOS Long Term Support (LTS). FreeRTOS is an open source, real-time operating system for microcontrollers that makes small, low-power edge devices easy to program, deploy, secure, connect, and manage. LTS releases offer a more stable foundation than standard releases as manufacturers deploy and later update devices in the field. As we have planned, LTS is now included in the FreeRTOS kernel and a set of FreeRTOS libraries needed for embedded and IoT applications, and for securely connecting microcontroller-based (MCU) devices to the cloud.
Embedded developers at original equipment manufacturers (OEMs) and MCU vendors using FreeRTOS to build long-lived applications on IoT devices now get the predictability and feature stability of an LTS release without compromising access to critical security updates. The FreeRTOS 202012.00 LTS release applies to the FreeRTOS kernel, connectivity libraries (FreeRTOS+TCP, coreMQTT, coreHTTP), security library (PKCS #11 implementation), and AWS library (AWS IoT Device Shadow).
We will provide security updates and critical bug fixes for all these libraries until December 31, 2022.
Benefits of FreeRTOS LTS Embedded developers at OEMs who want to use FreeRTOS libraries for their long-lived applications want to benefit from security updates and bug fixes in the latest FreeRTOS mainline releases. Mainline releases can introduce both new features and critical fixes, which may increase time and effort for users to include only fixes.
An LTS release provides years of feature stability of included libraries. With an LTS release, any update will not change public APIs, file structure, or build processes that could require changes to your application. Security updates and critical bug fixes will be backported at least until Dec 31, 2022. LTS releases contain updates that only address critical issues including security vulnerabilities. Therefore, the integration of LTS releases is less disruptive to customers’ development and integration efforts as they approach and move into production. For MCU vendors, this means reduced effort in integrating a stable code base and faster time to market with vendors’ latest libraries.
Available Now The FreeRTOS 202012.00 LTS release is available now to download. To learn more, visit FreeRTOS LTS and the documentation. Please send us feedback on the Github repository and the forum of FreeRTOS.
I am happy to announce AWS IoT Greengrass 2.0, a new version of AWS IoT Greengrass that makes it easy for device builders to build, deploy, and manage intelligent device software. AWS IoT Greengrass 2.0 provides an open source edge runtime, a rich set of pre-built software components, tools for local software development, and new features for managing software on large fleets of devices.
AWS IoT Greengrass 2.0 edge runtime is now open source under an Apache 2.0 license, and available on Github. Access to the source code allows you to more easily integrate your applications, troubleshoot problems, and build more reliable and performant applications that use AWS IoT Greengrass.
You can add or remove pre-built software components based on your IoT use case and your device’s CPU and memory resources. For example, you can choose to include pre-built AWS IoT Greengrass components such as stream manager only when you need to process data streams with your application, or machine learning components only when you want to perform machine learning inference locally on your devices.
The AWS IoT Greengrass IoT Greengrass 2.0 includes a new command-line interface (CLI) that allows you to locally develop and debug applications on your device. In addition, there is a new local debug console that helps you visually debug applications on your device. With these new capabilities, you can rapidly develop and debug code on a test device before using the cloud to deploy to your production devices.
AWS IoT Greengrass 2.0 is also integrated with AWS IoT thing groups, enabling you to easily organize your devices in groups and manage application deployments across your devices with features to control rollout rates, timeouts, and rollbacks.
AWS IoT Greengrass 2.0 – Getting Started Device builders can use AWS IoT Greengrass 2.0 by going to the AWS IoT Greengrass console where you can find a download and install command that you run on your device. Once the installer is downloaded to the device, you can use it to install Greengrass software with all essential features, register the device as an AWS IoT Thing, and create a simple “hello world” software component in less than 10 minutes.
To get started in the AWS IoT Greengrass console, you first register a test device by clicking Set up core device. You assign the name and group of your core device. To deploy to only the core device, select No group. In the next step, install the AWS IoT Greengrass Core software in your device.
When the installer completes, you can find your device in the list of AWS IoT Greengrass Core devices on the Core devices page.
AWS IoT Greengrass components enable you to develop and deploy software to your AWS IoT Greengrass Core devices. You can write your application functionality and bundle it as a private component for deployment. AWS IoT Greengrass also provides public components, which provide pre-built software for common use cases that you can deploy to your devices as you develop your device software. When you finish developing the software for your component, you can register it with AWS IoT Greengrass. Then, you can deploy and run the component on your AWS IoT Greengrass Core devices.
To create a component, click the Create component button on the Components page. You can use a recipe or import an AWS Lambda function. The component recipe is a YAML or JSON file that defines the component’s details, dependencies, compatibility, and lifecycle. To learn about the specifications, visit the recipe reference guide.
Here is an example of a YAML recipe.
When you finish developing your component, you can add it to a deployment configuration to deploy to one or more core devices. To create a new deployment or configure the components to deploy to core devices, click the Create button on the Deployments page. You can deploy to a core device or a thing group as a target, and select the components to deploy. The deployment includes the dependencies for each component that you select.
You can edit the version and parameters of selected components and advanced settings such as the rollout configuration, which defines the rate at which the configuration deploys to the target devices; timeout configuration, which defines the duration that each device has to apply the deployment; or cancel configuration, which defines when to automatically stop the deployment.
Moving to AWS IoT Greengrass 2.0 Existing devices running AWS IoT Greengrass 1.x will continue to run without any changes. If you want to take advantage of new AWS IoT Greengrass 2.0 features, you will need to move your existing AWS IoT Greengrass 1.x devices and workloads to AWS IoT Greengrass 2.0. To learn how to do this, visit the migration guide.
After you move your 1.x applications over, you can start adding components to your applications using new version 2 features, while leaving your version 1 code as-is until you decide to update them.
AWS IoT Greengrass 2.0 Partners At launch, industry-leading partners NVIDIA and NXP have qualified a number of their devices for AWS IoT Greengrass 2.0:
Starting today, to help you evaluate, test, and develop with this new release of AWS IoT Greengrass, the first 1,000 devices in your account will not incur any AWS IoT Greengrass charges until December 31, 2021. For pricing information, check out the AWS IoT Greengrasspricing page.
Today, I am happy to announce AWS IoT Core for LoRaWAN, a new fully-managed feature that allows AWS IoT Core customers to connect and manage wireless devices that use low-power long-range wide area network (LoRaWAN) connectivity with the AWS Cloud.
Using AWS IoT Core for LoRaWAN, customers can now set up a private LoRaWAN network by connecting their own LoRaWAN devices and gateways to the AWS Cloud – without developing or operating a LoRaWAN Network Server (LNS) by themselves. The LNS is required to manage LoRaWAN devices and gateways’ connection to the cloud; gateways serve as a bridge and carry device data to and from the LNS, usually over Wi-Fi or Ethernet.
This allows customers to eliminate the undifferentiated work and operational burden of managing an LNS, and enables them to easily and quickly connect and secure LoRaWAN device fleets at scale.
Combined with the long range and deep in-building coverage provided by LoRa technology, AWS IoT Core now enables customers to accelerate IoT application development using AWS services and acting on the data generated easily from connected LoRaWAN devices.
Customers – mostly enterprises – need to develop IoT applications using devices that transmit data over long range (1-3 miles of urban coverage or up to 10 miles for line-of-sight) or through the walls and floors of buildings, for example for real-time asset tracking at airports, remote temperature monitoring in buildings, or predictive maintenance of industrial equipment. Such applications also require devices to be optimized for low-power consumption, so that batteries can last several years without replacement, thus making the implementation cost-effective. Given the extended coverage of LoRaWAN connectivity, it is attractive to enterprises for these use cases, but setting up LoRaWAN connectivity in a privately managed site requires customers to operate an LNS.
With AWS IoT Core for LoRaWAN, you can connect LoRaWAN devices and gateways to the cloud with a few simple steps in the AWS IoT Management Console, thus speeding up the network setup time, and connect off-the-shelf LoRaWAN devices, without any requirement to modify embedded software, for a plug and play experience.
AWS IoT Core for LoRaWAN – Getting Started Getting started with a LoRaWAN network setup is easy. You can find AWS IoT Core for LoRaWAN qualified gateways and developer kits from the AWS Partner Device Catalog. AWS qualified gateways and developer kits are pre-tested and come with a step by step guide from the manufacturer on how to connect it with AWS IoT Core for LoRaWAN.
With AWS IoT Core console, you can register the gateways by providing a gateway’s unique identifier (provided by the gateway vendor) and selecting LoRa frequency band. For registering devices, you can input device credentials (identifiers and security keys provided by the device vendor) on the console.
Each device has a Device Profile that specifies the device capabilities and boot parameters the LNS requires to set up LoRaWAN radio access service. Using the console, you can select a pre-populated Device Profile or create a new one.
A destination automatically routes messages from LoRaWAN devices to AWS IoT Rules Engine. Once a destination is created, you can use it to map multiple LoRaWAN devices to the same IoT rule. You can write rules using simple SQL queries, to transform and act on the device data, like converting data from proprietary binary to JSON format, raising alerts, or routing it to other AWS services like Amazon Simple Storage Service (S3). From the console, you can also query metrics for connected devices and gateways to troubleshoot connectivity issues.
Available Now AWS IoT Core for LoRaWAN is available today in US East (N. Virginia) and Europe (Ireland) Regions. With pay-as-you-go pricing and no monthly commitments, you can connect and scale LoRaWAN device fleets reliably, and build applications with AWS services quickly and efficiently. For more information, see the pricing page.
Observability is an essential aspect of running cloud infrastructure at scale. You need to know that your resources are healthy and performing as expected, and that your system is delivering the desired level of performance to your customers.
A lot of challenges arise when monitoring container-based applications. First, because container resources are transient and there are lots of metrics to watch, the monitoring data has strikingly high cardinality. In plain language this means that there are lots of unique values, which can make it harder to define a space-efficient storage model and to create queries that return meaningful results. Second, because a well-architected container-based system is composed using a large number of moving parts, ingesting, processing, and storing the monitoring data can become an infrastructure challenge of its own.
Prometheus is a leading open-source monitoring solution with an active developer and user community. It has a multi-dimensional data model that is a great fit for time series data collected from containers.
Introducing Amazon Managed Service for Prometheus (AMP) Today we are launching a preview of Amazon Managed Service for Prometheus (AMP). This fully-managed service is 100% compatible with Prometheus. It supports the same metrics, the same PromQL queries, and can also make use of the 150+ Prometheus exporters. AMP runs across multiple Availability Zones for high availability, and is powered by CNCF Cortex for horizontal scalability. AMP will easily scale to ingest, store, and query millions of time series metrics.
Getting Started with Amazon Managed Service for Prometheus (AMP) After joining the preview, I open the AMP Console, enter a name for my AMP workspace, and click Create to get started (API and CLI support is also available):
My workspace is active within a minute or so. The console provides me with the endpoints that I can use to write data to my workspace, and to issue queries:
It also provides guidance on how to configure an existing Prometheus server to send metrics to the AMP workspace:
A desire for consolidated, and simplified operational oversight isn’t limited to just cloud infrastructure. Increasingly, customers ask us for a “single pane of glass” approach for also monitoring and managing their application portfolios.
These customers tell us that detection and investigation of application issues takes additional time and effort, due to the typical use of multiple consoles, tools, and sources of information such as resource usage metrics, logs, and more, to enable their DevOps engineers to obtain context about the application issue under investigation. Here, an “application” means not just the application code but also the logical group of resources that act as a unit to host the application, along with ownership boundaries for operators, and environments such as development, staging, and production.
Today, I’m pleased to announce a new feature of AWS Systems Manager, called Application Manager. Application Manager aggregates operational information from multiple AWS services and Systems Manager capabilities into a single console, making it easier to view operational data for your applications.
A particular benefit of automated discovery is that application components and resources are automatically kept up-to-date on an ongoing basis, but you can also always revise applications as needed by adding or deleting components manually.
With applications discovered and consolidated into a single console, you can more easily diagnose operational issues and resolve them with minimal time and effort. Automated runbooks targeting an application component or resource can be run to help remediate operational issues. For any given application, you can select a resource and explore relevant details without needing to leave the console.
For example, the application can surface Amazon CloudWatch logs, operational metrics, AWS CloudTrail logs, and configuration changes, removing the need to engage with multiple tools or consoles. This means your on-call engineers can understand issues more quickly and reduce the time needed to resolve them.
Exploring an Application with Application Manager I can access Application Manager from the Systems Manager home page. Once open, I get an overview of my discovered applications and can see immediately that there are some alarms, without needing to switch context to the Amazon CloudWatch console, and some operations items (“OpsItems”) that I might need to pay attention to. I can also switch to the Applications tab to view the collections of applications, or I can click the buttons in the Applications panel for the collection I’m interested in.
In the screenshot below, I’ve navigated to a sample application and again, have indicators showing that alarms have raised. The various tabs enable me to drill into more detail to view resources used by the application, config resource and rules compliance, monitoring alarms, logs, and automation runbooks associated with the application.
Clicking on the Alarm indicator takes me into the Monitoring tab, and it shows that the ConsumedWriteCapacityUnits alarm has been raised. I can change the timescale to zero in on when the event occurred, or I can use the View recent alarms dashboard link to jump into the Amazon CloudWatch Alarms console to view more detail.
The Logs tab shows me a consolidated list of log groups for the application, and clicking a log group name takes me directly to the CloudWatch Logs where I can inspect the log streams, and take advantage of Log Insights to dive deeper by querying the log data.
OpsItems shows me operational issues associated with the resources of my application, and enables me to indicate the current status of the issue (open, in progress, resolved). Below, I am marking investigation of a stopped EC2 instance as in progress.
Finally, Runbooks shows me automation documents associated with the application and their execution status. Below, it’s showing that I ran the AWS-RestartEC2Instance automation document to restart the EC2 instance that was stopped, and I would now resolve the issue logged in the OpsItems tab.
Consolidating this information into a single console gives engineers a single starting location to monitor and investigate issues arising with their applications, and automatic discovery of applications and resources makes getting started simple. AWS Systems ManagerApplication Manager is available today, at no extra charge, in all public AWS Regions where Systems Manager is available.
Organizations, and their systems administrators, routinely face challenges in managing increasingly diverse portfolios of IT infrastructure across cloud and on-premises environments. Different tools, consoles, services, operating systems, procedures, and vendors all contribute to complicate relatively common, and related, management tasks. As workloads are modernized to adopt Linux and open-source software, those same systems administrators, who may be more familiar with GUI-based management tools from a Windows background, have to continually adapt and quickly learn new tools, approaches, and skill sets.
AWS Systems Manager is an operational hub enabling you to manage resources on AWS and on-premises. Available today, Fleet Manager is a new console based experience in Systems Manager that enables systems administrators to view and administer their fleets of managed instances from a single location, in an operating-system-agnostic manner, without needing to resort to remote connections with SSH or RDP. As described in the documentation, managed instances includes those running Windows, Linux, and macOS operating systems, in both the AWS Cloud and on-premises. Fleet Manager gives you an aggregated view of your compute instances regardless of where they exist.
All that’s needed, whether for cloud or on-premises servers, is the Systems Manager agent installed on each server to be managed, some AWS Identity and Access Management (IAM) permissions, and AWS Key Management Service (KMS) enabled for Systems Manager‘s Session Manager. This makes it an easy and cost-effective approach for remote management of servers running in multiple environments without needing to pay the licensing cost of expensive management tools you may be using today. As noted earlier, it also works with instances running macOS. With the agent software and permissions set up, Fleet Manager enables you to explore and manage your servers from a single console environment. For example, you can navigate file systems, work with the registry on Windows servers, manage users, and troubleshoot logs (including viewing Windows event logs) and monitor common performance counters without needing the Amazon CloudWatch agent to be installed.
Exploring an Instance With Fleet Manager To get started exploring my instances using Fleet Manager, I first head to the Systems Manager console. There, I select the new Fleet Manager entry on the navigation toolbar. I can also select the Managed Instances option – Fleet Manager replaces Managed Instances going forward, but the original navigation toolbar entry will be kept for backwards compatibility for a short while. But, before we go on to explore my instances, I need to take you on a brief detour.
When you select Fleet Manager, as with some other views in Systems Manager, a check is performed to verify that a role, named AmazonSSMRoleForInstancesQuickSetup, exists in your account. If you’ve used other components of Systems Manager in the past, it’s quite possible that it does. The role is used to permit Systems Manager to access your instances on your behalf and if the role exists, then you’re directed to the requested view. If however the role doesn’t exist, you’ll first be taken to the Quick Setup view. This in itself will trigger creation of the role, but you might want to explore the capabilities of Quick Setup, which you can also access any time from the navigation toolbar.
Quick Setup is a feature of Systems Manager that you can use to set up specific configuration items, such as the Systems Manager and CloudWatch agents on your instances (and keep them up-to-date), and also IAM roles permitting access to your resources for Systems Manager components. For this post, all the instances I’m going to use already have the required agent set up, including the role permissions, so I’m not going to discuss this view further but I encourage you to check it out. I also want to remind you that to take full advantage of Fleet Manager‘s capabilities you first need to have KMS encryption enabled for your instances and secondly, the role attached to your Amazon Elastic Compute Cloud (EC2) instances must have the kms:Decrypt role permission included, referencing the key you selected when you enabled KMS encryption. You can enable encryption, and select the KMS key, using the Preferences section of the Session Manager console, and of course you can set up the role permission in the IAM console.
That’s it for the diversion; if you have the role already, as I do, you’ll now be at the Managed instances list view. If you’re at Quick Setup instead, simply click the Fleet Manager navigation button once more.
The Managed instances view shows me all of my instances, in the cloud or on-premises, that I can access. Selecting an instance, in this case an EC2 Windows instance launched using AWS Elastic Beanstalk, and clicking Instance actions presents me with a menu of options. The options (less those specific to Windows) are available for my Amazon Linux instance too, and for instances running macOS I can use the View file system option.
The File system view displays a read-only view onto the file system of the selected instance. This can be particularly useful for viewing text-based log files, for example, where I can preview up to 10,000 lines of a log file and even tail it to view changes as the log updates. I used this to open and tail an IIS web server log on my Windows Server instance. Having selected the instance, I next select View file system from the Instance actions dropdown (or I can click the Instance ID to open a view onto that instance and select File system from the menu displayed on the instance view).
Having opened the file system view for my instance, I navigate to the folder on the instance containing the IIS web server logs.
Selecting a log file, I then click Actions and select Tail file. This opens a view onto the log file contents, which updates automatically as new content is written.
As I mentioned, the File system view is also accessible for macOS-based instances. For example, here is a screenshot of viewing the Applications folder on an EC2 macOS instance.
Next, let’s examine the Performance counters view, which is available for both Windows and Linux instances. This view displays CPU, memory, network traffic, and disk I/O and will be familiar to Windows users from Task Manager. The metrics shown reflect the guest OS metrics, whereas EC2 instance metrics you may be used to relate to the hypervisor. On this particular instance I’ve deployed an ASP.NET Core 5 application, which generates a varying length collection of Fibonacci numbers on page refresh. Below is a snapshot of the counters, after I’ve put the instance under a small amount of load. The view updates automatically every 5 seconds.
There are more views available than I have space for in this post. Using the Windows Registry view, I can view and edit the registry on the selected Windows instance. Windows event logs gives me access to the Application and Service logs, and common Windows logs such as System, Setup, Security, etc. With Users and groups I can manage users or groups, including assignment of users to groups (again for both Windows and Linux instances). For all views, Fleet Manager enables me to use a single and convenient console.
Getting Started AWS Systems ManagerFleet Manager is available today for use with managed instances running Windows, Linux, and macOS. Information on pricing, for this and other Systems Manager features, can be found at this page.
Because you are constantly listening to the feedback from your customer, you are iterating, innovating, and improving your applications and infrastructures. You continually modify your IT systems in the cloud. And let’s face it, changing something in a working system risks breaking things or introducing side effects that are sometimes unpredictable; it doesn’t matter how many tests you do. On the other hand, not making changes is stasis, followed by irrelevance, followed by death.
This is why organizations of all sizes and types have embraced a culture of controlling changes. Some organizations adopt change management processes such as the ones defined in ITIL v4. Some have adopted DevOps’ Continuous Deployment, or other methods. In any case, to support your change management processes, it is important to have tools.
Using Change Manager has two primary advantages. First, it can improve the safety of changes made to application configurations and infrastructures, reducing the risk of service disruptions. It makes operational changes safer by tracking that only approved changes are being implemented. Secondly, it is tightly integrated with other AWS services, such as AWS Organizations and AWS Single Sign-On, or the integration with the Systems Manager change calendar and Amazon CloudWatch alarms.
Change Manager provides accountability with a consistent way to report and audit changes made across your organization, their intent, and who approved and implemented them.
Change Manager works across AWS Regions and multiple AWS accounts. It works closely with Organizations and AWS SSO to manage changes from a central point and to deploy them in a controlled way across your global infrastructure.
Terminology You can use AWS Systems Manager Change Manager on a single AWS account, but most of the time, you will use it in a multi-account configuration.
The way you manage changes across multiple AWS accounts depends on how these accounts are linked together. Change Manager uses the relationships between your accounts defined in AWS Organizations. When using Change Manager, there are three types of accounts:
The management account – also known as the “main account” or “root account.” The management account is the root account in an AWS Organizations hierarchy. It is the management account by virtue of this fact.
The delegated administrator account – A delegated administrator account is an account that has been granted permission to manage other accounts in Organizations. In the Change Manager context, this is the account from which change requests will be initiated. You will typically log in to this account to manage templates and change requests. Using a delegated administrators account allows you to limit connections made to the root account. It also allows you to enforce a least privileges policy by using a specific subset of permissions required by the changes.
The member accounts – Member accounts are accounts that are not the management account or a delegated administrator account, but are still included in Organizations. In my mental model for Change Manager, these would be the accounts that hold the resources where changes are deployed. A delegated administrator account would initiate a change request that would impact resources in a member account. System administrators are discouraged from logging directly into these accounts.
One-Time Configuration In this scenario, I show you how to use Change Manager with multiple AWS accounts linked together with Organizations. If you are not interested in the one-time configuration, jump to the Create a Change Request section below.
There are four one-time configuration actions to take before using Change Manager: one action in the root account and three in the delegated administrator account. In the root account, I use Quick Setup to define my delegated administrator account and initially configure permissions on the accounts. In the delegated administrator account, you define your source of user identities, you define what users have permissions to approve change templates, and you define a change request template.
First, I ensure I have an Organization in place and my AWS accounts are organized in Organizational Units (OU). For the purpose of this simple example, I have three accounts: the root account, the delegated administrator account in the management OU and a member account in the managed OU. When ready, I use Quick Setup on the root account to configure my accounts. There are multiple paths leading to Quick Setup; for this demo, I use the blue banner on top of the Quick Setup console, and I click Setup Change Manager.
On the Quick Setup page, I enter the ID of the delegated administrator account if I haven’t defined it already. Then I choose the permissions boundaries I grant to the delegated administrator account to perform changes on my behalf. This is the maximum permissions Change Manager receives to make changes. I will further restrict this permission set when I create change requests in a few minutes. In this example, I grant Change Manager permissions to call any ec2 API. This effectively authorizes Change Manager to only run changes related to EC2 instances.
Lower on the screen, I choose the set of accounts that are targets for my changes. I choose between Entire organization or Custom to select one or multiple OUs.
After a while, Quick Setup finishes configuring my AWS accounts permission and I can move to the second part of the one-time setup.
Second, I switch to my delegated administrator account. Change Manager asks me how I manage users in my organization: with AWS Identity and Access Management (IAM) or AWS Single Sign-On? This defines where Change Manager pulls user identities when I choose approvers. This is a one-time configuration option. This can be changed at any time in the Change ManagerSettings page.
Third, on the same page, I define an Amazon Simple Notification Service (SNS) topic to receive notifications about template reviews. This channel is notified any time a template is created or modified, to let template approvers review and approve templates. I also define the IAM (or SSO) user with permission to approve change templates (more about these in one minute).
Finally, I define a change template. Every change request is created from a template. Templates define common parameters for all change requests based on them, such as the change request approvers, the actions to perform, or the SNS topic to send notifications of progress. You can enforce the review and approval of templates before they can be used. It makes sense to create multiple templates to handle different type of changes. For example, you can create one template for standard changes, and one for emergency changes that overrides the change calendar. Or you can create different templates for different types of automation run books (documents).
To help you to get started, we created a template for you: the “Hello World” template. You can use it as a starting point to create a change request and test out your approval flow.
At any time, I can create my own template. Let’s imagine my system administrator team is frequently restarting EC2 instances. I create a template allowing them to create change requests to restart one or multiple instances. Using the delegated administrator account, I navigate to the Change Manager management console and click Create template.
In a nutshell, a template defines the list of authorized actions, where to send notifications and who can approve the change request. Actions are an AWS Systems Managerrunbook. Emergency change templates allow change requests to bypass the change calendar I wrote about earlier. Under Runbook Options, I choose one or multiple runbooks allowed to run. For this example, I choose the AWS EC2RestartInstance runbook.
I use the console to create the template, but templates are defined internally as YAML. I can edit the YAML using the Editor tab, or when I am using the AWS Command Line Interface (CLI) or API. This means I can version control them just like the rest of my infrastructure (as code).
Just below, I document my template using text formatted as markdown format. I use this section to document the defining characteristics of the template and provide any necessary instructions, such as back-out procedures, to the requestor.
I scroll down that page and click Add Approver to define approvers. Approvers can be individual users or groups. The list of approvers are defined either at the template level or in the change request itself. I also choose to create an SNS topic to inform approvers when any requests are created that require their approval.
In the Monitoring section I select the alarm that, when active, stops any change based on this template, and initiate a rollback.
In the Notifications section, I select or create another SNS topic so I’m notified when status changes for this template occur.
Once I am done, I save the template and submit it for review.
Templates have to be reviewed and approved before they can be used. To approve the template, I connect the console as the template_approver user I defined earlier. As template_approver user, I see pending approvals on the Overview tab. Or, I navigate to the Templates tab, select the template I want to review. When I am done reviewing it, I click Approve.
Voila, now we’re ready to create change requests based on this template. Remember that all the preceding steps are one-time configurations and can be amended at any time. When existing templates are modified, the changes go through a review and approval process again.
Create a Change Request To create a change request on any account linked to the Organization, I open a AWS Systems Manager Change Managerconsole from the delegated administrator account and click Create request.
I choose the template I want to use and click Next.
I enter a name for this change request. The change is initiated immediately after all approvals are granted, or I specify an optional scheduled time. When the template allows me, I choose the approver for this change. In this example, the approver is defined by the template and cannot be changed. I click Next.
On the next screen, there are multiple important configuration options, relating to the actual execution of the change:
Target location – lets me define on which target AWS accounts and AWS Region I want to run this change.
Deployment target – lets me define which resources are the target of this change. One EC2 instance? Or multiple ones identified by their tags, their resources groups, a list of instance IDs, or all EC2 instances.
Runbook parameters – lets me define the parameters I want to pass to my runbook, if any.
Execution role – lets me define the set of permissions I grant the System Manager to deploy with this change. The permission set must have service changemanagement.ssm.amazonaws.com as principal for the trust policy. Selecting a role allows me to grant the Change Manager runtime a different permission set than the one I have.
Here is an example allowing Change Manager to stop an EC2 instance (you can scope it down to a specific AWS account, specific Region, or specific instances):
When I am ready, I click Next. On the last page, I review my data entry and click Submit for approval.
At this stage, the approver receives a notification, based on the SNS topic configured in the template. To continue this demo, I sign out of the console and sign in again as the cr_approver user, which I created, with permission to view and approve change requests.
As the cr_approver user, I navigate to the console, review the change request, and click Approve.
The change request status switches to scheduled, and eventually turns green to Success. At any time, I can click the change request to get the status, and to collect errors, if any.
I click on the change request to see the details. In particular, the Timeline tab shows the history of this CR.
Availability and Pricing AWS Systems Manager Change Manager is available today in all commercial AWS Regions, except mainland China. The pricing is based on two dimensions: the number of change requests you submit and the total number of API calls made. The number of change requests you submit will be the main cost factor. We will charge $0.29 per change request. Check the pricing page for more details.
You can evaluate Change Manager for free for 30 days, starting on your first change request.
No matter how much automation you have built, no matter how great you are at practicing Infrastructure as Code (IAC), and no matter how successfully you have transitioned from pets to cattle, you sometimes need to interact with your AWS resources at the command line. You might need to check or adjust a configuration file, make a quick fix to a production environment, or even experiment with some new AWS services or features.
Some of our customers feel most at home when working from within a web browser and have yet to set up or customize their own command-line interface (CLI). They tell is that they don’t want to deal with client applications, public keys, AWS credentials, tooling, and so forth. While none of these steps are difficult or overly time-consuming, they do add complexity and friction and we always like to help you to avoid both.
Introducing AWS CloudShell Today we are launching AWS CloudShell, with the goal of making the process of getting to an AWS-enabled shell prompt simple and secure, with as little friction as possible. Every shell environment that you run with CloudShell has the AWS Command Line Interface (CLI) (v2) installed and configured so you can run aws commands fresh out of the box. The environments also include the Python and Node runtimes, with many more to come in the future.
My shell sets itself up in a matter of seconds and I can issue my first aws command immediately:
The shell environment is based on Amazon Linux 2. I can store up to 1 GB of files per region in my home directory and they’ll be available each time I open a shell in the region. This includes shell configuration files such as .bashrc and shell history files.
I can access the shell via SSO or as any IAM principal that can login to the AWS Management Console, including federated roles. In order to access CloudShell, the AWSCloudShellFullAccess policy must be in effect. The shell runs as a normal (non-privileged) user, but I can sudo and install packages if necessary.
Here are a couple of features that you should know about:
Themes & Font Sizes – You can switch between light and dark color themes, and choose any one of five font sizes:
Tabs and Sessions – You can have multiple sessions open within the same region, and you can control the tabbing behavior, with options to split horizontally and vertically:
You can also download files from the shell environment to your desktop, and upload them from your desktop to the shell.
Things to Know Here are a couple of important things to keep in mind when you are evaluating CloudShell:
Timeouts & Persistence – Each CloudShell session will timeout after 20 minutes or so of inactivity, and can be reestablished by refreshing the window:
Regions – CloudShell is available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo) Regions, with the remaining regions on the near-term roadmap.
Persistent Storage – Files stored within $HOME persist between invocations of CloudShell with a limit of 1 GB per region; all other storage is ephemeral. This means that any software that is installed outside of $HOME will not persist, and that no matter what you change (or break), you can always begin anew with a fresh CloudShell environment.
Network Access – Sessions can make outbound connections to the Internet, but do not allow any type of inbound connections. Sessions cannot currently connect to resources inside of private VPC subnets, but that’s also on the near-term roadmap.
Tens of thousands of customers use Amazon EMR to run big data analytics applications on frameworks such as Apache Spark, Hive, HBase, Flink, Hudi, and Presto at scale. EMR automates the provisioning and scaling of these frameworks and optimizes performance with a wide range of EC2 instance types to meet price and performance requirements. Customer are now consolidating compute pools across organizations using Kubernetes. Some customers who manage Apache Spark on Amazon Elastic Kubernetes Service (EKS) themselves want to use EMR to eliminate the heavy lifting of installing and managing their frameworks and integrations with AWS services. In addition, they want to take advantage of the faster runtimes and development and debugging tools that EMR provides.
Today, we are announcing the general availability of Amazon EMR on Amazon EKS, a new deployment option in EMR that allows customers to automate the provisioning and management of open-source big data frameworks on EKS. With EMR on EKS, customers can now run Spark applications alongside other types of applications on the same EKS cluster to improve resource utilization and simplify infrastructure management.
Customers can deploy EMR applications on the same EKS cluster as other types of applications, which allows them to share resources and standardize on a single solution for operating and managing all their applications. Customers get all the same EMR capabilities on EKS that they use on EC2 today, such as access to the latest frameworks, performance optimized runtimes, EMR Notebooks for application development, and Spark user interface for debugging.
Amazon EMR automatically packages the application into a container with the big data framework and provides pre-built connectors for integrating with other AWS services. EMR then deploys the application on the EKS cluster and manages logging and monitoring. With EMR on EKS, you can get 3x faster performance using the performance-optimized Spark runtime included with EMR compared to standard Apache Spark on EKS.
When Amazon EKS clusters are registered, EMR workloads are deployed to Kubernates nodes and pods to manage application execution and auto-scaling, and sets up managed endpoints so that you can connect notebooks and SQL clients. EMR builds and deploys a performance-optimized runtime for the open source frameworks used in analytics applications.
To monitor and debug jobs, you can use inspect logs uploaded to your Amazon CloudWatch and Amazon Simple Storage Service (S3) location configured as part of monitoringConfiguration. You can also use the one-click experience from the console to launch the Spark History Server.
Integration with Amazon EMR Studio
Now you can submit analytics applications using AWS SDKs and AWS CLI, Amazon EMR Studio notebooks, and workflow orchestration services like Apache Airflow. We have developed a new Airflow Operator for Amazon EMR on EKS. You can use this connector with self-managed Airflow or by adding it to the Plugin Location with Amazon Managed Workflows for Apache Airflow.
You can also use newly previewed Amazon EMR Studio to perform data analysis and data engineering tasks in a web-based integrated development environment (IDE). Amazon EMR Studio lets you submit notebook code to EMR clusters deployed on EKS using the Studio interface. After seting up one or more managed endpoints to which Studio users can attach a Workspace, EMR Studio can communicate with your virtual cluster.
For EMR Studio preview, there is no additional cost when you create managed endpoints for virtual clusters. To learn more, visit the guide document.
Now Available Amazon EMR on Amazon EKS is available in US East (N. Virginia), US West (Oregon), and Europe (Ireland) Regions. You can run EMR workloads in AWS Fargate for EKS removing the need to provision and manage infrastructure for pods as a serverless option.
To learn more, visit the documentation. Please send feedback to the AWS forum for Amazon EMR or through your usual AWS support contacts.
November 2020 – Support for resource tagging, AWS PrivateLink, and manual qubit allocation. The first two features make it easy for you to connect your existing AWS applications to the new ones that you build with Amazon Braket, and should help you to envision what a production-class cloud-based quantum computing application will look like in the future. The last feature is particularly interesting to researchers; from what I understand, certain qubits within a given piece of quantum computing hardware can have individual physical and connectivity properties that might make them perform somewhat better when used as part of a quantum circuit. You can read about Allocating Qubits on QPU Devices to learn more (this is somewhat similar to the way that a compiler allocates CPU registers to frequently used variables).
As I write this, we are in the Noisy Intermediate Scale Quantum (NISQ) era. This description captures the state of the art in quantum computers: each gate in a quantum computing circuit introduces a certain amount of accuracy-destroying noise, and the cumulative effect of this noise imposes some practical limits on the scale of the problems.
Update Time We are working to address this challenge, as are many others in the quantum computing field. Today I would like to give you an update on what we are doing at the practical and the theoretical level.
Similar to the way that CPUs and GPUs work hand-in-hand to address large scale classical computing problems, the emerging field of hybrid quantum algorithms joins CPUs and QPUs to speed up specific calculations within a classical algorithm. This allows for shorter quantum executions that are less susceptible to the cumulative effects of noise and that run well on today’s devices.
Variational quantum algorithms are an important type of hybrid quantum algorithm. The classical code (in the CPU) iteratively adjusts the parameters of a parameterized quantum circuit, in a manner reminiscent of the way that a neural network is built by repeatedly processing batches of training data and adjusting the parameters based on the results of an objective function. The output of the objective function provides the classical code with guidance that helps to steer the process of tuning the parameters in the desired direction. Mathematically (I’m way past the edge of my comfort zone here), this is called differentiable quantum computing.
So, with this rather lengthy introduction, what are we doing?
First, we are making the PennyLane library available so that you can build hybrid quantum-classical algorithms and run them on Amazon Braket. This library lets you “follow the gradient” and write code to address problems in computational chemistry (by way of the included Q-Chem library), machine learning, and optimization. My AWS colleagues have been working with the PennyLane team to create an integrated experience when PennyLane is used together with Amazon Braket.
PennyLane is pre-installed in Braket notebooks and you can also install the Braket-PennyLane plugin in your IDE. Once you do this, you can train quantum circuits as you would train neural networks, while also making use of familiar machine learning libraries such as PyTorch and TensorFlow. When you use PennyLane on the managed simulators that are included in Amazon Braket, you can train your circuits up to 10 times faster by using parallel circuit execution.
Second, the AWS Center for Quantum Computing is working to address the noise issue in two different ways: we are investigating ways to make the gates themselves more accurate, while also working on the development of more efficient ways to encode information redundantly across multiple qubits. Our new paper, Building a Fault-Tolerant Quantum Computer Using Concatenated Cat Codes speaks to both of these efforts. While not light reading, the 100+ page paper proposes the construction of a 2-D grid of micron-scale electro-acoustic qubits that are coupled via superconducting circuits:
Interestingly, this proposed qubit design was used to model a Toffoli gate, and then tested via simulations that ran for 170 hours on c5.18xlarge instances. In a very real sense, the classical computers are being used to design and then simulate their future quantum companions.
The proposed hybrid electro-acoustic qubits are far smaller than what is available today, and also offer a > 10x reduction in overhead (measured in the number of physical qubits required per error-corrected qubit and the associated control lines). In addition to working on the experimental development of this architecture based around hybrid electro-acoustic qubits, the AWS CQC team will also continue to explore other promising alternatives for fault-tolerant quantum computing to bring new, more powerful computing resources to the world.
And Third, we are expanding the choice of managed simulators that are available on Amazon Braket. In addition to the state vector simulator (which can simulate up to 34 qubits), you can use the new tensor network simulator that can simulate up to 50 qubits for certain circuits. This simulator builds a graph representation of the quantum circuit and uses the graph to find an optimized way to process it.
Time to Learn It is still Day One (as we often say at Amazon) when it comes to quantum computing and now is the time to learn more and to get some experience with. Be sure to check out the Braket Tutorials repository and let me know what you think.
Amazon CodeGuru is a developer tool that helps you improve your code quality and has two main components:
CodeGuru Reviewer uses program analysis and machine learning to detect potential defects that are difficult to find in your code and offers suggestions for improvement.
CodeGuru Profiler collects runtime performance data from your live applications, and provides visualizations and recommendations to help you fine-tune your application performance.
Today, I am happy to announce three new features:
Python Support for CodeGuru Reviewer and Profiler (Preview) – You can now use CodeGuru to improve applications written in Python. Before this release, CodeGuru Reviewer could analyze Java code, and CodeGuru Profiler supported applications running on a Java virtual machine (JVM).
Security Detectors for CodeGuru Reviewer – A new set of detectors for CodeGuru Reviewer to identify security vulnerabilities and check for security best practices in your Java code.
Memory Profiling for CodeGuru Profiler – A new visualization of memory retention per object type over time. This makes it easier to find memory leaks and optimize how your application is using memory.
Let’s see these functionalities in more detail.
Python Support for CodeGuru Reviewer and Profiler (Preview) Python Support for CodeGuru Reviewer is available in Preview and offers recommendations on how to improve the Python code of your applications in multiple categories such as concurrency, data structures and control flow, scientific/math operations, error handling, using the standard library, and of course AWS best practices.
You can now also use CodeGuru Profiler to collect runtime performance data from your Python applications and get visualizations to help you identify how code is running on the CPU and where time is consumed. In this way, you can detect the most expensive lines of code of your application. Focusing your tuning activities on those parts helps you reduce infrastructure cost and improve application performance.
After that, I can get a code review from CodeGuru in two ways:
Automatically, when I create a pull request. This is a great way to use it as you and your team are working on a code base.
Manually, creating a repository analysis to get a code review for all the code in one branch. This is useful to start using GodeGuru with an existing code base.
Since I just associated the whole repository, I go for a full analysis and write down the branch name to review (apologies, I was still using master at the time, now I use main for new projects).
After a few minutes, the code review is completed, and there are 14 recommendations. Not bad, but I can definitely improve the code. Here’s a few of the recommendations I get. I was using exceptions and global variables too much at the time.
Security Detectors for CodeGuru Reviewer The new CodeGuru Reviewer Security Detector uses automated reasoning to analyze all code paths and find potential security issues deep in your Java code, even ones that span multiple methods and files and that may involve multiple sequences of operations. To build this detector, we used learning and best practices from Amazon’s 20+ years of experience.
If the security detector discovers an issue, it offers a suggested remediation along with an explanation. In this way, it’s much easier to follow security best practices for AWS APIs, such as those for AWS Key Management Service (KMS) and Amazon Elastic Compute Cloud (EC2), and for common Java cryptography and TLS/SSL libraries.
With help from the security detector, security engineers can focus on architectural and application-specific security best-practices, and code reviewers can focus their attention on other improvements.
Memory Profiling for CodeGuru Profiler For applications running on a JVM, CodeGuru Profiler can now show the Heap Summary, a consolidated view of memory retention during a time frame, tracking both overall sizes and number of objects per object type (such as String, int, char, and custom types). These metrics are presented in a timeline graph, so that you can easily spot trends and peaks of memory utilization per object type.
Here are a couple of scenarios where this can help:
Memory Leaks – A constantly growing memory utilization curve for one or more object types may indicate a leak (intended here as unnecessary retention of memory objects by the application), possibly leading to out-of-memory errors and application crashes.
Memory Optimizations – Having a breakdown of memory utilization per object type is a step beyond traditional memory utilization monitoring, based solely on JVM-level metrics like total heap usage. By knowing that an unexpectedly high amount of memory has been associated with a specific object type, you can focus your analysis and optimization efforts on the parts of your application that are responsible for allocating and referencing objects of that type.
For example, here is a graph showing how memory is used by a Java application over an interval of time. Apart from the total capacity available and the used space, I can see how memory is being used by some specific object types, such as byte, java.lang.UUID, and the entries of a java.util.LinkedHashMap. The continuous growth over time of the memory retained by these object types is suspicious. There is probably a memory leak I have to investigate.
In the table just below, I have a longer list of object types allocating memory on the heap. The first three are selected and for that reason are shown in the graph above. Here, I can inspect other object types and select them to see their memory usage over time. It looks like the three I already selected are the ones with more risk of being affected by a memory leak.
Gathering evidence in a timely manner to support an audit can be a significant challenge due to manual, error-prone, and sometimes, distributed processes. If your business is subject to compliance requirements, preparing for an audit can cause significant lost productivity and disruption as a result. You might also have trouble applying traditional audit practices, which were originally designed for legacy on-premises systems, to your cloud infrastructure.
To satisfy complex and evolving sets of regulation and compliance standards, including the General Data Protection Regulation (GDPR), Health Insurance Portability and Accountability Act (HIPAA), and Payment Card Industry Data Security Standard (PCI DSS), you’ll need to gather, verify, and synthesize evidence.
You’ll also need to constantly reevaluate how your AWS usage maps to those evolving compliance control requirements. To satisfy requirements you may need to show data encryption was active, and log files showing server configuration changes, diagrams showing application high availability, transcripts showing required training was completed, spreadsheets showing that software usage did not exceed licensed amounts, and more. This effort, sometimes involving dozens of staff and consultants, can last several weeks.
Available today, AWS Audit Manager is a fully managed service that provides prebuilt frameworks for common industry standards and regulations, and automates the continual collection of evidence to help you in preparing for an audit. Continuous and automated gathering of evidence related to your AWS resource usage helps simplify risk assessment and compliance with regulations and industry standards and helps you maintain a continuous, audit-ready posture to provide a faster, less disruptive preparation process.
Built-in and customizable frameworks map usage of your cloud resources to controls for different compliance standards, translating evidence into an audit-ready, immutable assessment report using auditor-friendly terminology. You can also search, filter, and upload additional evidence to include in the final assessment, such as details of on-premises infrastructure, or procedures such as business continuity plans, training transcripts, and policy documents.
Given that audit preparation typically involves multiple teams, a delegation workflow feature lets you assign controls to subject-matter experts for review. For example, you might delegate reviewing evidence of network security to a network security engineer.
The finalized assessment report includes summary statistics and a folder containing all the evidence files, organized in accordance with the exact structure of the associated compliance framework. With the evidence collected and organized into a single location, it’s ready for immediate review, making it easier for audit teams to verify the evidence, answer questions, and add remediation plans.
Getting started with Audit Manager Let’s get started by creating and configuring a new assessment. From Audit Manager‘s console home page, clicking Launch AWS Audit Manager takes me to my Assessments list (I can also reach here from the navigation toolbar to the left of the console home). There, I click Create assessment to start a wizard that walks me through the settings for the new assessment. First, I give my assessment a name, optional description, and then specify an Amazon Simple Storage Service (S3) bucket where the reports associated with the assessment will be stored.
Next, I choose the framework for my assessment. I can select from a variety of prebuilt frameworks, or a custom framework I have created myself. Custom frameworks can be created from scratch or based on an existing framework. Here, I’m going to use the prebuilt PCI DSS framework.
After clicking Next, I can select the AWS accounts to be included in my assessment (Audit Manager is also integrated with AWS Organizations). Since I have a single account, I select it and click Next, moving on to select the AWS services that I want to be included in evidence gathering. I’m going to include all the suggested services (the default) and click Next to continue.
Next I need to select the owners of the assessment, who have full permission to manage it (owners can be AWS Identity and Access Management (IAM) users or roles). You must select at least one owner, so I select my account and click Next to move to the final Review and create page. Finally, clicking Create assessment starts the gathering of evidence for my new assessment. This can take a while to complete, so I’m going to switch to another assessment to examine what kinds of evidence I can view and choose to include in my assessment report.
Back in the Assessments list view, clicking on the assessment name takes me to details of the assessment, a summary of the controls for which evidence is being collected, and a list of the control sets into which the controls are grouped. Total evidence tells me the number of events and supporting documents that are included in the assessment. The additional tabs can be used to give me insight into the evidence I select for the final report, which accounts and services are included in the assessment, who owns it, and more. I can also navigate to the S3 bucket in which the evidence is being collected.
Expanding a control set shows me the related controls, with links to dive deeper on a given control, together with the status (Under review, Reviewed, and Inactive), whom the control has been delegated to for review, the amount of evidence gathered for that control, and whether the control and evidence have been added to the final report. If I change a control to be Inactive, meaning automated evidence gathering will cease for that control, this is logged.
Let’s take a closer look at a control to show how the automated evidence gathering can help identify compliance issues before I start compiling the audit report. Expanding Default control set, I click control 8.1.2 For a sample of privileged user IDs… which takes me to a view giving more detailed information on the control and how it is tested. Scrolling down, there is a set of evidence folders listed and here I notice that there are some issues. Clicking the issue link in the Compliance check column summarizes where the data came from. Here, I can also select the evidence that I want included in my final report.
Going further, I can click on the evidence folder to note that there was a failure, and in turn clicking on the time of the failure takes me to a detailed summary of the issues for this control, and how to remediate.
With the evidence gathered, it’s a simple task to select sufficient controls and appropriate evidence to include in my assessment report that can then be passed to my auditors. For the purposes of this post I’ve gone ahead and selected evidence for a handful of controls into my report. Then, I selected the Assessment report selection tab, where I review my evidence selections, and clicked Generate assessment report. In the dialog that appeared I gave my report a name, and then clicked Generate assessment report. When the dialog closes I am taken to the Assessment reports view and, when my report is ready, I can select it and download a zip file containing the report and the selected evidence. Alternatively, I can open the S3 bucket associated with the assessment (from the assessment’s details page) and view the report details and evidence there, as shown in the screenshot below. The overall report is listed (as a PDF file) and if I drill into the evidence folders, I can also view PDF files related to the specific items of evidence I selected.
And to close, below is a screenshot of the beginning of the assessment report PDF file showing the number of selected controls and evidence, and services that I selected to be in scope when I created the assessment. Further pages go into more details.
Audit Manager is available today in 10 AWS Regions: US East (Northern Virginia, Ohio), US West (Northern California, Oregon), Asia Pacific (Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London).
Today, I’m extremely happy to announce the availability of Amazon SageMaker JumpStart, a capability of Amazon SageMaker that accelerates your machine learning workflows with one-click access to popular model collections (also known as “model zoos”), and to end-to-end solutions that solve common use cases.
In recent years, machine learning (ML) has proven to be a valuable technique in improving and automating business processes. Indeed, models trained on historical data can accurately predict outcomes across a wide range of industry segments: financial services, retail, manufacturing, telecom, life sciences, and so on. Yet, working with these models requires skills and experience that only a subset of scientists and developers have: preparing a dataset, selecting an algorithm, training a model, optimizing its accuracy, deploying it in production, and monitoring its performance over time.
In order to simplify the model building process, the ML community has created model zoos, that is to say, collections of models built with popular open source libraries, and often pretrained on reference datasets. For example, the TensorFlow Hub and the PyTorch Hub provide developers with a long list of models ready to be downloaded, and integrated in applications for computer vision, natural language processing, and more.
Still, downloading a model is just part of the answer. Developers then need to deploy it for evaluation and testing, using either a variety of tools, such as the TensorFlow Serving and TorchServe model servers, or their own bespoke code. Once the model is running, developers need to figure out the correct format that incoming data should have, a long-lasting pain point. I’m sure I’m not the only one regularly pulling my hair out here!
Of course, a full-ML application usually has a lot of moving parts. Data needs to be preprocessed, enriched with additional data fetched from a backend, and funneled into the model. Predictions are often postprocessed, and stored for further analysis and visualization. As useful as they are, model zoos only help with the modeling part. Developers still have lots of extra work to deliver a complete ML solution.
Because of all this, ML experts are flooded with a long backlog of projects waiting to start. Meanwhile, less experienced practitioners struggle to get started. These barriers are incredibly frustrating, and our customers asked us to remove them.
SageMaker JumpStart also provides notebooks, blogs, and video tutorials designed to help you learn and remove roadblocks. Content is easily accessible within Amazon SageMaker Studio, enabling you to get started with ML faster.
It only takes a single click to deploy solutions and models. All infrastructure is fully managed, so all you have to do is enjoy a nice cup of tea or coffee while deployment takes place. After a few minutes, you can start testing, thanks to notebooks and sample prediction code that are readily available in Amazon SageMaker Studio. Of course, you can easily modify them to use your own data.
SageMaker JumpStart makes it extremely easy for experienced practitioners and beginners alike to quickly deploy and evaluate models and solutions, saving days or even weeks of work. By drastically shortening the path from experimentation to production, SageMaker JumpStart accelerates ML-powered innovation, particularly for organizations and teams that are early on their ML journey, and haven’t yet accumulated a lot of skills and experience.
Deploying a Solution with Amazon SageMaker JumpStart Opening SageMaker Studio, I select the “JumpStart” icon on the left. This opens a new tab showing me all available content (solutions, models, and so on).
Let’s say that I’m interested in using computer vision to detect defects in manufactured products. Could ML be the answer?
Browsing the list of available solutions, I see one for product defect detection.
Opening it, I can learn more about the type of problems that it solves, the sample dataset used in the demo, the AWS services involved, and more.
A single click is all it takes to deploy this solution. Under the hood, AWS CloudFormation uses a built-in template to provision all appropriate AWS resources.
A few minutes later, the solution is deployed, and I can open its notebook.
The notebook opens immediately in SageMaker Studio. I run the demo, and understand how ML can help me detect product defects. This is also a nice starting point for my own project, making it easy to experiment with my own dataset (feel free to click on the image below to zoom in).
Once I’m done with this solution, I can delete all its resources in one click, letting AWS CloudFormation clean up without having to worry about leaving idle AWS resources behind.
Now, let’s look at models.
Deploying a Model with Amazon SageMaker JumpStart SageMaker JumpStart includes a large collection of models available in the TensorFlow Hub and the PyTorch Hub. These models are pre-trained on reference datasets, and you can use them directly to handle a wide range of computer vision and natural language processing tasks. You can also fine-tune them on your own datasets for greater accuracy, a technique called transfer learning.
Here, I pick a version of the BERT model trained on question answering. I can either deploy it as is, or fine-tune it. For the sake of brevity, I go with the former here, and I just click on the “Deploy” button.
A few minutes later, the model has been deployed to a real-time endpoint powered by fully managed infrastructure.
Time to test it! Clicking on “Open Notebook” launches a sample notebook that I run right away to test the model, without having to change a line of code (again, feel free to click on the image below to zoom in). Here, I’m asking two questions (“What is Southern California often abbreviated as?” and “Who directed Spectre?“), passing some context containing the answer. In both cases, the BERT model gives the correct answer, respectively “socal” and “Sam Mendes“.
When I’m done testing, I can delete the endpoint in one click, and stop paying for it.
Getting Started As you can see, it’s extremely easy to deploy models and solutions with SageMaker JumpStart in minutes, even if you have little or no ML skills.
You can start using this capability today in all regions where SageMaker Studio is available, at no additional cost.
Today, I’m extremely happy to announce Amazon SageMaker Pipelines, a new capability of Amazon SageMaker that makes it easy for data scientists and engineers to build, automate, and scale end to end machine learning pipelines.
Machine learning (ML) is intrinsically experimental and unpredictable in nature. You spend days or weeks exploring and processing data in many different ways, trying to crack the geode open to reveal its precious gemstones. Then, you experiment with different algorithms and parameters, training and optimizing lots of models in search of highest accuracy. This process typically involves lots of different steps with dependencies between them, and managing it manually can become quite complex. In particular, tracking model lineage can be difficult, hampering auditability and governance. Finally, you deploy your top models, and you evaluate them against your reference test sets. Finally? Not quite, as you’ll certainly iterate again and again, either to try out new ideas, or simply to periodically retrain your models on new data.
No matter how exciting ML is, it does unfortunately involve a lot of repetitive work. Even small projects will require hundreds of steps before they get the green light for production. Over time, not only does this work detract from the fun and excitement of your projects, it also creates ample room for oversight and human error.
To alleviate manual work and improve traceability, many ML teams have adopted the DevOps philosophy and implemented tools and processes for Continuous Integration and Continuous Delivery (CI/CD). Although this is certainly a step in the right direction, writing your own tools often leads to complex projects that require more software engineering and infrastructure work than you initially anticipated. Valuable time and resources are diverted from the actual ML project, and innovation slows down. Sadly, some teams decide to revert to manual work, for model management, approval, and deployment.
Introducing Amazon SageMaker Pipelines Simply put, Amazon SageMaker Pipelines brings in best-in-class DevOps practices to your ML projects. This new capability makes it easy for data scientists and ML developers to create automated and reliable end-to-end ML pipelines. As usual with SageMaker, all infrastructure is fully managed, and doesn’t require any work on your side.
Care.com is the world’s leading platform for finding and managing high-quality family care. Here’s what Clemens Tummeltshammer, Data Science Manager, Care.com, told us: “A strong care industry where supply matches demand is essential for economic growth from the individual family up to the nation’s GDP. We’re excited about Amazon SageMaker Feature Store and Amazon SageMaker Pipelines, as we believe they will help us scale better across our data science and development teams, by using a consistent set of curated data that we can use to build scalable end-to-end machine learning (ML) model pipelines from data preparation to deployment. With the newly announced capabilities of Amazon SageMaker, we can accelerate development and deployment of our ML models for different applications, helping our customers make better informed decisions through faster real-time recommendations.”
Once launched, model building pipelines are executed as CI/CD pipelines. Every step is recorded, and detailed logging information is available for traceability and debugging purposes. Of course, you can also visualize pipelines in Amazon SageMaker Studio, and track their different executions in real time.
Model Registry – The model registry lets you track and catalog your models. In SageMaker Studio, you can easily view model history, list and compare versions, and track metadata such as model evaluation metrics. You can also define which versions may or may not be deployed in production. In fact, you can even build pipelines that automatically trigger model deployment once approval has been given. You’ll find that the model registry is very useful in tracing model lineage, improving model governance, and strengthening your compliance posture.
MLOps Templates – SageMaker Pipelines includes a collection of built-in CI/CD templates for popular pipelines (build/train/deploy, deploy only, and so on). You can also add and publish your own templates, so that your teams can easily discover them and deploy them. Not only do templates save lots of time, they also make it easy for ML teams to collaborate from experimentation to deployment, using standard processes and without having to manage any infrastructure. Templates also let Ops teams customize steps as needed, and give them full visibility for troubleshooting.
Now, let’s do a quick demo!
Building an End-to-end Pipeline with Amazon SageMaker Pipelines Opening SageMaker Studio, I select the “Components” tab and the “Projects” view. This displays a list of built-in project templates. I pick one to build, train, and deploy a model.
Then, I simply give my project a name, and create it.
A few seconds later, the project is ready. I can see that it includes two Git repositories hosted in AWS CodeCommit, one for model training, and one for model deployment.
The first repository provides scaffolding code to create a multi-step model building pipeline: data processing, model training, model evaluation, and conditional model registration based on accuracy. As you’ll see in the pipeline.py file, this pipeline trains a linear regression model using the XGBoost algorithm on the well-known Abalone dataset. This repository also includes a build specification file, used by AWS CodePipeline and AWS CodeBuild to execute the pipeline automatically.
Likewise, the second repository contains code and configuration files for model deployment, as well as test scripts required to pass the quality gate. This operation is also based on AWS CodePipeline and AWS CodeBuild, which run a AWS CloudFormation template to create model endpoints for staging and production.
Clicking on the two blue links, I clone the repositories locally. This triggers the first execution of the pipeline.
A few minutes later, the pipeline has run successfully. Switching to the “Pipelines” view, I can visualize its steps.
Clicking on the training step, I can see the Root Mean Square Error (RMSE) metrics for my model.
As the RMSE is lower than the threshold defined in the conditional step, my model is added to the model registry, as visible below.
For simplicity, the registration step sets the model status to “Approved”, which automatically triggers its deployment to a real-time endpoint in the same account. Within seconds, I see that the model is being deployed.
Alternatively, you could register your model with a “Pending manual approval” status. This will block deployment until the model has been reviewed and approved manually. As the model registry supports cross-account deployment, you could also easily deploy in a different account, without having to copy anything across accounts.
A few minutes later, the endpoint is up, and I could use it to test my model.
Once I’ve made sure that this model works as expected, I could ping the MLOps team, and ask them to deploy the model in production.
Putting my MLOps hat on, I open the AWS CodePipeline console, and I see that my deployment is indeed waiting for approval.
I then approve the model for deployment, which triggers the final stage of the pipeline.
Reverting to my Data Scientist hat, I see in SageMaker Studio that my model is being deployed. Job done!
Getting Started As you can see, Amazon SageMaker Pipelines makes it really easy for Data Science and MLOps teams to collaborate using familiar tools. They can create and execute robust, automated ML pipelines that deliver high quality models in production quicker than before.
You can start using SageMaker Pipelines in all commercial regions where SageMaker is available. The MLOps capabilities are available in the regions where CodePipeline is also available.
Sample notebooks are available to get you started. Give them a try, and let us know what you think. We’re always looking forward to your feedback, either through your usual AWS support contacts, or on the AWS Forum for SageMaker.
Special thanks to my colleague Urvashi Chowdhary for her precious assistance during early testing.
The collective thoughts of the interwebz
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.