Tag Archives: Internet of Things

Detect Real-Time Anomalies and Failures in Industrial Processes Using Apache Flink

Post Syndicated from Hubert Asamer original https://aws.amazon.com/blogs/architecture/detect-real-time-anomalies-and-failures-in-industrial-processes-using-apache-flink/

For a long time, industrial control systems were the heart of the manufacturing process which allows collecting, processing, and acting on data from the shop floor. Process manufacturers used a distributed control system (DCS) to do the automated control and operation of an industrial process or plant.

With the convergence of operational technology and information technology (IT), customers such as Yara are integrating their DCS with additional intelligence from the IT side. This provides customers with a holistic view of the different data sources to make more complex decisions with advanced analytics.

In this blog post, we show how to start with advanced analytics on streaming data coming from the shop floor. The sensor data, such as pressure and temperature, is typically published by a DCS. It is then ingested with a local edge gateway and streamed to the cloud with streaming and industrial internet of things (IoT) technology. Analytics on the streaming data is typically done before all data points are stored in the data layer. Figure 1 shows how the data flow can be modeled and visualized with AWS services.

Figure 1: High-level ingestion and analytics architecture

Figure 1: High-level ingestion and analytics architecture

In this example, we are concentrating on the streaming analytics part in the Cloud. We will generate data from a simulated DCS to Amazon Kinesis Data Streams where you have a gateway such as AWS IoT Greengrass and maybe other IoT services in-between.

For the simulated process that the DCS is controlling, we use a well-documented industrial process for creating a chemical compound (acetic anhydride) called the Tennesee Eastman process (TEP). There are several simulations available as open source. We demonstrate how to use this data as a constant stream with more than 30 real-time measurement parameters, ingest to Kinesis Data Streams, and run in-stream analytics using Apache Flink. Within Apache Flink, data is grouped and mapped to the respective stages and parts of the industrial process, and constantly analyzed by calculating anomalies of all process stages. All raw data, plus the derived anomalies and failure patterns, are then ingested from Apache Flink to Amazon Timestream for further use in near real-time dashboards.

Overview of solution

Note: Refer to steps 1 to 6 in Figure 2.

As a starting point for a realistic and data intensive measurement source, we use an already existing (TEP) simulation framework written in C++ originally created from National Institute of Standards and Technology, and published as open source. The GitHub Blog repository contains a small patch which adds AWS connectivity with the software development kits (SDKs) and modifications to the command line arguments. The programs provided by this framework are (step 1) a simulation process starter with configurable starting conditions and timestep configurations and a real-time client (step 2) which connects to the simulation and sends the simulation output data to the AWS Cloud.

Tennesee Eastman process (TEP) background

A paper by Downs & Vogel, A plant-wide industrial process control problem, from 1991 states:

“This chemical standard process consists of a reactor/separator/recycle arrangement involving two simultaneous gas-liquid exothermic reactions.”

“The process produces two liquid products from four reactants. Also present are an inert and a byproduct making a total of eight components. Two additional byproduct reactions also occur. The process has 12 valves available for manipulation and 41 measurements available for monitoring or control.“

The simulation framework used can control all of the 12 valve settings and produces 41 measurement variables with varying sampling frequency.

Data ingestion

The 41 measurement variables, named xmeas_1 to xmeas_41, are emitted by the real-time client (step 2) as key-value JSON messages. The client code is configured to produce 100 messages per second. A built-in C++ Kinesis SDK allows the real-time client to directly stream JSON messages to a Kinesis data stream (step 3).

Figure 2: Detailed system architecture

Figure 2 – Detailed system architecture

Stream processing with Apache Flink

Messages sent to Amazon Kinesis Data Stream are processed in configurable batch sizes by an Apache Flink application, deployed in Amazon Kinesis Data Analytics. Apache Flink is an open-source stream processing framework, written and usable in Java or Scala. As described in Figure 3, it allows the definition of various data sources (for example, a Kinesis data stream) and data sinks for storing processing results. In-between data can be processed by a range of operators—typically mapping and reducing functions (step 4).

In our case, we use a mapping operator where each batch of incoming messages is processed. In Code snippet 1, we apply a custom mapping function to the raw data stream. For rapid and iterative development purposes it’s possible to have the complete stream processing pipeline running in a local Java or Scala IDE such as Maven, Eclipse, or IntelliJ.

Figure 3: Flink execution plan (green: streaming data sources; yellow: data sinks)

Figure 3: Flink execution plan (green: streaming data sources; yellow: data sinks)

public class StreamingJob extends AnomalyDetector {
---
  public static DataStream<String> createKinesisSource
    (StreamExecutionEnvironment env, 
     ParameterTool parameter)
    {
    // create Stream
    return kinesisStream;
  }
---
  public static void main(String[] args) {
    // set up the execution environment
    final StreamExecutionEnvironment env = 
      StreamExecutionEnvironment.getExecutionEnvironment();
---
    DataStream<List<TimestreamPoint>> mainStream =
      createKinesisSource(env, parameter)
      .map(new AnomalyJsonToTimestreamPayloadFn(parameter))
      .name("MaptoTimestreamPayload");
---
    env.execute("Amazon Timestream Flink Anomaly Detection Sink");
  }
}

Code snippet 1: Flink application main class

In-stream anomaly detection

Within the Flink mapping operator, a statistical outlier detection (anomaly detection) is implemented. Flink allows the inclusion of custom libraries within its operators. The library used here is published by AWS—a Random Cut Forest implementation available from GitHub. Random Cut Forest is a well understood statistical method which can operate on batches of measurements. It then calculates an anomaly score for each new measurement by comparing a new value with a cached pool (=forest) of older values.

The algorithm allows the creation of grouped anomaly scores, where a set of variables is combined to calculate a single anomaly score. In the simulated chemical process (TEP), we can group the measurement variables into three process stages:

  1. reactor feed analysis
  2. purge gas analysis
  3. product analysis.

Each group consists of 5–10 measurement variables. We’re getting anomaly scores for a, b, and c. In Code snippet 2 we can learn how an anomaly detector is created. The class AnomalyDetector is instantiated and extended then three times (for our three distinct process stages) within the mapping function as described in Code snippet 3.

Flink distributes this calculation across its worker nodes and handles data deduplication processes within its system.

---
public class AnomalyDetector {
    protected final ParameterTool parameter;
    protected final Function<RandomCutForest, LineTransformer> algorithmInitializer;
    protected LineTransformer algorithm;
    protected ShingleBuilder shingleBuilder;
    protected double[] pointBuffer;
    protected double[] shingleBuffer;
    public AnomalyDetector(
      ParameterTool parameter,
      Function<RandomCutForest,LineTransformer> algorithmInitializer)
    {
      this.parameter = parameter;
      this.algorithmInitializer = algorithmInitializer;
    }
    public List<String> run(Double[] values) {
            if (pointBuffer == null) {
                prepareAlgorithm(values.length);
            }
      return processLine(values);
    }
    protected void prepareAlgorithm(int dimensions) {
---
      RandomCutForest forest = RandomCutForest.builder()
        .numberOfTrees(Integer.parseInt(
          parameter.get("RcfNumberOfTrees", "50")))
        .sampleSize(Integer.parseInt(
          parameter.get("RcfSampleSize", "8192")))
        .dimensions(shingleBuilder.getShingledPointSize())
        .lambda(Double.parseDouble(
          parameter.get("RcfLambda", "0.00001220703125")))
        .randomSeed(Integer.parseInt(
          parameter.get("RcfRandomSeed", "42")))
      .build();
---
    algorithm = algorithmInitializer.apply(forest);
  }

Code snippet 2: AnomalyDetector base class, which gets extended by the streaming applications main class

public class AnomalyJsonToTimestreamPayloadFn extends 
    RichMapFunction<String, List<TimestreamPoint>> {
  protected final ParameterTool parameter;
  private final Logger logger = 

  public AnomalyJsonToTimestreamPayloadFn(ParameterTool parameter) {
    this.parameter = parameter;
  }

  // create new instance of StreamingJob for running our Forest
  StreamingJob overallAnomalyRunner1;
  StreamingJob overallAnomalyRunner2;
  StreamingJob overallAnomalyRunner3;
---

  // use `open`method as RCF initialization
  @Override
  public void open(Configuration parameters) throws Exception {
    overallAnomalyRunner1 = new StreamingJob(parameter);
    overallAnomalyRunner2 = new StreamingJob(parameter);
    overallAnomalyRunner3 = new StreamingJob(parameter);
  super.open(parameters);
}
---

Code snippet 3: Mapping Function uses the Flink RichMapFunction open routine to initialize three distinct Random Cut Forests

Data persistence – Flink data sinks

After all anomalies are calculated, we can decide where to send this data. Flink provides various ready-to-use data sinks. In these examples, we fan out all (raw and processed) data to Amazon Kinesis Data Firehose for storing in Amazon Simple Storage Service (Amazon S3) (long term) (step 5) and to Amazon Timestream (short term) (step 5). Kinesis Data Firehose is configured with a small AWS Lambda function to reformat data from JSON to CSV, and data is stored with automated partitioning to Amazon S3. A Timestream data sink does not come pre-bundled with Flink. A custom Timestream ingestion code is used in these examples. Flink provides extensible operator interfaces for the creation of custom map and sink functions.

Timeseries handling

Timestream, in combination with Grafana, is used for near real-time monitoring. Grafana comes bundled with a Timestream data source plugin and can constantly query and visualize Timestream data (step 6).

Walkthrough

Our architecture is available as a deployable AWS CloudFormation template. The simulation framework comes packed as a docker image, with an option to install it locally on a linux host.

Prerequisites

To implement this architecture, you will need:

  • An AWS account
  • Docker (CE) Engine v18++
  • Java JDK v11++
  • maven v3.6++

We recommend running a local and recent Linux environment. It is assumed that you are using AWS Cloud9, deployed with CloudFormation, within your AWS account.

Steps

Follow these steps to deploy the solution and play with the simulation framework. At the end, detected anomalies derived from Flink are stored next to all raw data in Timestream and presented in Grafana. We’re using AWS Cloud9 and its Linux terminal capabilities here to fire up a Grafana instance, then manually run the simulation to ingest data to Kinesis and optionally manually start the Flink app from the console using Maven.

Deploy stack

After you’re logged in to the AWS Management console you can deploy the CloudFormation stack. This stack creates a fully configured AWS Cloud9 environment with the related GitHub Repo already in place, a Kinesis data stream, Kinesis Data Firehose delivery stream, Kinesis Data Analytics with Flink app deployed, Timestream database, and an S3 bucket.

launch stack button

After successful deployment, record two important facts from the CloudFormation console: the chosen stack name and the attribute 03Cloud9EnvUrl displayed in the Output Section of the stack. The attribute’s URL will take you directly to our deployed AWS Cloud9 environment.

Run post install step within AWS Cloud9

The deployed stack created an AWS Cloud9 environment and an AWS Identity and Access Management (IAM) instance profile. We apply this instance profile to AWS Cloud9 to interact with Kinesis, Timestream, and Amazon S3 throughout the next steps. The used script also configures and installs other required tools.

1.       Open a terminal window.

$ cd flinkAnomalySteps/deployment
$ source c9-postInstall.sh
---SETTING UP IAM INSTANCE PROFILE
Please enter cloudformation stack name (default: flink-rcf-app):
# enter your stack name

Start a Grafana development server

In this section we are starting a Grafana server using docker. Cloud 9 allows us to expose web applications (for demo & development purposes) on container port 8080.

1.       Open a terminal window.

$ cd ../src/grafana-dashboard
$ docker volume create grafana-storage
# this creates a docker volume for persisting your Grafana settings
$./start-grafana.sh
# this starts a recent Grafana using docker with Timestream plugin and a pre-configured dashboard in place

2.       Open the preview panel by selecting Preview, and then select Preview Running Application.

Cloud9 screenshot

3.       Next, in the preview pane, select Pop out into new Window.

Cloud9 screenshot2

4.       A new browser tab with Grafana opens.

5.       Choose any username and password combination.

6.       In Grafana use the “Search dashboards” icon on the left and choose “TEP-SIM-DEV”. This pre-configured dashboard displays data from Amazon Timestream (see step “Open Grafana”).

TEP simulation procedure

Within your local or AWS Cloud9 Linux environment, fetch the simulation docker image from the public AWS container registry, or build the simulation binaries locally, for building manually check the GitHub repo.

Start simulation (in separate terminal)

# starts container and switch into the container-shell
$ docker run -it --rm \
  --network host \
  --name tesim-runner \
  tesim-runner:01 \
 /bin/bash
# then inside container
$ ./tesim --simtime 100  --external-ctrl
# simulation started…

Manipulate simulation (in separate terminal)

Follow the steps here for a basic process disturbance task. Review the aspects of influencing the simulation in the GitHub-Repo. The rtclient program has a range of commands to use for introducing disturbances.

# first switch into running simulation container
$ docker exec -it tesim-runner /bin/bash
# now we can access the shared storage of the simulation process…
$ ./rtclient –setidv 6
# this enables one of the built in process disturbances (1-20)
$ ./rtclient –setidv 7
$ ./rtclient –setidv 8
$ …

Stream Simulation data to Amazon Kinesis DataStream (in separate terminal)

The client has a built-in record frequency of 50 messages per second. One message contains more than 50 measurements, so we have approximately 2,500 measurements per second.

$ ./rtclient -k

AWS libcrypto resolve: found static libcrypto 1.1.1 HMAC symbols
AWS libcrypto resolve: found static libcrypto 1.1.1 EVP_MD symbols
{"xmeas_1": 3649.739476,"xmeas_2": 4451.32071,"xmeas_3": 9.223142558,"xmeas_4": 32.39290913,"xmeas_5": 47.55975621,"xmeas_6": 2798.975688,"xmeas_7": 64.99582601,"xmeas_8": 122.8987929,"xmeas_9": 0.1978264656,…}
# Messages in JSON sent to Kinesis DataStream visible via stdout

Compile and start Flink Application (optional step)

If you want deeper insights into the Flink Application, we can start this as well from the AWS Cloud9 instance. Note: this is only appropriate in development.

$ cd flinkAnomalySteps/src
$ cd flink-rcf-app
$ mvn clean compile
# the Flink app gets compiled
$ mvn exec:java -Dexec.mainClass= \
    "com.amazonaws.services.kinesisanalytics.StreamingJob"
# Flink App is started with default settings in place…
…

Open Grafana dashboard (from the step Start a Grafana development server)

Process anomalies are visible instantly after you start the simulation. Use Grafana to drill down into the data as needed.

/**example - simplest possible Timestream query used for Viz:**/

SELECT CREATE_TIME_SERIES(time, measure_value::double) as anomaly_stream6 FROM "kdaflink"."kinesisdata1"

    WHERE measure_name='anomaly_score_stream6' AND

    time between ago(15m) and now()

Code snippet 4: Timestream SQL example; Timestream database is `kdaflink` – table is `kinesisdata1`

Figure 4 - Grafana dashboard showing near real-time simulation data

Figure 4 – Grafana dashboard showing near real-time simulation data, three anomalies, mapped to the TEP process, are constantly calculated by Flink

S3 raw metrics bucket

For the sake of completeness and potential usefulness, the Flink Application emits all raw data in an intermediate step to Kinesis Data Firehose. The service converts all JSON data to CSV format by using a small AWS Lambda function.

$ aws s3 ls flink-rcf-app-rawmetricsbucket-<CFN-UUID>/tep-raw-csv/2021/11/26/19/

Cleaning up

Delete the deployed CloudFormation stack. All resources (excluding S3 buckets) are permanently deleted.

Conclusion

In this blog post, we learned that in-stream anomaly detection and constant measurement data insights can work together. The Apache Flink framework offers a ready-to-use platform that is mission critical for future adoption across manufacturing and other industries. Other applications of the presented Flink pattern can run on capable edge compute devices. Integration with AWS IoT Greengrass and AWS Greengrass Stream Manager are part of the GitHub Blog repository.

Another extension includes measurement data pattern detection routines, which can coexist with in-stream anomaly detection and can detect specific failure patterns over time using time-windowing features of the Flink framework. You can refer to the GitHub repo which accompanies this blog post. Give it a try and let us know your feedback in the comments!

New – FreeRTOS Extended Maintenance Plan for Up to 10 Years

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-freertos-extended-maintenance-plan-for-up-to-10-years/

Last AWS re:Invent 2020, we announced FreeRTOS Long Term Support (LTS) that offers a more stable foundation than standard releases, as manufacturers deploy and later update devices in the field. FreeRTOS is an open source, real-time operating system for microcontrollers that makes small, low-power edge devices easy to program, deploy, secure, connect, and manage.

In 2021, FreeRTOS LTS released 202012.01 to include AWS IoT Over-the-Air (OTA) update, AWS IoT Device Defender, and AWS IoT Jobs libraries that provides feature stability, security patches, and critical bug fixes for the next two years.

Today, I am happy to announce FreeRTOS Extended Maintenance Plan (EMP), which allows embedded developers to receive critical bug fixes and security patches on their chosen FreeRTOS LTS version for up to 10 years beyond the expiry of the initial LTS period. FreeRTOS EMP lets developers improve device security (or helps keep devices secure) for years, save on operating system upgrade costs, and reduce the risks associated with patching their devices.

FreeRTOS EMP applies to libraries covered by FreeRTOS LTS. Therefore, developers have device lifecycles longer than the LTS period of 2 years and can continue using a version that provides feature stability, security patches, and critical bug fixes, all without having to plan a costly version upgrade.

Here are main features of FreeRTOS EMP:

Features Description Why is it important?
Feature stability Get FreeRTOS libraries that maintain the same set of features for years Save upgrade costs by using a stable FreeRTOS codebase for their product lifecycle
API stability Get FreeRTOS libraries that have stable APIs for years
Critical fixes Receive security patches and critical bug* fixes on your chosen FreeRTOS libraries Security patches help keep their IoT devices secure for the product lifecycle
Notification of patches Receive timely notification upcoming patches Timely awareness of security patches helps proactively plan the deployment of patches
Flexible subscription plan Extend maintenance by a year or longer Continue to renew their annual subscription for a longer period to keep the same version for the entire device lifecycle, or for a shorter period to buy time before upgrading to the latest FreeRTOS version.

* A critical bug is a defect determined by AWS to impact the functionality of the affected library and has no reasonable workaround.

Getting Started with FreeRTOS EMP
To get started, subscribe to the plan using your AWS account, and renew the subscription annually or for a longer period to either cover their product lifecycle or until you are ready to transition to a new FreeRTOS LTS release.

Before the end of the current LTS period, you will be able to use your AWS account to complete the FreeRTOS EMP registration on the FreeRTOS console, review and agree to the associated terms and conditions, select the LTS version, and buy an annual subscription. You will then gain access to the private repository where you’ll receive .zip files containing a git repo with chosen libraries, patches, and related notifications.

Under NDA, AWS will notify you via official AWS Security channels of an upcoming patch and its timelines (if AWS is reasonably able to do so and deems it appropriate). Patches will be sent to your private repository within three business days of successfully implementing and getting AWS Security approval for our mitigation.

AWS will provide technical support for FreeRTOS EMP customers via separate subscriptions to AWS Support. AWS Support is not included in FreeRTOS EMP subscriptions. You can track issues such as AWS accounts, billing, and bugs, or get access to technical experts such as patch integration issues based on your AWS Support plan.

Available Now
FreeRTOS EMP will be available for the current and all previous FreeRTOS LTS releases. Subscriptions can be renewed annually for up to 10 years from the end of the chosen LTS version’s support period. For example, a subscription for FreeRTOS 202012.01 LTS, whose LTS period ends March 2023, may be renewed annually for up to 10 years (i.e., March 2033).

You can find more information on the FreeRTOS feature page. Please send us feedback on the forum of FreeRTOS or AWS Support.

Sign up to get periodic updates on when and how you can subscribe to FreeRTOS EMP.

Channy

New – Securely manage your AWS IoT Greengrass edge devices using AWS Systems Manager

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/new-securely-manage-your-aws-iot-greengrass-edge-devices-using-aws-systems-manager/

A header image with the text AWS IoT Greengrass announces AWS System Manager

In 2020, we launched AWS IoT Greengrass 2.0, an open-source edge runtime and cloud service for building, deploying, and managing device software and applications. Today, we’re very excited to announce the ability to securely manage your AWS IoT Greengrass edge devices using AWS Systems Manager (SSM).

Managing vast fleets of varying systems and applications remotely can be a challenge for administrators of edge devices. AWS IoT Greengrass was built to enable these administrators to manage their edge device application stack. While this addressed the needs of many typical edge device administrators, system software on these devices still needed to be updated and maintained through operational policies consistent with those of their broader IT organizations. To this end, administrators would typically have to build or integrate tools to create a centralized interface for managing their edge and IT device software stacks – from security updates, to remote access, and operating system patches.

Until today, IT administrators have had to build or integrate custom tools to make sure edge devices can be managed alongside EC2 and on-prem instances, through a consistent set of policies. At scale, managing device and systems software across a wide variety of edge and IT systems becomes a significant investment in time and money. This is time that could be better spent deploying, optimizing, and managing the very edge devices that they’re maintaining.

What’s New?
Today, we have integrated IoT Greengrass and Systems Manager to simplify the management and maintenance of system software for edge devices. When coupled with the AWS IoT Greengrass Client Software, edge device administrators now can remotely access and securely manage with the multitude of devices that they own – from OS patching, to application deployments. Additionally, regularly scheduled operations that maintain edge compute systems can be automated, all without the need for creating additional custom processes. For IT administrators, this release gives a complete overview of all of their devices through a centralized interface, and a consistent set of tools and policies with the AWS Systems Manager.

For customers new to the AWS IoT Greengrass platform, the integration with Systems Manager simplifies setup even further with a new on- boarding wizard that can reduce the time it takes to create operational management systems for edge devices from weeks to hours.

How is this achieved?
This new capability is enabled by the AWS Systems Manager (SSM) Agent. As of today, customers can deploy the AWS Systems Manager Agent, via the AWS IoT Greengrass console, to their existing edge devices. Once installed on each device, AWS Systems Manager will list all of the devices in the Systems Manager Console, thereby giving administrators and IoT stakeholders an overview of their entire fleet. When coupled with the AWS IoT Greengrass console, administrators can manage their newly configured devices remotely; patching or updating operating systems, troubleshooting remotely, and deploying new applications, all through a centralized, integrated user interface. Devices can be patched individually, or in groups organized by tags or resource groups.

Further information
These new features are now available in all regions where AWS Systems Manager and AWS IoT Greengrass are available. To get started, please visit the IoT Greengrass home page.

Top Announcements of AWS re:Invent 2021

Post Syndicated from AWS News Blog Team original https://aws.amazon.com/blogs/aws/top-announcements-of-aws-reinvent-2021/

Welcome to AWS re:Invent! From Nov. 29-Dec. 3, 2021, we’ll update this page daily with the most noteworthy launches from our biggest event of the year. AWS Chief Evangelist Jeff Barr and our team of AWS developer advocates from around the globe share the news and offer helpful tips for getting started with all the latest AWS releases.

More ways to learn:

(This post was last updated: 12:42 a.m., PST, Nov. 29, 2021.)


Quick category links:
Internet of Things |
Security

Internet of Things

Preview – AWS IoT RoboRunner for Building Robot Fleet Management Applications

AWS IoT RoboRunner is a new robotics service that makes it easier for enterprises to build and deploy applications that help fleets of robots work seamlessly together.

Security

Amazon CodeGuru Reviewer Introduces Secrets Detector to Identify Hardcoded Secrets and Secure Them with AWS Secrets Manager
The new Amazon CodeGuru Reviewer Secrets Detector is an automated tool that helps developers detect secrets in source code or configuration files, such as passwords, API keys, SSH keys, and access tokens.

Back to Top

Preview – AWS IoT RoboRunner for Building Robot Fleet Management Applications

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/preview-aws-iot-roborunner-for-building-robot-fleet-management-applications/

In 2018, we launched AWS RoboMaker, a cloud-based simulation service that enables robotics developers to run, scale, and automate simulation without managing any infrastructure. As we worked with robot developers and operators, we have repeatedly heard that they face challenges in operating different robot types in their automation efforts, including autonomous guided vehicles (AGV), autonomous mobile vehicles (AMR), and robotic manipulators.

Many customers choose different types of robots – often from different vendors in a single facility. Robot operator want to access the unified data required to build applications that work across a fleet of robots. However, when a new robot is added to an autonomous operation, complex and time-consuming software integration work is required to connect the robot control software to work management systems.

Today, we are launching a public preview of AWS IoT RoboRunner, a new robotics service that makes it easier for enterprises to build and deploy applications that help fleets of robots work seamlessly together. AWS IoT RoboRunner lets you connect your robots and work management systems, thereby enabling you to orchestrate work across your operation through a single system view.

This new service builds on the same technology used in Amazon fulfillment centers, and now we are excited to make it available to all developers to build advanced robotics applications for their businesses.

AWS IoT RoboRunner in Action
You can create a single facility (e.g., site name and location) in the AWS Management Console to get started with AWS IoT RoboRunner. Behind the scenes, AWS IoT RoboRunner automatically creates centralized repositories for storing facility, robot, destination, and task data. Then, the robots working on this site are setup as a “Fleet”, and each individual robot is setup in AWS IoT RoboRunner as a “Robot” within a fleet.

You can download the Fleet Gateway Library to develop integration codes for connecting your robots and WMS systems with AWS IoT RoboRunner to send and receive data from individual robot fleets. You can also develop the first robotics management application using the Task Manager Library and deploy Task Manager codes as an AWS Lambda function and Fleet Gateway codes on-premises as an AWS IoT Greengrass component.

To enable a single-system view of the robots, status of the systems, and progress of tasks on the same interface, AWS IoT RoboRunner provides APIs that let you build a user application. AWS IoT RoboRunner provides sample applications for allocating tasks to robot fleets so that you can get started quickly. You can customize the task allocation code with business requirements that align to your use case.

Learn more by reading Getting started with AWS IoT RoboRunner in the AWS IoT RoboRunner Developer Guide. Watch a quick introductory video about AWS IoT RoboRunner for more information.

Try Public Preview Now
AWS IoT RoboRunner is now available in public preview, and you can start using them today in the US East (N. Virginia) and Europe (Frankfrut) Regions. There will be no additional cost to use this feature during the preview period.

You can send feedback to [email protected], the AWS forum for AWS IoT, or through your usual AWS Support contacts.

Channy

New Amazon Virtual Andon 3.0 – Automate Issue Resolution via APIs and Predictive Services

Post Syndicated from Ajay Swamy original https://aws.amazon.com/blogs/architecture/new-amazon-virtual-andon-3-0-automate-issue-resolution-via-apis-and-predictive-services/

Developing a modern manufacturing enterprise requires careful thought and attention to several priorities. Predictive maintenance and issue resolution automation are likely high on your list. Maximizing your operational efficiency and optimizing output are critical in this competitive global market. As demand grows, manufacturers are under pressure to fulfill increased production needs.

A recent report from McKinsey on Industry 4.0 technologies discusses pandemic implementations such as digital issue detection and resolution. These solutions are critical for crisis response in the COVID-19 era. 30% of manufacturers have highlighted increased operational productivity, reduced time-to-market, and reduction in cost as major strategic imperatives for their Industry 4.0 transformation.

Amazon Virtual Andon 3.0 (AVA) is an Amazon Web Services (AWS) Solution that provides a scalable, digital Andon system to help detect and resolve issues. It optimizes processes, supports your transition to predictive maintenance, and helps prevent future equipment failures.

Overview of new features

AVA provides factory and fulfillment center associates with an intuitive, responsive web interface and workflow. This can be used to raise issues, route those issues to the appropriate engineers, and resolve them in a timely way. With AVA, you can associate explanatory root causes with resolved issues for more insightful reporting. Issue raising and resolution happen digitally. There’s no need for manual intervention (such as raising a manual alarm at a factory workstation.)

AVA introduces the capability to raise issues directly from devices and automated APIs. Flexible integrations can be made directly with your factory devices and systems. Additionally, the APIs integrate with Amazon Machine Learning (ML) services, such as Amazon Lookout for Equipment and Amazon Lookout for Vision. This enables you to automate ML inference into your AVA-based workflows.

With AVA 3.0, you can now monitor your factory floors for disruptions in near-real-time. You can respond and resolve issues quickly, minimizing production disruption. AVA 3.0 provides users with an analytics pipeline so you can create dashboards and custom reports.

Introducing GraphQL APIs

The solution introduces GraphQL APIs via AWS AppSync, secured through AWS Identity and Access Management (IAM) policies. This powers the web interface and functionality to create, route, and manage issues. With AVA GraphQL APIs, you can create and manage site hierarchies, devices, events, and raise and manage issues. For example, you can create an issue by calling the createIssue API to raise issues and track them to completion automatically.

createIssue
{
    id: "<string>",
    siteName: "<string>",
    areaName: "<string>",
    stationName: "<string>",
    deviceName: "<string>",
    processName: "<string>",
    eventId: "<string>",
    eventDescription: "<string>",
    eventType: "<string>",
    issueSource: "<string>",
    priority: "<string>",
    status: "<string>",
    created: "AWSDateTime",
    acknowledged: "AWSDateTime",
    closed: "AWSDateTime",
    acknowledgedTime: "<number>",
    resolutionTime: "<number>",
    createdBy: "<string>",
    additionalDetails: "<string>"
  }

Integration with Amazon Machine Learning services to raise issues automatically

With AVA APIs, you can integrate with predictive services like Amazon Lookout for Equipment (L4E). AVA provides an AWS Lambda function for detecting anomalies generated via L4E. It automatically calls the APIs to create issues for abnormal events.

When L4E detects anomalies in your machinery or production line, AVA can automatically raise issues for those anomalies (Figure 1). This gives you the visibility and tracking mechanism to ensure anomalies are resolved. You can create automated events and issues via APIs by integrating with Amazon Lookout for Equipment. You’re able to track anomalies with their details in near-real-time.

Figure 1. AVA displays anomalies raised from L4E. A prediction score of > 0 indicates that an anomaly has been detected, and AVA will raise an issue with the underlying sensor details.” width=”1063″ height=”611″></p>
<p id=Figure 1. AVA displays anomalies raised from L4E. A prediction score of > 0 indicates that an anomaly has been detected, and AVA will raise an issue with the underlying sensor details.

Direct device integration

AVA provides the capability to integrate with IoT devices via AWS IoT Core. You can configure your IoT devices to send data to the ava/issues AWS IoT Core topic. Additionally, AVA can send messages to the ava/devices AWS IoT Core topic to automatically raise issues via MQTT or HTTPS. It maps your machine name to an AVA device and a tag/value combination to an AVA event.

{
  "id": <ID!>,
  "eventId": String,
  "eventDescription": String,
  "type": String,
  "priority": String,
  "siteName": String,
  "processName": String,
  "areaName":" String,
  "stationName": String,
  "deviceName": String,
  "created": AWSDateTime,
  "acknowledged": AWSDateTime,
  "closed": AWSDateTime,
  "status": "open"
}

Analytics pipeline for custom dashboards

AVA uses Amazon DynamoDB to store factory configuration and issues data. All AVA data is exported from the DynamoDB database to an Amazon S3 bucket via an AWS Glue workflow. You can then use Amazon Athena to query underlying data and create custom reports using business intelligence (BI) solutions like Amazon QuickSight (Figure 2). With the analytics pipeline, you can create custom dashboards and monitor your factory operations holistically.

Figure 2. Custom dashboard generated via Amazon QuickSight

Figure 2. Custom dashboard generated via Amazon QuickSight

Architecture and workflow

Figure 3. AVA 3.0 architecture

Figure 3. AVA 3.0 architecture

The AWS CloudFormation template deploys the following infrastructure, shown in Figure 3:

  1. The AWS CloudFormation template provides an Amazon CloudFront web interface that deploys into an Amazon Simple Storage Service (Amazon S3) bucket configured for web hosting.
  2. An Amazon Cognito user pool allows this solution’s administrators to register users and groups using the web interface.
  3. AWS AppSync GraphQL APIs and AWS Amplify power the web interface. Amazon DynamoDB tables store the factory data.
  4. An AWS IoT rule engine helps you monitor manufacturing workstations or devices for events. It then routes the event to the correct engineer for resolution in real time.
  5. Authorized users can interact with and receive notifications from this solution. An AWS Lambda function and Amazon Simple Notification Service (Amazon SNS) send emails and SMS notifications.
  6. Issues created, acknowledged, and closed in the web interface are recorded and updated using AWS AppSync and DynamoDB.
  7. The AWS AppSync GraphQL APIs can be called directly with HTTP POST requests.
  8. If you are using L4E to monitor your machines, enter the name of the Amazon S3 bucket where inference files will be delivered in the Anomaly Detection Output Bucket CloudFormation parameter. This solution can be configured to automatically raise issues if an anomaly is detected.
  9. When the Activate Glue Workflow CloudFormation parameter is set to “Yes”, an AWS Glue workflow will be created to extract data from DynamoDB. It delivers this data via an AWS Glue Data Catalog into Amazon S3. For more information, refer to the Data Analysis section.
  10. You can use an existing SAML provider as an additional identity provider for access to this solution. You can configure the Amazon Cognito Domain Prefix, SAML Provider Name, and SAML Provider Metadata URL CloudFormation parameters. For more information, refer to SAML identity provider.

An intuitive UI with nested events

With AVA, you also can create nested events / issues, so you can more easily get to the root of the problem and resolve issues quickly. The solution builds upon the same intuitive UI as highlighted in this previous Amazon Virtual Andon blog post. In addition, it allows you to manage events granularly, create subevents, raise, and resolve issues and sub issues (Figure 4).

Figure 4. AVA allows you to raise issues and sub issues (for example, ‘Out of boxes’ allows you to specify the kind of boxes you need to resolve the issue)

Figure 4. AVA allows you to raise issues and sub issues (for example, ‘Out of boxes’ allows you to specify the kind of boxes you need to resolve the issue)

Conclusion

Amazon Virtual Andon (AVA) is a self-deployable solution that provides you the digital capability to create, route, manage, and resolve issues. With AVA, you can monitor your overall enterprise for issues and engage with the right engineers to resolve them promptly. It offers a clear, intuitive user interface and straightforward workflow to help team members resolve issues.

Get started with Amazon Virtual Andon today.

Architecture Monthly Magazine: IoT for the Edge

Post Syndicated from Jane Scolieri original https://aws.amazon.com/blogs/architecture/architecture-monthly-magazine-iot-for-the-edge/

Internet of Things (IoT) for the Edge encompasses so many devices and industries, that we couldn’t
pick just one photo for our cover. We include IoT use cases from manufacturing, fitness, ocean research, and agriculture. And these represent only a fraction of what is possible. By moving certain workloads to the edge, your devices communicate with local compute resources and can respond more quickly to changes.

AWS edge services deliver data processing, analysis, and storage close to your endpoints, allowing you to deploy APIs and tools to locations outside AWS data centers. You can harness the data generated by your IoT edge devices and enable them to act intelligently with AWS IoT services.

We’d like to thank our experts, Olawale Oladehin, Head of Worldwide Solutions Architect – IoT, Maggie Tallman, Worldwide Go-To-Market Manager – IoT & Robotics, and Richard Elberger, IoT Principal Technologist, AWS. We are also pleased to have a contribution by one of our Customers, Jaime González, Chief Technology Officer, Pentasoft. Special thanks go to Ryan Burke, Sr. Application Architect, and Channa Samynathan, Specialist Solutions Architect – IoT, for their invaluable help shepherding this issue.

Please give us your feedback! Take our survey.

You can also include your comments on the Amazon Kindle page. View past issues and reach out to [email protected] anytime with your questions and comments.

In this month’s IoT for the Edge issue:

  • Ask an Expert: Maggie Tallman, Worldwide Go-To-Market Manager – IoT & Robotics, and Olawale Oladehin, Head of Worldwide Solutions Architect – IoT
  • Customer Conversations: Jaime González, Chief Technology Officer, Pentasoft
  • Ask an Expert, Hardware Security: Richard Elberger, IoT Principal Technologist, AWS
  • Whitepaper: Security at the Edge: Core Principles
  • Case Study: Seafloor Systems Saves 4 Hours of Labor per Robot Build Using AWS IoT Greengrass
  • Implementation Guide: Monitoring River Levels Using LoRaWAN
  • Blog: Run ML inference on AWS Snowball Edge with Amazon SageMaker Edge Manager and AWS IoT Greengrass
  • Reference Architecture: Using Computer Vision for Product Quality Analysis in Plants
  • Case Study: Coca-Cola İçecek Improves Operational Performance Using AWS IoT SiteWise
  • Quick Start: The Industrial Machine Connectivity (IMC) Quick Start
  • Blog: Automated Device Provisioning to AWS IoT Core Using 1NCE Global SIM
  • Solution: Machine to Cloud Connectivity Framework
  • Reference Architecture: Predictive Equipment Health for Utilities
  • Blog: Autonomous vehicle data collection with AWS Snowcone and AWS IoT Greengrass
  • Solution: AWS Connected Vehicle Solution
  • Videos:
    • 30MHz: Building A Smart Agriculture Solution For Indoor Farms And Greenhouses On
      AWS
    • Evolving at the Edge with the AWS Snow Family
    • Data Residency at the Edge: AWS Outposts Inside Out
    • Orangetheory Fitness: Taking a Data-Driven Approach to Improving Health and Wellness (Special)
    • Data Migration and Edge Computing with the AWS Snow Family
    • All in with James Gosling: Behind the Scenes with AWS IoT Greengrass V2

Download the Magazine

How to access the magazine

View and download past issues as PDFs on the AWS Architecture Monthly webpage.
Readers in the US, UK, Germany, and France can subscribe to the Kindle version of the magazine at Kindle Newsstand.
Visit Flipboard, a personalized mobile magazine app that you can also read on your computer.
We hope you’re enjoying Architecture Monthly, and we’d like to hear from you—leave us a star rating and comment on the Amazon Kindle Newsstand page or contact us anytime at [email protected].

IoT gets a machine learning boost, from edge to cloud

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/iot-gets-a-machine-learning-boost-from-edge-to-cloud/

Today, it’s easy to run Edge Impulse machine learning on any operating system, like Raspberry Pi OS, and on every cloud, like Microsoft’s Azure IoT. Evan Rust, Technology Ambassador for Edge Impulse, walks us through it.

Building enterprise-grade IoT solutions takes a lot of practical effort and a healthy dose of imagination. As a foundation, you start with a highly secure and reliable communication between your IoT application and the devices it manages. We picked our favorite integration, the Microsoft Azure IoT Hub, which provides us with a cloud-hosted solution backend to connect virtually any device. For our hardware, we selected the ubiquitous Raspberry Pi 4, and of course Edge Impulse, which will connect to both platforms and extend our showcased solution from cloud to edge, including device authentication, out-of-box device management, and model provisioning.

From edge to cloud – getting started 

Edge machine learning devices fall into two categories: some are able to run very simple models locally, and others have more advanced capabilities that allow them to be more powerful and have cloud connectivity. The second group is often expensive to develop and maintain, as training and deploying models can be an arduous process. That’s where Edge Impulse comes in to help to simplify the pipeline, as data can be gathered remotely, used effortlessly to train models, downloaded to the devices directly from the Azure IoT Hub, and then run – fast.

This reference project will serve you as a guide for quickly getting started with Edge Impulse on Raspberry Pi 4 and Azure IoT, to train a model that detects lug nuts on a wheel and sends alerts to the cloud.

Setting up the hardware

Hardware setup for Edge Impulse Machine Learning
Raspberry Pi 4 forms the base for the Edge Impulse machine learning setup

To begin, you’ll need a Raspberry Pi 4 with an up-to-date Raspberry Pi OS image which can be found here. After flashing this image to an SD card and adding a file named wpa_supplicant.conf

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=<Insert 2 letter ISO 3166-1 country code here>

network={
	ssid="<Name of your wireless LAN>"
	psk="<Password for your wireless LAN>"
}

along with an empty file named ssh (both within the /boot directory), you can go ahead and power up the board. Once you’ve successfully SSH’d into the device with 

$ ssh pi@<IP_ADDRESS>

and the password raspberry, it’s time to install the dependencies for the Edge Impulse Linux SDK. Simply run the next three commands to set up the NodeJS environment and everything else that’s required for the edge-impulse-linux wizard:

$ curl -sL https://deb.nodesource.com/setup_12.x | sudo bash -
$ sudo apt install -y gcc g++ make build-essential nodejs sox gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-base gstreamer1.0-plugins-base-apps
$ npm config set user root && sudo npm install edge-impulse-linux -g --unsafe-perm

Since this project deals with images, we’ll need some way to capture them. The wizard supports both the Pi Camera modules and standard USB webcams, so make sure to enable the camera module first with 

$ sudo raspi-config

if you plan on using one. With that completed, go to the Edge Impulse Studio and create a new project, then run the wizard with 

$ edge-impulse-linux

and make sure your device appears within the Edge Impulse Studio’s device section after logging in and selecting your project.

Edge Impulse Machine Learning screengrab

Capturing your data

Training accurate machine learning models requires feeding plenty of varied data, which means a lot of images are required. For this use case, I captured around 50 images of a wheel that had lug nuts on it. After I was done, I headed to the Labeling queue in the Data Acquisition page and added bounding boxes around each lug nut within every image, along with every wheel.

Edge Impulse Machine Learning screengrab

To add some test data, I went back to the main Dashboard page and clicked the Rebalance dataset button, which moves 20% of the training data to the test data bin. 

Training your models

So now that we have plenty of training data, it’s time to do something with it, namely train a model. The first block in the impulse is an Image Data block, and it scales each image to a size of 320 by 320 pixels. Next, image data is fed to the Image processing block which takes the raw RGB data and derives features from it.

Edge Impulse Machine Learning screengrab

Finally, these features are sent to the Transfer Learning Object Detection model which learns to recognize the objects. I set my model to train for 30 cycles at a learning rate of .15, but this can be adjusted to fine-tune the accuracy.

As you can see from the screenshot below, the model I trained was able to achieve an initial accuracy of 35.4%, but after some fine-tuning, it was able to correctly recognize objects at an accuracy of 73.5%.

Edge Impulse Machine Learning screengrab

Testing and deploying your models

In order to verify that the model works correctly in the real world, we’ll need to deploy it to our Raspberry Pi 4. This is a simple task thanks to the Edge Impulse CLI, as all we have to do is run 

$ edge-impulse-linux-runner

which downloads the model and creates a local webserver. From here, we can open a browser tab and visit the address listed after we run the command to see a live camera feed and any objects that are currently detected. 

Integrating your models with Microsoft Azure IoT 

With the model working locally on the device, let’s add an integration with an Azure IoT Hub that will allow our Raspberry Pi to send messages to the cloud. First, make sure you’ve installed the Azure CLI and have signed in using az login. Then get the name of the resource group you’ll be using for the project. If you don’t have one, you can follow this guide on how to create a new resource group. After that, return to the terminal and run the following commands to create a new IoT Hub and register a new device ID:

$ az iot hub create --resource-group <your resource group> --name <your IoT Hub name>
$ az extension add --name azure-iot
$ az iot hub device-identity create --hub-name <your IoT Hub name> --device-id <your device id>

Retrieve the connection string with 

$ az iot hub device-identity connection-string show --device-id <your device id> --hub-name <your IoT Hub name>
Edge Impulse Machine Learning screengrab

and set it as an environment variable with 

$ export IOTHUB_DEVICE_CONNECTION_STRING="<your connection string here>" 

in your Raspberry Pi’s SSH session, as well as 

$ pip install azure-iot-device

to add the necessary libraries. (Note: if you do not set the environment variable or pass it in as an argument, the program will not work!) The connection string contains the information required for the device to establish a connection with the IoT Hub service and communicate with it. You can then monitor output in the Hub with 

$ az iot hub monitor-events --hub-name <your IoT Hub name> --output table

 or in the Azure Portal.

To make sure it works, download and run this example to make sure you can see the test message. For the second half of deployment, we’ll need a way to customize how our model is used within the code. Thankfully, Edge Impulse provides a Python SDK for this purpose. Install it with 

$ sudo apt-get install libatlas-base-dev libportaudio0 libportaudio2 libportaudiocpp0 portaudio19-dev
$ pip3 install edge_impulse_linux -i https://pypi.python.org/simple

There’s some simple code that can be found here on Github, and it works by setting up a connection to the Azure IoT Hub and then running the model.

Edge Impulse Machine Learning screengrab

Once you’ve either downloaded the zip file or cloned the repo into a folder, get the model file by running

$ edge-impulse-linux-runner --download modelfile.eim

inside of the folder you just created from the cloning process. This will download a file called modelfile.eim. Now, run the Python program with 

$ python lug_nut_counter.py ./modelfile.eim -c <LUG_NUT_COUNT>

where <LUG_NUT_COUNT> is the correct number of lug nuts that should be attached to the wheel (you might have to use python3 if both Python 2 and 3 are installed).

Now whenever a wheel is detected the number of lug nuts is calculated. If this number falls short of the target, a message is sent to the Azure IoT Hub.

And by only sending messages when there’s something wrong, we can prevent an excess amount of bandwidth from being taken due to empty payloads.

The possibilities are endless

Imagine utilizing object detection for an industrial task such as quality control on an assembly line, or identifying ripe fruit amongst rows of crops, or detecting machinery malfunction, or remote, battery-powered inferencing devices. Between Edge Impulse, hardware like Raspberry Pi, and the Microsoft Azure IoT Hub, you can design endless models and deploy them on every device, while authenticating each and every device with built-in security.

You can set up individual identities and credentials for each of your connected devices to help retain the confidentiality of both cloud-to-device and device-to-cloud messages, revoke access rights for specific devices, transmit code and services between the cloud and the edge, and benefit from advanced analytics on devices running offline or with intermittent connectivity. And if you’re really looking to scale your operation and enjoy a complete dashboard view of the device fleets you manage, it is also possible to receive IoT alerts in Microsoft’s Connected Field Service from Azure IoT Central – directly.

Feel free to take the code for this project hosted here on GitHub and create a fork or add to it.

The complete project is available here. Let us know your thoughts at [email protected]. There are no limits, just your imagination at work.

The post IoT gets a machine learning boost, from edge to cloud appeared first on Raspberry Pi.

Building a Data Pipeline for Tracking Sporting Events Using AWS Services

Post Syndicated from Ashwini Rudra original https://aws.amazon.com/blogs/architecture/building-a-data-pipeline-for-tracking-sporting-events-using-aws-services/

In an evolving world that is increasingly connected, data-centric, and fast-paced, the sports industry is no exception. Amazon Web Services (AWS) has been helping customers in the sports industry gain real-time insights through analytics. You can re-invent and reimagine the fan experience by tracking sports actions and activities. In this blog post, we will highlight common architectural and design patterns for building a data pipeline to track sporting events in real time.

The sports industry is largely comprised of two subsegments: participatory and spectator sports. Participatory sports, for example fitness, golf, boating, and skiing, comprise the largest share of the market. Spectator sports, such as teams/clubs/leagues, individual sports, and racing, are expected to be the fastest growing segment. Sports teams/leagues/clubs comprise the largest share of the Spectator sports segment, and is growing most rapidly.

IoT data pipeline architecture overview

Let’s discuss the infrastructure in three parts:

  1. Infrastructure at the arena itself
  2. Processing data using AWS services
  3. Leveraging this analysis using a graphics overlay (this can be especially useful for broadcasters, OTT channels, and arena users)

Data-gathering devices

Radio-frequency identification (RFID) chips or IoT devices can be worn by players or embedded in the playing equipment. These devices emit 20–50 messages per second. These messages are collected and output using JSON. This information may include player coordinate positions, player speed, statistics, health information, or more. To process the game, leagues, coaches, or broadcasters can analyze this data using analytics tools and/or machine learning.

Figure 1. Data pipeline architecture using AWS Services

Figure 1. Data pipeline architecture using AWS Services

Processing data, feature engineering, and model training at AWS

Use serverless services from AWS when possible in order to keep your solution scalable and cost-efficient. This also helps with operational overhead for teams. You can use the Kinesis family of services for stream ingestion and processing. The streaming data from hundreds to thousands of IoT sources (from equipment and clothing) can be fed to Amazon Kinesis Data Streams (KDS). KDS and Amazon Kinesis Data Firehose provide a buffering mechanism for streaming data before it lands on Amazon Simple Storage Service (S3). With Amazon Kinesis Data Analytics, you can process and analyze Kinesis stream data using powerful SQL, Apache Flink, or Beam. Kinesis Data Analytics also supports building applications in SQL, Java, Scala, and Python. With this service, you can quickly author and run powerful SQL code against Amazon Kinesis Streams as your source. This way you can perform time series analytics, feed real-time dashboards, and create real-time metrics. Read more about Amazon Kinesis Data Analytics for SQL Applications.

You might want to transform or enhance the streaming data before it is delivered to Amazon S3. Amazon Kinesis Data Firehose can be used with an AWS Lambda function to do the transformation. Let’s say you have a player prediction timestamp that you want to represent in a different time format to different ML algorithms. Lambda can process and transform this data. Kinesis Data Firehose will deliver the transformed and raw data to the destination (Amazon S3). This can occur after the specific buffering size or when the buffering interval is reached, whichever happens first.

For more complex transformations, AWS Glue can be used. For example, once the data lands in Amazon S3, you can start preparing and aggregating the training dataset using Amazon SageMaker Data Wrangler. As part of the feature engineering process, you can do the following:

  • Transform the data
  • Delete unneeded columns
  • Impute missing values
  • Perform label encoding
  • Use the quick model option to get a sense of which features are adding predictive power as you progress with your data preparation

All the data preparation and feature engineering tasks can be performed from Data Wrangler’s single visual interface.

Once data is prepared in Amazon S3, Amazon SageMaker can be used for model training. In soccer, you can predict a goal percentage based on the player’s position, acceleration, and past performance history.  SageMaker provides several built-in algorithms that can be trained. For real-time predictions, Amazon API Gateway provides an API layer to clients like an OTT, broadcasting service, or a web browser. API Gateway can invoke a Lambda function, with logic to call a SageMaker endpoint and persist the output to the database. This data can be used later on for further analysis or to fine-tune your models.

Figure 2. Deliver real-time prediction using SageMaker

Figure 2. Deliver real-time prediction using Amazon SageMaker

Computer vision-based object detection techniques can be very useful in Sports. These techniques use deep learning algorithms to predict the pass probability, game player face-off, or win prediction. For the sports industry, object detection technology like these are crucial. They obviate the need for sensors. Real-time object identification can be used to:

  • Generate new advanced analytics regarding player and team performance
  • Aid game officials in making correct calls
  • Provide fans an improved and more data-rich viewing experience

Read Football tracking in the NFL with Amazon SageMaker for more information on how to track using broadcast video data. Using SageMaker, you can train object detection models that analyze thousands of images. You can then locate and classify the football itself, and distinguish it from background objects.

Creating a graphics overlay

When you have the ML inference data and video ingestion ready, you may want to represent this data on your broadcasted video. The graphic overlay feature lets you insert an image (a BMP, PNG, or TGA file) at a specified time. It is displayed as a static overlay on the underlying video for a specified duration. The motion graphic overlay feature lets you insert an animation (a MOV or SWF file, or a series of PNG files) on the underlying video. This can be displayed at a specified time for a specified duration.

For example, a player’s motion prediction can be inserted on video during a game, through a RESTful API call of ML inferences. You can use AWS Elemental Live to achieve this. Read about AWS Elemental Live Graphic Overlay at AWS documentation.

Reducing latency

You may want to reduce latency for analytics such as for player health and safety. Use video, data, or machine learning processing at the arena using AWS Outposts. You can also use AWS Wavelength along with 5G infrastructure. For more information, read Catch Important Moments in Sports with 5G and AWS Wavelength.

Summary

In this blog, we’ve highlighted how customers in the sports industry are using AWS to increase the quality of the game, and enhance the sports fan’s experience. The following benefits can be achieved by building a data pipeline for tracking sporting events using AWS services:

  • Amazon Kinesis collects, processes, and analyzes in-game streaming data in real time. This way both teams and fans get timely insights and can react quickly to new information.
  • The serverless nature of this architecture enables a cost-effective, scalable, and operationally efficient environment for customers.
  • Amazon Machine Learning services like Amazon SageMaker can be used to enrich the fan viewing experience. It presents in-game predictions such as who will score next, or which team will win the game.

Visit our AWS Sports Partnerships page for more information on how AWS is changing the game.

Audit Your Supply Chain with Amazon Managed Blockchain

Post Syndicated from Edouard Kachelmann original https://aws.amazon.com/blogs/architecture/audit-your-supply-chain-with-amazon-managed-blockchain/

For manufacturing companies, visibility into complex supply chain processes is critical to establishing resilient supply chain management. Being able to trace events within a supply chain is key to verifying the origins of parts for regulatory requirements, tracing parts back to suppliers if issues arise, and for contacting buyers if there is a product/part recall.

Traditionally, companies will create their own ledger that can be reviewed and shared with third parties for future audits. However, this process takes time and requires verifying the data’s authenticity. In this blog, we offer a solution to audit your supply chain. Our solution allows supply chain participants to safeguard product authenticity and prevent fraud, increase profitability by driving operational efficiencies, and enhance visibility to minimize disputes across parties.

Benefits of blockchain

Blockchain technology offers a new approach for tracking supply chain events. Blockchains are immutable ledgers that allow you to cryptographically prove that, since being written, each transaction remains unchanged. For a supply chain, this immutability is beneficial from a process standpoint. Auditing a supply chain becomes much simpler when you are certain that no one has altered the manufacturing, transportation, storage, or usage history of a given part or product in the time since a failure occurred.

In addition to providing an immutable system of record, many blockchain protocols can run programmable logic written as code in a decentralized manner. This code is often referred to as a “smart contract,” which enables multi-party business logic to run on the blockchain. This means that implementing your supply chain on a blockchain allows members of the network (like retailers, suppliers, etc.) to process transactions that only they are authorized to process.

Benefits of Amazon Managed Blockchain

Amazon Managed Blockchain allows customers to join either private Hyperledger Fabric networks or the Public Ethereum network. On Managed Blockchain, you are relieved of the undifferentiated heavy lifting associated with creating, configuring, and managing the underlying infrastructure for a Hyperledger Fabric network. Instead, you can focus your efforts on mission-critical value drivers like building consortia or developing use case specific components. This allows you to create and manage a scalable Hyperledger Fabric network that multiple organizations can join from their AWS account.

IoT-enabled supply chain architecture

Organizations within the Industrial Internet of Things (IIoT) space want solutions that allow them to monitor and audit their supply chain for strict quality control and accurate product tracking. Using AWS IoT will allow you to realize operational efficiency at scale. The IoT-enabled equipment on their production plant floor records data such as load, pressure, temperature, humidity, and assembly metrics through multiple sensors. Data can be transmitted in real time directly to the cloud or through an on-premises AWS Internet of Things (IoT) gateway (such as any AWS IoT Greengrass compatible hardware) into AWS IoT for storage and analytics. These devices or IoT gateway will then send MQTT messages to the AWS IoT Core endpoint.

This solution provides a pipeline to ingest data provided by IoT. It stores this data in a private blockchain network that is only accessible within member organizations. This is your immutable single source of truth for future audits. In this solution, the Hyperledger Fabric network on Managed Blockchain includes two members, but it can be extended to additional organizations that are part of the supply chain as needed.

Reference architecture for an IoT-enabled supply chain consisting of a retailer and a manufacturer

Figure 1. Reference architecture for an IoT-enabled supply chain consisting of a retailer and a manufacturer

The components of this solution are:

  • IoT enabled sensors – These sensors are directly mounted on each piece of factory equipment throughout the supply chain. They publish data to the IoT gateway. For testing purposes, you can start with the IoT Device Simulator solution to create and simulate hundreds of connected devices.
  • AWS IoT Greengrass (optional) – This gateway provides a secure way to seamlessly connect your edge devices to any AWS service. It also enables local processing, messaging, data management, machine learning (ML) inference, and offers pre-built components such as protocol conversion to MQTT if your sensors only have an OPCUA or Modbus interface.
  • AWS IoT Core – AWS IoT Core subscribes to IoT topics published by the IoT devices or gateway and ingests data into the AWS Cloud for analysis and storage.
  • AWS IoT rule – Rules give your devices the ability to interact with AWS services. Rules are analyzed and actions are performed based on the MQTT topic stream. Here, we initiate a serverless Lambda function to extract, transform, and publish data to the Fabric Client. We could use another rule for HTTPS endpoint to directly address requests to a private API Gateway.
  • Amazon API Gateway – The API Gateway provides a REST interface to invoke the AWS Lambda function for each of the API routes deployed. API Gateway allows you to handle request authorization and authentication, before passing the request on to Lambda.
  • AWS Lambda for the Fabric Client – Using AWS Lambda with the Hyperledger Fabric SDK installed as a dependency, you can communicate with your Hyperledger Fabric Peer Node(s) to write and read data from the blockchain. The peer nodes run smart contracts (referred to as chaincode in Hyperledger Fabric), endorse transactions, and store a local copy of the ledger.
  • Managed Blockchain – Managed Blockchain is a fully managed service for creating and managing blockchain networks and network resources using open-source frameworks. In our solution, an endpoint within the customer virtual private cloud (VPC) is used for the Fabric Client. It interacts with your Hyperledger Fabric network on Managed Blockchain components that run within a VPC for your Managed Blockchain network.
    • Peer node – A peer node endorses blockchain transactions and stores the blockchain ledger. In production, we recommend creating a second peer node in another Availability Zone to serve as a fallback if the first peer becomes unavailable.
    • Certificate Authority – Every user who interacts with the blockchain must first register and enroll with their certificate authority.

Choosing a Hyperledger Fabric edition

Edition Network size Max. # of members Max. # of peer nodes per member Max # of channels per network Transaction throughput and availability
Starter Test or small production 5 2 3 Lower
Standard Large production 14 3 8 Higher

Our solution allows multiple parties to write and query data on a private Hyperledger Fabric blockchain managed by Amazon Managed Blockchain. This enhances consumer experience by reducing the overall effort and complexity with getting insight into supply chain transactions.

Conclusion

In this post, we showed you how Managed Blockchain, as well as other AWS services such as AWS IoT, can provide value to your business. The IoT-enabled supply chain architecture gives you a blueprint to realize that value. The value not only stems from the benefits of having a trustworthy and transparent supply chain, but also from the reliable, secure and scalable services that AWS provides.

Further reading

Enhancing Existing Building Systems with AWS IoT Services

Post Syndicated from Lewis Taylor original https://aws.amazon.com/blogs/architecture/enhancing-existing-building-systems-with-aws-iot-services/

With the introduction of cloud technology and by extension the rapid emergence of Internet of Things (IoT), the barrier to entry for creating smart building solutions has never been lower. These solutions offer commercial real estate customers potential cost savings and the ability to enhance their tenants’ experience. You can differentiate your business from competitors by offering new amenities and add new sources of revenue by understanding more about your buildings’ operations.

There are several building management systems to consider in commercial buildings, such as air conditioning, fire, elevator, security, and grey/white water. Each system continues to add more features and become more automated, meaning that control mechanisms use all kinds of standards and protocols. This has led to fragmented building systems and inefficiency.

In this blog, we’ll show you how to use AWS for the Edge to bring these systems into one data path for cloud processing. You’ll learn how to use AWS IoT services to review and use this data to build smart building functions. Some common use cases include:

  • Provide building facility teams a holistic view of building status and performance, alerting them to problems sooner and helping them solve problems faster.
  • Provide a detailed record of the efficiency and usage of the building over time.
  • Use historical building data to help optimize building operations and predict maintenance needs.
  • Offer enriched tenant engagement through services like building control and personalized experiences.
  • Allow building owners to gather granular usage data from multiple buildings so they can react to changing usage patterns in a single platform.

Securely connecting building devices to AWS IoT Core

AWS IoT Core supports connections with building devices, wireless gateways, applications, and services. Devices connect to AWS IoT Core to send and receive data from AWS IoT Core services and other devices. Buildings often use different device types, and AWS IoT Core has multiple options to ingest data and enabling connectivity within your building. AWS IoT Core is made up of the following components:

  • Device Gateway is the entry point for all devices. It manages your device connections and supports HTTPS and MQTT (3.1.1) protocols.
  • Message Broker is an elastic and fully managed pub/sub message broker that securely transmits messages (for example, device telemetry data) to and from all your building devices.
  • Registry is a database of all your devices and associated attributes and metadata. It allows you to group devices and services based upon attributes such as building, software version, vendor, class, floor, etc.

The architecture in Figure 1 shows how building devices can connect into AWS IoT Core. AWS IoT Core supports multiple connectivity options:

  • Native MQTT – Multiple building management systems or device controllers have MQTT support immediately.
  • AWS IoT Device SDK – This option supports MQTT protocol and multiple programming languages.
  • AWS IoT Greengrass – The previous options assume that devices are connected to the internet, but this isn’t always possible. AWS IoT Greengrass extends the cloud to the building’s edge. Devices can connect directly to AWS IoT Greengrass and send telemetry to AWS IoT Core.
  • AWS for the Edge partner products – There are several partner solutions, such as Ignition Edge from Inductive Automation, that offer protocol translation software to normalize in-building sensor data.
Data ingestion options from on-premises devices to AWS

Figure 1. Data ingestion options from on-premises devices to AWS

Challenges when connecting buildings to the cloud

There are two common challenges when connecting building devices to the cloud:

  • You need a flexible platform to aggregate building device communication data
  • You need to transform the building data to a standard protocol, such as MQTT

Building data is made up of various protocols and formats. Many of these are system-specific or legacy protocols. To overcome this, we suggest processing building device data at the edge, extracting important data points/values before transforming to MQTT, and then sending the data to the cloud.

Transforming protocols can be complex because they can abstract naming and operation types. AWS IoT Greengrass and partner products such as Ignition Edge make it possible to read that data, normalize the naming, and extract useful information for device operation. Combined with AWS IoT Greengrass, this gives you a single way to validate the building device data and standardize its processing.

Using building data to develop smart building solutions

The architecture in Figure 2 shows an in-building lighting system. It is connected to AWS IoT Core and reports on devices’ status and gives users control over connected lights.

The architecture in Figure 2 has two data paths, which we’ll provide details on in the following sections, but here’s a summary:

  1. The “cold” path gathers all incoming data for batch data analysis and historical dashboarding.
  2. The “warm” bidirectional path is for faster, real-time data. It gathers devices’ current state data. This path is used by end-user applications for sending control messages, real-time reporting, or initiating alarms.
Figure 2. Architecture diagram of a building lighting system connected to AWS IoT Core

Figure 2. Architecture diagram of a building lighting system connected to AWS IoT Core

Cold data path

The cold data path gathers all lighting device telemetry data, such as power consumption, operating temperature, health data, etc. to help you understand how the lighting system is functioning.

Building devices can often deliver unstructured, inconsistent, and large volumes of data. AWS IoT Analytics helps clean up this data by applying filters, transformations, and enrichment from other data sources before storing it. By using Amazon Simple Storage Service (Amazon S3), you can analyze your data in different ways. Here we use Amazon Athena and Amazon QuickSight for building operational dashboard visualizations.

Let’s discuss a real-world example. For building lighting systems, understanding your energy consumption is important for evaluating energy and cost efficiency. Data ingested into AWS IoT Core can be stored long term in Amazon S3, making it available for historical reporting. Athena and QuickSight can quickly query this data and build visualizations that show lighting state (on or off) and annual energy consumption over a set period of time. You can also overlay this data with sunrise and sunset data to provide insight into whether you are using your lighting systems efficiently. For example, adjusting the lighting schedule accordingly to the darker winter months versus the brighter summer months.

Warm data path

In the warm data path, AWS IoT Device Shadow service makes the device state available. Shadow updates are forwarded by an AWS IoT rule into downstream services such an AWS IoT Event, which tracks and monitors multiple devices and data points. Then it initiates actions based on specific events. Further, you could build APIs that interact with AWS IoT Device Shadow. In this architecture, we have used AWS AppSync and AWS Lambda to enable building controls via a tenant smartphone application.

Let’s discuss a real-world example. In an office meeting room lighting system, maintaining a certain brightness level is important for health and safety. If that space is unoccupied, you can save money by turning the lighting down or off. AWS IoT Events can take inputs from lumen sensors, lighting systems, and motorized blinds and put them into a detector model. This model calculates and prompts the best action to maintain the room’s brightness throughout the day. If the lumen level drops below a specific brightness threshold in a room, AWS IoT Events could prompt an action to maintain an optimal brightness level in the room. If an occupancy sensor is added to the room, the model can know if someone is in the room and maintain the lighting state. If that person leaves, it will turn off that lighting. The ongoing calculation of state can also evaluate the time of day or weather conditions. It would then select the most economical option for the room, such as opening the window blinds rather than turning on the lighting system.

Conclusion

In this blog, we demonstrated how to collect and aggregate the data produced by on-premises building management platforms. We discussed how augmenting this data with the AWS IoT Core platform allows for development of smart building solutions such as building automation and operational dashboarding. AWS products and services can enable your buildings to be more efficient while and also provide engaging tenant experiences. For more information on how to get started please check out our getting started with AWS IoT Core developer guide.

Building serverless applications with streaming data: Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-2/

Part 1 introduces the Alleycat application that allows bike racers to compete with each other virtually on home exercise bikes. I explain the application’s functionality, how to deploy to your AWS account, and provide an architectural review.

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously.

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. The workload must continuously ingest and buffer this data, then process and analyze the information to provide analytics and leaderboard content for the frontend application.

In this post, I focus on data ingestion. I compare the two different methods used in Alleycat, and discuss other approaches available. This post refers to Amazon Kinesis Data Streams, the AWS SDK, and AWS IoT Core in the solutions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Using AWS IoT Core to ingest streaming data

AWS IoT Core enables publish-subscribe capabilities for large numbers of client applications. Clients can send data to the backend using the AWS IoT Device SDK, which uses the MQTT standard for IoT messaging. After processing, the backend can publish aggregation and status messages back to the frontend via AWS IoT Core. This service fans out the messages to clients using topics.

When using this approach, note the Quality of Service (QoS) options available. By default, the SDK uses QoS level 0, which means the device does not confirm the message is received. This is intended for workloads that can lose messages occasionally without impacting performance. In Alleycat, if performance metrics are sometimes lost, this does not likely impact the overall end user experience.

For workloads requiring higher reliability, use QoS level 1, which causes the SDK to resend the message until an acknowledgement is received. While there is no additional charge for using QoS level 1, it generally increases the number of messages, which increases the overall cost. You are not charged for the PUBACK acknowledgement message – for more details, read more about AWS IoT Core pricing.

Frontend

In this scenario, the Alleycat frontend application is running on a physical exercise bike. The user selects a racer ID and exercise class and chooses Start Race to join the current virtual race for that class.

Start race UI

Every second, the frontend sends a message containing the cadence and resistance metrics and the current second in the race for the local racer. This message is created as a JSON object in the Home.vue component and sent to the ‘alleycat-publish’ topic:

      const message = {
        uuid: uuidv4(),
        event: this.event,
        deviceTimestamp: Date.now(),
        second: this.currentSecond,
        raceId: RACE_ID,
        name: this.racer.name,
        racerId: this.racer.id,
        classId: this.selectedClassId,
        cadence: this.racer.getCurrentCadence(),
        resistance: this.racer.getCurrentResistance
      }

The IoT.vue component contains the logic for this integration and uses the AWS IoT Device SDK to send and receive messages. On startup, the frontend connects to AWS IoT Core and publishes the messages using an MQTT client:

    bus.$on('publish', (data) => {
      console.log('Publish: ', data)
      mqttClient.publish(topics.publish, JSON.stringify(data))
    })

The SDK automatically attempts to retry in the event of a network disconnection and exposes an error handler to allow custom logic if other errors occur.

Backend

The resources used in the backend are defined using the AWS Serverless Application Model (AWS SAM) and configured in the core setup templates:

Reference architecture

Messages are published to topics in AWS IoT Core, which act as channels of interest. The message broker uses topic names and topic filters to route messages between publishers and subscribers. Incoming messages are routed using rules. Alleycat’s IoT rule routes all incoming messages to a Kinesis stream:

  IotTopicRule:
    Type: AWS::IoT::TopicRule
    Properties:
      RuleName: 'alleycatIngest'
      TopicRulePayload:
        RuleDisabled: 'false'
        Sql: "SELECT * FROM 'alleycat-publish'"
        Actions:
        - Kinesis:
            StreamName: 'alleycat'
            PartitionKey: "${timestamp()}"
            RoleArn: !GetAtt IoTKinesisRole.Arn

  IoTKinesisRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - iot.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: IoTKinesisPutPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: 'kinesis:PutRecord'
                Resource: !GetAtt KinesisStream.Arn

Using the AWS::IoT::TopicRule resource, you can optionally define an error action. This allows you to store messages in a durable location, such as an Amazon S3 bucket, if an error occurs. Errors can occur if a rule does not have permission to access a destination or throttling occurs in a target.

Rules can route matching messages to up to 10 targets. For debugging purposes, you can also enable Amazon CloudWatch Logs, which can help in troubleshoot failed message deliveries. The AWS IoT Core Message Broker allows up to 20,000 publish requests per second – if you need a higher limit for your workload, submit a request to AWS Support.

Using the AWS SDK to ingest streaming data

The Alleycat frontend creates traffic for a single user but there is also a simulator application that can generate messages for up to 1,000 riders. Instead of routing messages using an MQTT client, the simulator uses the AWS SDK to put messages directly into the Kinesis data stream.

The SDK provides a service interface object for Kinesis and two API methods for putting messages into streams: putRecord and putRecords. The first option accepts only a single message but the second enables batching of up to 500 messages per request. This is the preferred option for adding multiple messages, compared with calling putRecord multiple times.

The putRecords API takes parameters as a JSON array of messages:

const params = {
   StreamName: 'alley-cat',
   [{
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hayden\",\"racerId\":0,\"classId\":1,\"cadence\":79.8,\"resistance\":79}",
      "PartitionKey":"1620824038331"
   },
   {
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hubert\",\"racerId\":1,\"classId\":1,\"cadence\":60.4,\"resistance\":60.6}",
      "PartitionKey":"1620824038331"
   }
]}

The SDK automatically base64 encodes the Data attribute, which in this case is the JSON string output from JSON.stringify. In the JavaScript SDK, the putRecords API can return a promise, allowing the code to await the operation:

const result = await kinesis.putRecords(params).promise()

Shards and partition keys

Kinesis data streams consist of one or more shards, which are sequences of data records with a fixed capacity. Each shard can support up to 1,000 records per second for writes, up to maximum total data write rate of 1MB per second. The total capacity of a stream is the total of its shards.

When you send messages to a stream, the partitionKey attribute determines which shard it is routed to. The example application configures a Kinesis data stream with a single shard so the partitionKey attribute has no effect – all messages are routed to the same shard. However, many production applications have more than one shard and use the partitionKey to assign messages to shards.

The partitionKey is hashed by the Kinesis service to route to a shard. This diagram shows how partitionKey values from data producers are hashed by an MD5 function and mapped to individual shards:

MD5 hash process

While you cannot designate a specific shard ID in a message, you can influence the assignment depending on your choice of partitionKey:

  • Random: Using a randomized value results in random hash so messages are randomly sent to different shards. This effectively load balances messages across all available shards.
  • Time-based: A timestamp value may cause groups of messages sent to a single shard, if the messages arrive at the same time. The identical timestamp results in an identical hash.
  • Application-specific: if Alleycat used the classID as a partitionKey, racers in each class would always be routed to the same shard. This could be useful for downstream aggregation logic but would limit the capacity of messages per classID.

Optimizing capacity in a shard

Each shard can ingest data at a rate of 1 MB per second or 1,000 records per second, whichever limit is reached first. Since the payload maximum is 1MB, this could equate to one 1MB message per second. If the payload is larger, you must divide it into smaller pieces to avoid an error. For 1,000 messages, each payload must be under 1 KB on average to fit within the allowed capacity.

The combination of the two payload limits can result in different capacity profiles for a shard:

Capacity profiles in a shard

  1. The data payloads are evenly sized and use the 1 MB per second capacity.
  2. Data payload sizes vary, so the number of messages that can be packed into 1 MB varies per second.
  3. There are a large number of very messages, consuming all 1,000 messages per second. However, the total data capacity used is significantly less than 1 MB.

In the Alleycat application, the average payload size is around 170 bytes. When producing 1,000 messages a second, the workload is only using about 20% of the 1 MB per second limit. Since PUT payload size is a factor in Kinesis pricing, messages that are much smaller than 25 KB are less cost-efficient. Compare these two messaging patterns for the Alleycat application:

Producer message patterns

  1. In this default mode, a smaller message is published once per second. This reduces overall latency but results in higher overall messaging cost.
  2. The client application batches outgoing messages and sends to Kinesis every 5 seconds. This results in lower cost and better packing of messages, but introduces additional latency.

There is a tradeoff between cost and latency when optimizing a shard’s capacity and the decision depends upon the needs of your workload. If the client buffers messages, this adds latency on the client side. This is acceptable in many workloads that collect metrics for archival or asynchronous reporting purchases. However, for low-latency applications like Alleycat, it provides a better experience for the application user to send messages as soon as they are available.

Conclusion

This post focuses on ingesting data into Kinesis Data Streams. I explain the two approaches used by the Alleycat frontend and the simulator application and highlight other approaches that you can use. I show how messages are routed to shards using partition keys. Finally, I explore additional factors to consider when ingesting data, to improve efficiency and reduce cost.

Part 3 covers using Amazon Kinesis Data Firehose for transforming, aggregating, and loading streaming data into data stores. This is used to provide the historical, second-by-second leaderboard for the frontend application.

For more serverless learning resources, visit Serverless Land.

Router Security

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/02/router-security.html

This report is six months old, and I don’t know anything about the organization that produced it, but it has some alarming data about router security.

Conclusion: Our analysis showed that Linux is the most used OS running on more than 90% of the devices. However, many routers are powered by very old versions of Linux. Most devices are still powered with a 2.6 Linux kernel, which is no longer maintained for many years. This leads to a high number of critical and high severity CVEs affecting these devices.

Since Linux is the most used OS, exploit mitigation techniques could be enabled very easily. Anyhow, they are used quite rarely by most vendors except the NX feature.

A published private key provides no security at all. Nonetheless, all but one vendor spread several private keys in almost all firmware images.

Mirai used hard-coded login credentials to infect thousands of embedded devices in the last years. However, hard-coded credentials can be found in many of the devices and some of them are well known or at least easy crackable.

However, we can tell for sure that the vendors prioritize security differently. AVM does better job than the other vendors regarding most aspects. ASUS and Netgear do a better job in some aspects than D-Link, Linksys, TP-Link and Zyxel.

Additionally, our evaluation showed that large scale automated security analysis of embedded devices is possible today utilizing just open source software. To sum it up, our analysis shows that there is no router without flaws and there is no vendor who does a perfect job regarding all security aspects. Much more effort is needed to make home routers as secure as current desktop of server systems.

One comment on the report:

One-third ship with Linux kernel version 2.6.36 was released in October 2010. You can walk into a store today and buy a brand new router powered by software that’s almost 10 years out of date! This outdated version of the Linux kernel has 233 known security vulnerabilities registered in the Common Vulnerability and Exposures (CVE) database. The average router contains 26 critically-rated security vulnerabilities, according to the study.

We know the reasons for this. Most routers are designed offshore, by third parties, and then private labeled and sold by the vendors you’ve heard of. Engineering teams come together, design and build the router, and then disperse. There’s often no one around to write patches, and most of the time router firmware isn’t even patchable. The way to update your home router is to throw it away and buy a new one.

And this paper demonstrates that even the new ones aren’t likely to be secure.

Chinese Supply-Chain Attack on Computer Systems

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/02/chinese-supply-chain-attack-on-computer-systems.html

Bloomberg News has a major story about the Chinese hacking computer motherboards made by Supermicro, Levono, and others. It’s been going on since at least 2008. The US government has known about it for almost as long, and has tried to keep the attack secret:

China’s exploitation of products made by Supermicro, as the U.S. company is known, has been under federal scrutiny for much of the past decade, according to 14 former law enforcement and intelligence officials familiar with the matter. That included an FBI counterintelligence investigation that began around 2012, when agents started monitoring the communications of a small group of Supermicro workers, using warrants obtained under the Foreign Intelligence Surveillance Act, or FISA, according to five of the officials.

There’s lots of detail in the article, and I recommend that you read it through.

This is a follow on, with a lot more detail, to a story Bloomberg reported on in fall 2018. I didn’t believe the story back then, writing:

I don’t think it’s real. Yes, it’s plausible. But first of all, if someone actually surreptitiously put malicious chips onto motherboards en masse, we would have seen a photo of the alleged chip already. And second, there are easier, more effective, and less obvious ways of adding backdoors to networking equipment.

I seem to have been wrong. From the current Bloomberg story:

Mike Quinn, a cybersecurity executive who served in senior roles at Cisco Systems Inc. and Microsoft Corp., said he was briefed about added chips on Supermicro motherboards by officials from the U.S. Air Force. Quinn was working for a company that was a potential bidder for Air Force contracts, and the officials wanted to ensure that any work would not include Supermicro equipment, he said. Bloomberg agreed not to specify when Quinn received the briefing or identify the company he was working for at the time.

“This wasn’t a case of a guy stealing a board and soldering a chip on in his hotel room; it was architected onto the final device,” Quinn said, recalling details provided by Air Force officials. The chip “was blended into the trace on a multilayered board,” he said.

“The attackers knew how that board was designed so it would pass” quality assurance tests, Quinn said.

Supply-chain attacks are the flavor of the moment, it seems. But they’re serious, and very hard to defend against in our deeply international IT industry. (I have repeatedly called this an “insurmountable problem.”) Here’s me in 2018:

Supply-chain security is an incredibly complex problem. US-only design and manufacturing isn’t an option; the tech world is far too internationally interdependent for that. We can’t trust anyone, yet we have no choice but to trust everyone. Our phones, computers, software and cloud systems are touched by citizens of dozens of different countries, any one of whom could subvert them at the demand of their government.

We need some fundamental security research here. I wrote this in 2019:

The other solution is to build a secure system, even though any of its parts can be subverted. This is what the former Deputy Director of National Intelligence Sue Gordon meant in April when she said about 5G, “You have to presume a dirty network.” Or more precisely, can we solve this by building trustworthy systems out of untrustworthy parts?

It sounds ridiculous on its face, but the Internet itself was a solution to a similar problem: a reliable network built out of unreliable parts. This was the result of decades of research. That research continues today, and it’s how we can have highly resilient distributed systems like Google’s network even though none of the individual components are particularly good. It’s also the philosophy behind much of the cybersecurity industry today: systems watching one another, looking for vulnerabilities and signs of attack.

It seems that supply-chain attacks are constantly in the news right now. That’s good. They’ve been a serious problem for a long time, and we need to take the threat seriously. For further reading, I strongly recommend this Atlantic Council report from last summer: “Breaking trust: Shades of crisis across an insecure software supply chain.

Presidential Cybersecurity and Pelotons

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/02/presidential-cybersecurity-and-pelotons.html

President Biden wants his Peloton in the White House. For those who have missed the hype, it’s an Internet-connected stationary bicycle. It has a screen, a camera, and a microphone. You can take live classes online, work out with your friends, or join the exercise social network. And all of that is a security risk, especially if you are the president of the United States.

Any computer brings with it the risk of hacking. This is true of our computers and phones, and it’s also true about all of the Internet-of-Things devices that are increasingly part of our lives. These large and small appliances, cars, medical devices, toys and — yes — exercise machines are all computers at their core, and they’re all just as vulnerable. Presidents face special risks when it comes to the IoT, but Biden has the NSA to help him handle them.

Not everyone is so lucky, and the rest of us need something more structural.

US presidents have long tussled with their security advisers over tech. The NSA often customizes devices, but that means eliminating features. In 2010, President Barack Obama complained that his presidential BlackBerry device was “no fun” because only ten people were allowed to contact him on it. In 2013, security prevented him from getting an iPhone. When he finally got an upgrade to his BlackBerry in 2016, he complained that his new “secure” phone couldn’t take pictures, send texts, or play music. His “hardened” iPad to read daily intelligence briefings was presumably similarly handicapped. We don’t know what the NSA did to these devices, but they certainly modified the software and physically removed the cameras and microphones — and possibly the wireless Internet connection.

President Donald Trump resisted efforts to secure his phones. We don’t know the details, only that they were regularly replaced, with the government effectively treating them as burner phones.

The risks are serious. We know that the Russians and the Chinese were eavesdropping on Trump’s phones. Hackers can remotely turn on microphones and cameras, listening in on conversations. They can grab copies of any documents on the device. They can also use those devices to further infiltrate government networks, maybe even jumping onto classified networks that the devices connect to. If the devices have physical capabilities, those can be hacked as well. In 2007, the wireless features of Vice President Richard B. Cheney’s pacemaker were disabled out of fears that it could be hacked to assassinate him. In 1999, the NSA banned Furbies from its offices, mistakenly believing that they could listen and learn.

Physically removing features and components works, but the results are increasingly unacceptable. The NSA could take Biden’s Peloton and rip out the camera, microphone, and Internet connection, and that would make it secure — but then it would just be a normal (albeit expensive) stationary bike. Maybe Biden wouldn’t accept that, and he’d demand that the NSA do even more work to customize and secure the Peloton part of the bicycle. Maybe Biden’s security agents could isolate his Peloton in a specially shielded room where it couldn’t infect other computers, and warn him not to discuss national security in its presence.

This might work, but it certainly doesn’t scale. As president, Biden can direct substantial resources to solving his cybersecurity problems. The real issue is what everyone else should do. The president of the United States is a singular espionage target, but so are members of his staff and other administration officials.

Members of Congress are targets, as are governors and mayors, police officers and judges, CEOs and directors of human rights organizations, nuclear power plant operators, and election officials. All of these people have smartphones, tablets, and laptops. Many have Internet-connected cars and appliances, vacuums, bikes, and doorbells. Every one of those devices is a potential security risk, and all of those people are potential national security targets. But none of those people will get their Internet-connected devices customized by the NSA.

That is the real cybersecurity issue. Internet connectivity brings with it features we like. In our cars, it means real-time navigation, entertainment options, automatic diagnostics, and more. In a Peloton, it means everything that makes it more than a stationary bike. In a pacemaker, it means continuous monitoring by your doctor — and possibly your life saved as a result. In an iPhone or iPad, it means…well, everything. We can search for older, non-networked versions of some of these devices, or the NSA can disable connectivity for the privileged few of us. But the result is the same: in Obama’s words, “no fun.”

And unconnected options are increasingly hard to find. In 2016, I tried to find a new car that didn’t come with Internet connectivity, but I had to give up: there were no options to omit that in the class of car I wanted. Similarly, it’s getting harder to find major appliances without a wireless connection. As the price of connectivity continues to drop, more and more things will only be available Internet-enabled.

Internet security is national security — not because the president is personally vulnerable but because we are all part of a single network. Depending on who we are and what we do, we will make different trade-offs between security and fun. But we all deserve better options.

Regulations that force manufacturers to provide better security for all of us are the only way to do that. We need minimum security standards for computers of all kinds. We need transparency laws that give all of us, from the president on down, sufficient information to make our own security trade-offs. And we need liability laws that hold companies liable when they misrepresent the security of their products and services.

I’m not worried about Biden. He and his staff will figure out how to balance his exercise needs with the national security needs of the country. Sometimes the solutions are weirdly customized, such as the anti-eavesdropping tent that Obama used while traveling. I am much more worried about the political activists, journalists, human rights workers, and oppressed minorities around the world who don’t have the money or expertise to secure their technology, or the information that would give them the ability to make informed decisions on which technologies to choose.

This essay previously appeared in the Washington Post.

Building a Controlled Environment Agriculture Platform

Post Syndicated from Ashu Joshi original https://aws.amazon.com/blogs/architecture/building-a-controlled-environment-agriculture-platform/

This post was co-written by Michael Wirig, Software Engineering Manager at Grōv Technologies.

A substantial percentage of the world’s habitable land is used for livestock farming for dairy and meat production. The dairy industry has leveraged technology to gain insights that have led to drastic improvements and are continuing to accelerate. A gallon of milk in 2017 involved 30% less water, 21% less land, a 19% smaller carbon footprint, and 20% less manure than it did in 2007 (US Dairy, 2019). By focusing on smarter water usage and sustainable land usage, livestock farming can grow to provide sustainable and nutrient-dense food for consumers and livestock alike.

Grōv Technologies (Grōv) has pioneered the Olympus Tower Farm, a fully automated Controlled Environment Agriculture (CEA) system. Unique amongst vertical farming startups, Grōv is growing cattle feed to improve that sustainable use of land for livestock farming while increasing the economic margins for dairy and beef producers.

The challenges of CEA

The set of growing conditions for a CEA is called a “recipe,” which is a combination of ingredients like temperature, humidity, light, carbon dioxide levels, and water. The optimal recipe is dynamic and is sensitive to its ingredients. Crops must be monitored in near-real time, and CEAs should be able to self-correct in order to maintain the recipe. To build a system with these capabilities requires answers to the following questions:

  • What parameters are needed to measure for indoor cattle feed production?
  • What sensors enable the accuracy and price trade-offs at scale?
  • Where do you place the sensors to ensure a consistent crop?
  • How do you correlate the data from sensors to the nutrient value?

To progress from a passively monitored system to a self-correcting, autonomous one, the CEA platform also needs to address:

  • How to maintain optimum crop conditions
  • How the system can learn and adapt to new seed varieties
  • How to communicate key business drivers such as yield and dry matter percentage

Grōv partnered with AWS Professional Services (AWS ProServe) to build a digital CEA platform addressing the challenges posed above.

Olympus Tower - Grov Technologies

Tower automation and edge platform

The Olympus Tower is instrumented for measuring recipe ingredients by combining the mechanical, electrical, and domain expertise of the Grōv team with the IoT edge and sensor expertise of the AWS ProServe team. The teams identified a primary set of features such as height, weight, and evenness of the growth to be measured at multiple stages within the Tower. Sensors were also added to measure secondary features such as water level, water pH, temperature, humidity, and carbon dioxide.

The teams designed and developed a purpose-built modular and industrial sensor station. Each sensor station has sensors for direct measurement of the features identified. The sensor stations are extended to support indirect measurement of features using a combination of Computer Vision and Machine Learning (CV/ML).

The trays with the growing cattle feed circulate through the Olympus Tower. A growth cycle starts on a tray with seeding, circulates through the tower over the cycle, and returns to the starting position to be harvested. The sensor station at the seeding location on the Olympus Tower tags each new growth cycle in a tray with a unique “Grow ID.” As trays pass by, each sensor station in the Tower collects the feature data. The firmware, jointly developed for the sensor station, uses AWS IoT SDK to stream the sensor data along with the Grow ID and metadata that’s specific to the sensor station. This information is sent every five minutes to an on-site edge gateway powered by AWS IoT Greengrass. Dedicated AWS Lambda functions manage the lifecycle of the Grow IDs and the sensor data processing on the edge.

The Grōv team developed AWS Greengrass Lambda functions running at the edge to ingest critical metrics from the operation automation software running the Olympus Towers. This information provides the ability to not just monitor the operational efficiency, but to provide the hooks to control the feedback loop.

The two sources of data were augmented with site-level data by installing sensor stations at the building level or site level to capture environmental data such as weather and energy consumption of the Towers.

All three sources of data are streamed to AWS IoT Greengrass and are processed by AWS Lambda functions. The edge software also fuses the data and correlates all categories of data together. This enables two major actions for the Grōv team – operational capability in real-time at the edge and enhanced data streamed into the cloud.

Grov Technologies - Architecture

Cloud pipeline/platform: analytics and visualization

As the data is streamed to AWS IoT Core via AWS IoT Greengrass. AWS IoT rules are used to route ingested data to store in Amazon Simple Sotrage Service (Amazon S3) and Amazon DynamoDB. The data pipeline also includes Amazon Kinesis Data Streams for batching and additional processing on the incoming data.

A ReactJS-based dashboard application is powered using Amazon API Gateway and AWS Lambda functions to report relevant metrics such as daily yield and machine uptime.

A data pipeline is deployed to analyze data using Amazon QuickSight. AWS Glue is used to create a dataset from the data stored in Amazon S3. Amazon Athena is used to query the dataset to make it available to Amazon QuickSight. This provides the extended Grōv tech team of research scientists the ability to perform a series of what-if analyses on the data coming in from the Tower Systems beyond what is available in the react-based dashboard.

Data pipeline - Grov Technologies

Completing the data-driven loop

Now that the data has been collected from all sources and stored it in a data lake architecture, the Grōv CEA platform established a strong foundation for harnessing the insights and delivering the customer outcomes using machine learning.

The integrated and fused data from the edge (sourced from the Olympus Tower instrumentation, Olympus automation software data, and site-level data) is co-related to the lab analysis performed by Grōv Research Center (GRC). Harvest samples are routinely collected and sent to the lab, which performs wet chemistry and microbiological analysis. Trays sent as samples to the lab are associated with the results of the analysis with the sensor data by corresponding Grow IDs. This serves as a mechanism for labeling and correlating the recipe data with the parameters used by dairy and beef producers – dry matter percentage, micro and macronutrients, and the presence of myco-toxins.

Grōv has chosen Amazon SageMaker to build a machine learning pipeline on its comprehensive data set, which will enable fine tuning the growing protocols in near real-time. Historical data collection unlocks machine learning use cases for future detection of anomalous sensors readings and sensor health monitoring, as well.

Because the solution is flexible, the Grōv team plans to integrate data from animal studies on their health and feed efficiency into the CEA platform. Machine learning on the data from animal studies will enhance the tuning of recipe ingredients that impact the animals’ health. This will give the farmer an unprecedented view of the impact of feed nutrition on the end product and consumer.

Conclusion

Grōv Technologies and AWS ProServe have built a strong foundation for an extensible and scalable architecture for a CEA platform that will nourish animals for better health and yield, produce healthier foods and to enable continued research into dairy production, rumination and animal health to empower sustainable farming practices.

Announcing AWS IoT Greengrass 2.0 – With an Open Source Edge Runtime and New Developer Capabilities

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/announcing-aws-iot-greengrass-2-0-with-an-open-source-edge-runtime-and-new-developer-capabilities/

I am happy to announce AWS IoT Greengrass 2.0, a new version of AWS IoT Greengrass that makes it easy for device builders to build, deploy, and manage intelligent device software. AWS IoT Greengrass 2.0 provides an open source edge runtime, a rich set of pre-built software components, tools for local software development, and new features for managing software on large fleets of devices.

 

AWS IoT Greengrass 2.0 edge runtime is now open source under an Apache 2.0 license, and available on Github. Access to the source code allows you to more easily integrate your applications, troubleshoot problems, and build more reliable and performant applications that use AWS IoT Greengrass.

You can add or remove pre-built software components based on your IoT use case and your device’s CPU and memory resources. For example, you can choose to include pre-built AWS IoT Greengrass components such as stream manager only when you need to process data streams with your application, or machine learning components only when you want to perform machine learning inference locally on your devices.

The AWS IoT Greengrass IoT Greengrass 2.0 includes a new command-line interface (CLI) that allows you to locally develop and debug applications on your device. In addition, there is a new local debug console that helps you visually debug applications on your device. With these new capabilities, you can rapidly develop and debug code on a test device before using the cloud to deploy to your production devices.

AWS IoT Greengrass 2.0 is also integrated with AWS IoT thing groups, enabling you to easily organize your devices in groups and manage application deployments across your devices with features to control rollout rates, timeouts, and rollbacks.

AWS IoT Greengrass 2.0 – Getting Started
Device builders can use AWS IoT Greengrass 2.0 by going to the AWS IoT Greengrass console where you can find a download and install command that you run on your device. Once the installer is downloaded to the device, you can use it to install Greengrass software with all essential features, register the device as an AWS IoT Thing, and create a simple “hello world” software component in less than 10 minutes.

To get started in the AWS IoT Greengrass console, you first register a test device by clicking Set up core device. You assign the name and group of your core device. To deploy to only the core device, select No group. In the next step, install the AWS IoT Greengrass Core software in your device.

When the installer completes, you can find your device in the list of AWS IoT Greengrass Core devices on the Core devices page.

AWS IoT Greengrass components enable you to develop and deploy software to your AWS IoT Greengrass Core devices. You can write your application functionality and bundle it as a private component for deployment. AWS IoT Greengrass also provides public components, which provide pre-built software for common use cases that you can deploy to your devices as you develop your device software. When you finish developing the software for your component, you can register it with AWS IoT Greengrass. Then, you can deploy and run the component on your AWS IoT Greengrass Core devices.

 

To create a component, click the Create component button on the Components page. You can use a recipe or import an AWS Lambda function. The component recipe is a YAML or JSON file that defines the component’s details, dependencies, compatibility, and lifecycle. To learn about the specifications, visit the recipe reference guide.

Here is an example of a YAML recipe.

When you finish developing your component, you can add it to a deployment configuration to deploy to one or more core devices. To create a new deployment or configure the components to deploy to core devices, click the Create button on the Deployments page. You can deploy to a core device or a thing group as a target, and select the components to deploy. The deployment includes the dependencies for each component that you select.

 

You can edit the version and parameters of selected components and advanced settings such as the rollout configuration, which defines the rate at which the configuration deploys to the target devices; timeout configuration, which defines the duration that each device has to apply the deployment; or cancel configuration, which defines when to automatically stop the deployment.

Moving to AWS IoT Greengrass 2.0
Existing devices running AWS IoT Greengrass 1.x will continue to run without any changes. If you want to take advantage of new AWS IoT Greengrass 2.0 features, you will need to move your existing AWS IoT Greengrass 1.x devices and workloads to AWS IoT Greengrass 2.0. To learn how to do this, visit the migration guide.

After you move your 1.x applications over, you can start adding components to your applications using new version 2 features, while leaving your version 1 code as-is until you decide to update them.

AWS IoT Greengrass 2.0 Partners
At launch, industry-leading partners NVIDIA and NXP have qualified a number of their devices for AWS IoT Greengrass 2.0:

See all partner device listings in the AWS Partner Device Catalog. To learn about getting your device qualified, visit the AWS Device Qualification Program.

Available Now
AWS IoT Greengrass 2.0 is available today. Please see the AWS Region table for all the regions where AWS IoT Greengrass is available. For more information, see the developer guide.

Starting today, to help you evaluate, test, and develop with this new release of AWS IoT Greengrass, the first 1,000 devices in your account will not incur any AWS IoT Greengrass charges until December 31, 2021. For pricing information, check out the AWS IoT Greengrass pricing page.

Give it a try, and please send us feedback through your usual AWS Support contacts or the AWS forum for AWS IoT Greengrass.

Learn all the details about AWS IoT Greengrass 2.0 and get started with the new version today.

Channy

New – AWS IoT Core for LoRaWAN to Connect, Manage, and Secure LoRaWAN Devices at Scale

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-aws-iot-core-for-lorawan-to-connect-manage-and-secure-lorawan-devices-at-scale/

Today, I am happy to announce AWS IoT Core for LoRaWAN, a new fully-managed feature that allows AWS IoT Core customers to connect and manage wireless devices that use low-power long-range wide area network (LoRaWAN) connectivity with the AWS Cloud.

Using AWS IoT Core for LoRaWAN, customers can now set up a private LoRaWAN network by connecting their own LoRaWAN devices and gateways to the AWS Cloud – without developing or operating a LoRaWAN Network Server (LNS) by themselves. The LNS is required to manage LoRaWAN devices and gateways’ connection to the cloud; gateways serve as a bridge and carry device data to and from the LNS, usually over Wi-Fi or Ethernet.

This allows customers to eliminate the undifferentiated work and operational burden of managing an LNS, and enables them to easily and quickly connect and secure LoRaWAN device fleets at scale.

Combined with the long range and deep in-building coverage provided by LoRa technology, AWS IoT Core now enables customers to accelerate IoT application development using AWS services and acting on the data generated easily from connected LoRaWAN devices.

Customers – mostly enterprises – need to develop IoT applications using devices that transmit data over long range (1-3 miles of urban coverage or up to 10 miles for line-of-sight) or through the walls and floors of buildings, for example for real-time asset tracking at airports, remote temperature monitoring in buildings, or predictive maintenance of industrial equipment. Such applications also require devices to be optimized for low-power consumption, so that batteries can last several years without replacement, thus making the implementation cost-effective. Given the extended coverage of LoRaWAN connectivity, it is attractive to enterprises for these use cases, but setting up LoRaWAN connectivity in a privately managed site requires customers to operate an LNS.

With AWS IoT Core for LoRaWAN, you can connect LoRaWAN devices and gateways to the cloud with a few simple steps in the AWS IoT Management Console, thus speeding up the network setup time, and connect off-the-shelf LoRaWAN devices, without any requirement to modify embedded software, for a plug and play experience.

AWS IoT Core for LoRaWAN – Getting Started
Getting started with a LoRaWAN network setup is easy. You can find AWS IoT Core for LoRaWAN qualified gateways and developer kits from the AWS Partner Device Catalog. AWS qualified gateways and developer kits are pre-tested and come with a step by step guide from the manufacturer on how to connect it with AWS IoT Core for LoRaWAN.

With AWS IoT Core console, you can register the gateways by providing a gateway’s unique identifier (provided by the gateway vendor) and selecting LoRa frequency band. For registering devices, you can input device credentials (identifiers and security keys provided by the device vendor) on the console.

Each device has a Device Profile that specifies the device capabilities and boot parameters the LNS requires to set up LoRaWAN radio access service. Using the console, you can select a pre-populated Device Profile or create a new one.

A destination automatically routes messages from LoRaWAN devices to AWS IoT Rules Engine. Once a destination is created, you can use it to map multiple LoRaWAN devices to the same IoT rule. You can write rules using simple SQL queries, to transform and act on the device data, like converting data from proprietary binary to JSON format, raising alerts, or routing it to other AWS services like Amazon Simple Storage Service (S3). From the console, you can also query metrics for connected devices and gateways to troubleshoot connectivity issues.

Available Now
AWS IoT Core for LoRaWAN is available today in US East (N. Virginia) and Europe (Ireland) Regions. With pay-as-you-go pricing and no monthly commitments, you can connect and scale LoRaWAN device fleets reliably, and build applications with AWS services quickly and efficiently. For more information, see the pricing page.

To get started, buy an AWS qualified LoRaWAN developer kit and and launch Getting Started experience in the AWS Management Console. To learn more, visit the developer guide. Give this a try, and please send us feedback either through your usual AWS Support contacts or the AWS forum for AWS IoT.

Learn all the details about AWS IoT Core for LoRaWAN and get started with the new feature today.

Channy

Amazon SageMaker Edge Manager Simplifies Operating Machine Learning Models on Edge Devices

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-edge-manager-simplifies-operating-machine-learning-models-on-edge-devices/

Today, I’m extremely happy to announce Amazon SageMaker Edge Manager, a new capability of Amazon SageMaker that makes it easier to optimize, secure, monitor, and maintain machine learning models on a fleet of edge devices.

Edge computing is certainly one of the most exciting developments in information technology. Indeed, thanks to continued advances in compute, storage, networking, and battery technology, organizations routinely deploy large numbers of embedded devices anywhere on the planet for a wide range of industry applications: manufacturing, energy, agriculture, healthcare, and more. Ranging from simple sensors to large industrial machines, the devices have a common purpose: capture data, analyze it, and act on it, for example send an alert if an unwanted condition is detected.

As machine learning (ML) demonstrated its ability to solve a wide range of business problems, customers tried to apply it to edge applications, training models in the cloud and deploying them at the edge in an effort to extract deeper insights from local data. However, given the remote and constrained nature of edge devices, deploying and managing models at the edge is often quite difficult.

For example, a complex model can be too large to fit, forcing customers to settle for a smaller and less accurate model. Also, predicting with several models on the same device (say, to detect different types of anomalies) may require additional code to load and unload models on demand, in order to conserve hardware resources. Finally, monitoring prediction quality is a major concern, as the real world will always be more complex and unpredictable than any training set can anticipate.

Customers asked us to help them solve these challenges, and we got to work.

Announcing Amazon SageMaker Edge Manager
Amazon SageMaker Edge Manager makes it easy for ML edge developers to use the same familiar tools in the cloud or on edge devices. It reduces the time and effort required to get models to production, while continuously monitoring and improving model quality across your device fleet.

Starting from a model that you trained or imported in Amazon SageMaker, SageMaker Edge Manager first optimizes it for your hardware platform using Amazon SageMaker Neo. Launched two years ago, Neo converts models into an efficient common format which is executed on the device by a low footprint runtime. Neo currently supports devices based on chips manufactured by Ambarella, ARM, Intel, NVIDIA, NXP, Qualcomm, TI, and Xilinx.

Then, SageMaker Edge Manager packages the model, and stores it in Amazon Simple Storage Service (S3), where it can be deployed to your devices. In fact, you can deploy multiple models, loading and predicting with a runtime optimized for your hardware of choice.

On-device models are managed by the SageMaker Edge Manager Manager Agent, which communicates with the AWS Cloud for model deployment, and with your application for model management. Indeed, you can integrate this agent with your application, so that it may automatically load and unload models according to your prediction requests. This enables a variety of scenarios, such as freeing all resources for a large model whenever needed, or working with a collection of smaller models that cohabit in memory.

Lenovo, the #1 global PC maker, recently incorporated Amazon SageMaker into its latest predictive maintenance offering. Igor Bergman, Lenovo Vice President, Cloud & Software of PCs and Smart Devices, told us: “At Lenovo, we’re more than a hardware provider and are committed to being a trusted partner in transforming customers’ device experience and delivering on their business goals. Lenovo Device Intelligence is a great example of how we’re doing this with the power of machine learning, enhanced by Amazon SageMaker. With Lenovo Device Intelligence, IT administrators can proactively diagnose PC issues and help predict potential system failures before they occur, helping to decrease downtime and increase employee productivity. By incorporating Amazon SageMaker Neo, we’ve already seen a substantial improvement in the execution of our on-device predictive models – an encouraging sign for the new Amazon SageMaker Edge Manager that will be added in the coming weeks. SageMaker Edge Manager will help eliminate the manual effort required to optimize, monitor, and continuously improve the models after deployment. With it, we expect our models will run faster and consume less memory than with other comparable machine learning platforms. As we extend AI to new applications across the Lenovo services portfolio, we will continue to require a high-performance pipeline that is flexible and scalable both in the cloud and on millions of edge devices. That’s why we selected the Amazon SageMaker platform. With its rich edge-to-cloud and CI/CD workflow capabilities, we can effectively bring our machine learning models to any device workflow for much higher productivity.

Getting Started
As you can see, SageMaker Edge Manager makes it easier to work with ML models deployed on edge devices. It’s available today in the US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), Europe (Frankfurt), and Asia Pacific (Tokyo) regions.

Sample notebooks are available to get you started right away. Give them a try, and let us know what you think.

We’re always looking forward to your feedback, either through your usual AWS support contacts, or on the AWS Forum for SageMaker.

– Julien

New – Amazon Lookout for Equipment Analyzes Sensor Data to Help Detect Equipment Failure

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-amazon-lookout-for-equipment-analyzes-sensor-data-to-help-detect-equipment-failure/

Companies that operate industrial equipment are constantly working to improve operational efficiency and avoid unplanned downtime due to component failure. They invest heavily and repeatedly in physical sensors (tags), data connectivity, data storage, and building dashboards over the years to monitor the condition of their equipment and get real-time alerts. The primary data analysis methods are single-variable threshold and physics-based modeling approaches, and while these methods are effective in detecting specific failure types and operating conditions, they can often miss important information detected by deriving multivariate relationships for each piece of equipment.

With machine learning, more powerful technologies have become available that can provide data-driven models that learn from an equipment’s historical data. However, implementing such machine learning solutions is time-consuming and expensive owing to capital investment and training of engineers.

Today, we are happy to announce Amazon Lookout for Equipment, an API-based machine learning (ML) service that detects abnormal equipment behavior. With Lookout for Equipment, customers can bring in historical time series data and past maintenance events generated from industrial equipment that can have up to 300 data tags from components such as sensors and actuators per model. Lookout for Equipment automatically tests the possible combinations and builds an optimal machine learning model to learn the normal behavior of the equipment. Engineers don’t need machine learning expertise and can easily deploy models for real-time processing in the cloud.

Customers can then easily perform ML inference to detect abnormal behavior of the equipment. The results can be integrated into existing monitoring software or AWS IoT SiteWise Monitor to visualize the real-time output or to receive alerts if an asset tends toward anomalous conditions.

How Lookout for Equipment Works
Lookout for Equipment reads directly from Amazon S3 buckets. Customers can publish their industrial data in S3 and leverage Lookout for Equipment for model development. A user determines the value or time period to be used for training and assigns an appropriate label. Given this information, Lookout for Equipment launches a task to learn and creates the best ML model for each customer.

Because Lookout for Equipment is an automated machine learning tool, it gets smarter over time as users use Lookout for Equipment to retrain their models with new data. This is useful for model re-creation when new invisible failures occur, or when the model drifts over time. Once the model is complete and can be inferred, Lookout for Equipment provides real-time analysis.

With the equipment data being published to S3, the user can scheduled inference that ranges from 5 minutes to one hour. When the user data arrives in S3, Lookout for Equipment fetches the new data on the desired schedule, performs data inference, and stores the results in another S3 bucket.

Set up Lookout for Equipment with these simply steps:

  1. Upload data to S3 buckets
  2. Create datasets
  3. Ingest data
  4. Create a model
  5. Schedule inference (if you need real-time analysis)

1. Upload data
You need to upload tag data from equipment to any S3 bucket.

2. Create Datasets

Select Create dataset, and set Dataset name, and set Data Schema. Data schema is like a data design document that defines the data to be fed in later. Then select Create.

creating datasets console

3. Ingest data
After a dataset is created, the next step is to ingest data. If you are familiar with Amazon Personalize or Amazon Forecast, doesn’t this screen feel familiar? Yes, Lookout for Equipment is as easy to use as those are.

Select Ingest data.

Ingesting data consoleSpecify the S3 bucket location where you uploaded your data, and an IAM role. The IAM role has to have a trust relationship to “lookoutequipment.amazonaws.com” You can use the following policy file for the test.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lookoutequipment.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The data format in the S3 bucket has to match the Data Schema you set up in step 2. Please check our technical documents for more detail. Ingesting data takes a few minutes to tens of minutes depending on your data volume.

4. Create a model
After data ingest is completed, you can train your own ML model now. Select Create new model. Fields show us a list of fields in the ingested data. By default, no field is selected. You can select fields you want Lookout for Equipment to learn. Lookout for Equipment automatically finds and trains correlations from multiple specified fields and creates a model.

Image illustrates setting up fields.

If you are sure that your data has some unusual data included, you can optionally set the windows to exclude that data.

setting up maintenance windowOptionally, you can divide ingested data for training and then for evaluation. The data specified during the evaluation period is checked compared to the trained model.

setting up evaluation window

Once you select Create, Lookout for Equipment starts to train your model. This process takes minutes to hours depending on your data volume. After training is finished, you can evaluate your model with the evaluation period data.

model performance console

5. Schedule Inference
Now it is time to analyze your real time data. Select Schedule Inference, and set up your S3 buckets for input.

setting up input S3 bucket

You can also set Data upload frequency, which is actually the same as inferencing frequency, and Offset delay time. Then, you need to set up Output data as Lookout for Equipment outputs the result of inference.

setting up inferenced output S3 bucket

Amazon Lookout for Equipment is In Preview Today
Amazon Lookout for Equipment is in preview today at US East (N. Virginia), Asia Pacific (Seoul), and Europe (Ireland) and you can see the documentation here.

– Kame