Tag Archives: artificial intelligence

Raising code quality for Python applications using Amazon CodeGuru

Post Syndicated from Ran Fu original https://aws.amazon.com/blogs/devops/raising-code-quality-for-python-applications-using-amazon-codeguru/

We are pleased to announce the launch of Python support for Amazon CodeGuru, a service for automated code reviews and application performance recommendations. CodeGuru is powered by program analysis and machine learning, and trained on best practices and hard-learned lessons across millions of code reviews and thousands of applications profiled on open-source projects and internally at Amazon.

Amazon CodeGuru has two services:

  • Amazon CodeGuru Reviewer – Helps you improve source code quality by detecting hard-to-find defects during application development and recommending how to remediate them.
  • Amazon CodeGuru Profiler – Helps you find the most expensive lines of code, helps reduce your infrastructure cost, and fine-tunes your application performance.

The launch of Python support extends CodeGuru beyond its original Java support. Python is a widely used language for various use cases, including web app development and DevOps. Python’s growth in data analysis and machine learning areas is driven by its rich frameworks and libraries. In this post, we discuss how to use CodeGuru Reviewer and Profiler to improve your code quality for Python applications.

CodeGuru Reviewer for Python

CodeGuru Reviewer now allows you to analyze your Python code through pull requests and full repository analysis. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. We analyzed large code corpuses and Python documentation to source hard-to-find coding issues and trained our detectors to provide best practice recommendations. We expect such recommendations to benefit beginners as well as expert Python programmers.

CodeGuru Reviewer generates recommendations in the following categories:

  • AWS SDK API best practices
  • Data structures and control flow, including exception handling
  • Resource leaks
  • Secure coding practices to protect from potential shell injections

In the following sections, we provide real-world examples of bugs that can be detected in each of the categories:

AWS SDK API best practices

AWS has hundreds of services and thousands of APIs. Developers can now benefit from CodeGuru Reviewer recommendations related to AWS APIs. AWS recommendations in CodeGuru Reviewer cover a wide range of scenarios such as detecting outdated or deprecated APIs, warning about API misuse, authentication and exception scenarios, and efficient API alternatives.

Consider the pagination trait, implemented by over 1,000 APIs from more than 150 AWS services. The trait is commonly used when the response object is too large to return in a single response. To get the complete set of results, iterated calls to the API are required until the last page is reached. A developer who is not aware of this might write the code as follows (this example is patterned after actual code):

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName="table1")
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName="table2", Item=item)
    ...

Here the scan API is used to read items from one Amazon DynamoDB table and the put_item API to save them to another DynamoDB table. The scan API implements the Pagination trait. However, the developer missed iterating on the results beyond the first scan, leading to only partial copying of data.

The following screenshot shows what CodeGuru Reviewer recommends:

Screenshot of the CodeGuru Reviewer recommendation about the need for pagination

The developer fixed the code based on this recommendation and added complete handling of paginated results by checking the LastEvaluatedKey value in the response object of the paginated API scan as follows:

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName="table1")
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName="table2", Item=item)
    # Keep scanning until LastEvaluatedKey is no longer present in the response
    while "LastEvaluatedKey" in response:
        response = source_ddb.scan(
            TableName="table1",
            ExclusiveStartKey=response["LastEvaluatedKey"]
        )
        for item in response['Items']:
            destination_ddb.put_item(TableName="table2", Item=item)
    ...

The CodeGuru Reviewer recommendation is rich and offers multiple options for implementing a paginated scan. We can also initialize the ExclusiveStartKey value to None and iteratively update it based on the LastEvaluatedKey value obtained from the scan response object in a loop. The fix below conforms to the usage described in the official documentation.

def sync_ddb_table(source_ddb, destination_ddb):
    table = source_ddb.Table("table1")
    scan_kwargs = {
        ...
    }
    done = False
    start_key = None
    while not done:
        if start_key:
            scan_kwargs['ExclusiveStartKey'] = start_key
        response = table.scan(**scan_kwargs)
        for item in response['Items']:
            destination_ddb.put_item(TableName="table2", Item=item)
        start_key = response.get('LastEvaluatedKey', None)
        done = start_key is None
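
Another option worth knowing, shown here as our own minimal sketch rather than as part of the CodeGuru output, is the paginator that boto3 exposes on the low-level DynamoDB client. Paginators are a client-level feature, so this variant calls scan on a client (boto3.client("dynamodb")) rather than on a Table resource; the table names mirror the earlier examples.

import boto3

def sync_ddb_table(source_client, destination_client):
    # The paginator follows LastEvaluatedKey automatically until the scan is exhausted.
    paginator = source_client.get_paginator("scan")
    for page in paginator.paginate(TableName="table1"):
        for item in page["Items"]:
            destination_client.put_item(TableName="table2", Item=item)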

Data structures and control flow

Python’s coding style is different from other languages. For code that does not conform to Python idioms, CodeGuru Reviewer provides a variety of suggestions for efficient and correct handling of data structures and control flow in the Python 3 standard library:

  • Using collections.defaultdict for compact handling of missing dictionary keys instead of calling setdefault() or handling a KeyError exception (a short sketch follows this list)
  • Using the subprocess module over outdated APIs for subprocess handling
  • Detecting improper exception handling, such as catching and passing generic exceptions that can hide latent issues
  • Detecting simultaneous iteration over and modification of a collection in a loop, which might lead to unexpected bugs because the iterator expression is evaluated only once and does not account for subsequent index changes
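
As a quick illustration of the first point, here is a minimal sketch (ours, not CodeGuru output) contrasting setdefault() with collections.defaultdict for counting items:

from collections import defaultdict

words = ["scan", "query", "scan", "put"]

# Without defaultdict, every insert has to handle the possibly missing key.
counts = {}
for word in words:
    counts.setdefault(word, 0)
    counts[word] += 1

# With defaultdict, missing keys are initialized automatically (int() == 0).
counts = defaultdict(int)
for word in words:
    counts[word] += 1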

The following code is a specific example that can confuse novice developers.

def list_sns(region, creds, sns_topics=[]):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics
  
def process():
    ...
    for region, creds in jobs["auth_config"]:
        arns = list_sns(region, creds)
        ... 

The process() method iterates over different AWS Regions and collects Regional ARNs by calling the list_sns() method. The developer might expect that each call to list_sns() with a Region parameter returns only the corresponding Regional ARNs. However, the preceding code actually leaks the ARNs from prior calls into subsequent Regions. This happens due to an idiosyncrasy of Python relating to the use of mutable objects as default argument values. A Python default value is created exactly once, and if that object is mutated, subsequent references to it see the mutated value instead of a fresh initialization.

The following screenshot shows what CodeGuru Reviewer recommends:

Screenshot of the CodeGuru Reviewer recommendation about initializing mutable default values

The developer accepted the recommendation and applied the following fix.

def list_sns(region, creds, sns_topics=None):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    if sns_topics is None: 
        sns_topics = [] 
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics

Resource leaks

A Pythonic practice for resource handling is to use context managers. Our analysis shows that resource leaks are rampant in Python code: a developer may open external files or windows and forget to close them eventually. A resource leak can slow down or crash your system. Even if a resource is eventually closed, using a context manager is the more Pythonic approach. For example, CodeGuru Reviewer detects a resource leak in the following code:

def read_lines(file):
    lines = []
    f = open(file, 'r')
    for line in f:
        lines.append(line.strip('\n').strip('\r\n'))
    return lines

The following screenshot shows that CodeGuru Reviewer recommends that the developer either use the contextlib with statement or a try-finally block to explicitly close the resource.

Screenshot of the CodeGuru Reviewer recommendation about fixing the potential resource leak

The developer accepted the recommendation and fixed the code as shown below.

def read_lines(file):
    lines = []
    with open(file, 'r') as f:
        for line in f:
            lines.append(line.strip('\n').strip('\r\n'))
    return lines
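
For completeness, the try-finally alternative mentioned in the recommendation would look roughly like the following sketch (our illustration, not the CodeGuru output):

def read_lines(file):
    lines = []
    f = open(file, 'r')
    try:
        for line in f:
            lines.append(line.strip('\n').strip('\r\n'))
    finally:
        # The file handle is released even if an exception is raised while reading.
        f.close()
    return lines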

Secure coding practices

Python is often used for scripting. An integral part of such scripts is the use of subprocesses. As of this writing, CodeGuru Reviewer makes a limited, but important set of recommendations to make sure that your use of eval functions or subprocesses is secure from potential shell injections. It issues a warning if it detects that the command used in eval or subprocess scenarios might be influenced by external factors. For example, see the following code:

import subprocess

def execute(cmd):
    try:
        retcode = subprocess.call(cmd, shell=True)
        ...
    except OSError as e:
        ...

The following screenshot shows the CodeGuru Reviewer recommendation:

Screenshot of the CodeGuru Reviewer recommendation about a potential shell injection vulnerability

The developer accepted this recommendation and made the following fix.

import shlex
import subprocess

def execute(cmd):
    try:
        retcode = subprocess.call(shlex.quote(cmd), shell=True)
        ...
    except OSError as e:
        ...
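
Beyond the quoting shown above, a pattern we often prefer when the command structure is known in advance (our sketch, not part of the CodeGuru recommendation) is to avoid the shell entirely and pass the arguments as a list with shell=False, so shell metacharacters in user input are never interpreted:

import subprocess

def execute(args):
    # args is a list such as ["ls", "-l", user_supplied_path]; no shell is involved.
    try:
        retcode = subprocess.call(args, shell=False)
        ...
    except OSError as e:
        ...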

As shown in the preceding recommendations, not only are the code issues detected, but a detailed recommendation is also provided on how to fix the issues, along with a link to the Python official documentation. You can provide feedback on recommendations in the CodeGuru Reviewer console or by commenting on the code in a pull request. This feedback helps improve the performance of Reviewer so that the recommendations you see get better over time.

Now let’s take a look at CodeGuru Profiler.

CodeGuru Profiler for Python

Amazon CodeGuru Profiler analyzes your application’s performance characteristics and provides interactive visualizations to show you where your application spends its time. These visualizations, also known as flame graphs, are a powerful tool to help you troubleshoot which code methods have high latency or are overutilizing your CPU.

Thanks to the new Python agent, you can now use CodeGuru Profiler on your Python applications to investigate performance issues.

The following list summarizes the supported versions as of this writing.

  • AWS Lambda functions: Python3.8, Python3.7, Python3.6
  • Other environments: Python3.9, Python3.8, Python3.7, Python3.6

Onboarding your Python application

For this post, let’s assume you have a Python application running on Amazon Elastic Compute Cloud (Amazon EC2) hosts that you want to profile. To onboard your Python application, complete the following steps:

1. Create a new profiling group called ProfilingGroupForMyApplication on the CodeGuru Profiler console. Give your Amazon EC2 execution role access to submit profiles to this profiling group. See the documentation for details about how to create a profiling group.

2. Install the codeguru_profiler_agent module:

pip3 install codeguru_profiler_agent

3. Start the profiler in your application.

An easy way to profile your application is to start your script through the codeguru_profiler_agent module. If you have an app.py script, use the following code:

python -m codeguru_profiler_agent -p ProfilingGroupForMyApplication app.py

Alternatively, you can start the agent manually inside the code. This needs to be done only once, preferably in your startup code:

from codeguru_profiler_agent import Profiler

if __name__ == "__main__":
    Profiler(profiling_group_name='ProfilingGroupForMyApplication')
    start_application()    # your application code goes here

Onboarding your Python Lambda function

Onboarding for an AWS Lambda function is quite similar.

  1. Create a profiling group called ProfilingGroupForMyLambdaFunction, this time selecting Lambda as the compute platform. Give your Lambda function role access to submit profiles to this profiling group. See the documentation for details about how to create a profiling group.
  2. Include the codeguru_profiler_agent module in your Lambda function code.
  3. Add the with_lambda_profiler decorator to your handler function:
from codeguru_profiler_agent import with_lambda_profiler

@with_lambda_profiler(profiling_group_name='ProfilingGroupForMyLambdaFunction')
def handler_function(event, context):
    # Your code here
    ...

Alternatively, you can profile an existing Lambda function without updating the source code by adding a layer and changing the configuration. For more information, see Profiling your applications that run on AWS Lambda.

Profiling a Lambda function helps you see what is slowing down your code so you can reduce the duration, which reduces the cost and improves latency. You need to have continuous traffic on your function in order to produce a usable profile.

Viewing your profile

After running your profile for some time, you can view it on the CodeGuru console.

Screenshot of the flame graph visualization in CodeGuru Profiler

Each frame in the flame graph shows how much that function contributes to latency. In this example, an outbound call that crosses the network takes most of the Lambda function’s duration; caching its result would improve latency.
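
A common way to act on that finding is to cache the result of the outbound call in a module-level variable so that warm invocations reuse it. The following is only a sketch; fetch_remote_config and process are hypothetical stand-ins for whatever call dominates your profile:

_cached_config = None

def handler(event, context):
    global _cached_config
    if _cached_config is None:
        # Only the first (cold) invocation pays for the network round trip.
        _cached_config = fetch_remote_config()
    return process(event, _cached_config)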

For more information, see Investigating performance issues with Amazon CodeGuru Profiler.

Supportability for CodeGuru Profiler is documented here.

If you don’t have an application to try CodeGuru Profiler on, you can use the demo application in the following GitHub repo.

Conclusion

This post introduced how to leverage CodeGuru Reviewer to identify hard-to-find code defects in various issue categories and how to onboard your Python applications or Lambda functions in CodeGuru Profiler for CPU profiling. Combining both services can help you improve code quality for Python applications. CodeGuru is now available for you to try. For more information about pricing, see Amazon CodeGuru pricing.

 

About the Authors

Neela Sawant is a Senior Applied Scientist in the Amazon CodeGuru team. Her background is building AI-powered solutions to customer problems in a variety of domains such as software, multimedia, and retail. When she isn’t working, you’ll find her exploring the world anew with her toddler and hacking away at AI for social good.


Pierre Marieu is a Software Development Engineer in the Amazon CodeGuru Profiler team in London. He loves building tools that help the day-to-day life of other software engineers. Previously, he worked at Amadeus IT, building software for the travel industry.


Ran Fu is a Senior Product Manager on the Amazon CodeGuru team. He has deep customer empathy and loves exploring who the customers are, what their needs are, and why those needs matter. Besides work, you may find him snowboarding in Keystone or Vail, Colorado.

 

These AIs Can Predict Your Moral Principles

Post Syndicated from Michelle Hampson original https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/these-ais-can-predict-your-moral-principles

The death penalty, abortion, gun legislation: There’s no shortage of controversial topics that are hotly debated today on social media. These topics are so important to us because they touch on an essential underlying force that makes us human, our morality.

Researchers in Brazil have developed and analyzed three models that can describe the morality of individuals based on the language they use. The results were published last month in IEEE Transactions on Affective Computing.

Ivandré Paraboni is an associate professor at the School of Arts, Sciences and Humanities at the University of São Paulo who led the study. His team chose to focus on a theory commonly used by social scientists called moral foundations theory. It postulates several key categories of morality, including care, fairness, loyalty, authority, and purity.

The aim of the new models, according to Paraboni, is to infer an individual’s values on those five moral foundations just by looking at their writing, regardless of what they are talking about. “They may be talking about their everyday life, or about whatever they talk about on social media,” Paraboni says. “And we may still find underlying patterns that are revealing of their five moral foundations.”

To develop and validate the models, Paraboni’s team provided more than 500 volunteers with questionnaires. Participants were asked to rate eight topics (e.g., same sex marriage, gun ownership, drug policy) with sentiment scores (from 0 = ‘totally against’ to 5 = ‘totally in favor’). They were also asked to write out explanations of their ratings.

Human judges then gave their own rating to a subset of explanations from participants. The exercise determined how well humans could infer the intended opinions from the text. “Knowing the complexity of the task from a human perspective in this way gave us a more realistic view of what the computational models can or cannot do with this particular dataset,” says Paraboni.

Using the text opinions from the study participants, the research team created three machine learning algorithms that could assess the language used in each participant’s statement. The models analyzed psycholinguistics (emotional context of words), words, and word sequences, respectively.

All three models were able to infer an individual’s moral foundations from the text. The first two models, which focus on individual words used by the author, were more accurate than the deep learning approach that analyzes word sequences.

Paraboni adds, “Word counts–such as how often an individual uses words like ‘sin’ or ‘duty’–turned out to be highly revealing of their moral foundations, that is, predicting with higher accuracy their degrees of care, fairness, loyalty, authority, and purity.”

He says his team plans to continue to incorporate other forms of linguistic analysis into their models. They are, he says, exploring other models that focus more on the text (independent of the author) as a way to analyze Twitter data.

COVID Moonshot Effort Generates “Elite” Antivirals

Post Syndicated from Megan Scudellari original https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/covid-moonshot-generates-elite-antivirals


In March, organizers of the COVID Moonshot initiative crowdsourced chemical designs for COVID-19 antivirals. They received over 14,000 submissions from chemists around the world.

PostEra, a machine-learning company leading the Moonshot initiative, triaged those submissions for how quickly and easily each chemical compound could be synthesized. One looked particularly promising, and PostEra sent data about the compound back to an online volunteer crowd of medicinal chemists.

The crowd and PostEra’s machine-learning algorithms iterated back and forth, designing and testing tweaks on the chemical structure. Soon, the compound’s potency had increased by two orders of magnitude. Then, the chemical compound successfully killed live coronavirus in human cells without harming the cells. Now, that drug candidate and three more promising compounds are headed to animal testing in preparation for human clinical trials.

Preparing data for ML models using AWS Glue DataBrew in a Jupyter notebook

Post Syndicated from Zayd Simjee original https://aws.amazon.com/blogs/big-data/preparing-data-for-ml-models-using-aws-glue-databrew-in-a-jupyter-notebook/

AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). In this post, we examine a sample ML use case and show how to use DataBrew and a Jupyter notebook to upload a dataset, clean and normalize the data, and train and publish an ML model. We look for anomalies by applying the Amazon SageMaker Random Cut Forest (RCF) anomaly detection algorithm on a public dataset that records power consumption for more than 300 random households.

Deploying your resources

To make it easier for you to get started, we created an AWS CloudFormation template that automatically configures a Jupyter notebook instance with the required libraries and installs the plugin. We used the AWS Deep Learning AMI (DLAMI) to configure the out-of-the-box Jupyter server. This easy deployment is intended to get you started with DataBrew from within a Jupyter environment. The source code for the DataBrew plugin and the CloudFormation template are available in the GitHub repo.

To deploy the solution, you must have a subnet that has internet access and an Amazon Simple Storage Service (Amazon S3) bucket where you want to store the data for DataBrew. Select the VPC, subnet, security group, and the S3 bucket that you want to use to store the data for DataBrew processing. Provide an Amazon Elastic Compute Cloud (Amazon EC2) key pair if you plan to SSH to the instance.

  1. Launch the following stack:
  2. When the template deployment is complete, on the Outputs tab, choose the URL to open JupyterLab.

Because the Jupyter server is configured with a self-signed SSL certificate, your browser warns you and prompts you to avoid continuing to this website. But because you set this up yourself, it’s safe to continue.

  1. Choose Advanced.
  2. Choose Proceed.
  3. Use the password databrew_demo to log in.

For more information about securing and configuring your Jupyter server, see Set up a Jupyter Notebook Server.

  1. In the Jupyter environment’s left panel, choose the DataBrew logo.
  2. Choose Launch AWS Glue DataBrew.
  3. When the extension loads, choose the Datasets tab in the navigation bar.

Preparing your data using DataBrew

Now that the DataBrew extension is ready to go, we can begin to explore how DataBrew can make data preparation easy. Our source dataset contains data points at 15-minute intervals, and is organized as a series of columns for each household. The dataset is really wide, and the RCF algorithm expects data tuples of date/time, client ID, and consumption value. Additionally, we want to normalize our data to 1-hour intervals. All of this is achieved through DataBrew.

Setting up your dataset and project

To get started, you set up your dataset, import your data, and create a new project.

  1. Download power.consumption.csv from the GitHub repo.
  2. On the Datasets page, choose Connect new dataset.
  3. For Dataset name, enter a name.
  4. In the Connect to new dataset section, choose File upload.
  5. Upload power.consumption.csv.
  6. For Enter S3 destination, enter an S3 path where you can save the file.
  7. Choose Create dataset.

The file may take a few minutes to upload, depending on your internet speed.

  1. On the Datasets page, filter for your created dataset.
  2. Select your dataset and choose Create project with this dataset.

  1. In the Create project wizard, give your project a name.
  2. In the Permissions section, choose the AWS Identity and Access Management (IAM) role created from the CloudFormation template.

You can find the role on the CloudFormation stack’s Resources tab. If you use the default stack name, the role should begin with databrew-jupyter-plugin-demo.

After you create the project, the project view loads, and you’re ready to prepare your data.

Building a recipe for data transformation

A recipe is a series of steps that prepare your data for the RCF algorithm. The algorithm requires three columns: date, client ID, and an integer value. To transform our dataset to contain those three columns, we configure our recipe to do the following:

  1. Unpivot the data to collapse measurements from multiple clients into one column.
  2. Apply the window function to average the 15-minute data points into 1-hour data points.
  3. Filter to keep only values at each hour.
  4. Multiply and floor the results.
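
If you prefer to see the intent of these steps as code, the following rough pandas sketch reproduces the same logic outside DataBrew. It is only an illustration: DataBrew runs its own recipe engine, and the timestamp column name in the raw CSV may differ from the _c0 label shown in the console.

import numpy as np
import pandas as pd

df = pd.read_csv("power.consumption.csv")

# 1. Unpivot the per-client columns into (timestamp, client_id, quarter_hour_consumption).
long_df = df.melt(
    id_vars=["_c0"],
    value_vars=["MT_012", "MT_013", "MT_131", "MT_132"],
    var_name="client_id",
    value_name="quarter_hour_consumption",
).rename(columns={"_c0": "timestamp"})

# 2. Rolling average over the current row plus the next three 15-minute readings, per client.
long_df["hourly_consumption_raw"] = (
    long_df.groupby("client_id")["quarter_hour_consumption"]
    .transform(lambda s: s.rolling(window=4).mean().shift(-3))
)

# 3. Keep only the rows that fall exactly on the hour.
hourly = long_df[long_df["timestamp"].astype(str).str.endswith(":00:00")].dropna(
    subset=["hourly_consumption_raw"]
).copy()

# 4. Multiply by 100 and floor so the RCF algorithm receives integer values.
hourly["hourly_consumption"] = np.floor(hourly["hourly_consumption_raw"] * 100).astype(int)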

Unpivoting the data

To unpivot the data and collapse measurements, complete the following steps:

  1. On the toolbar, choose Pivot.

  1. In the Pivot wizard, select Unpivot: Columns to rows.
  2. For Unpivot columns, choose MT_012, MT_013, MT_131, and MT_132.
  3. For Column name, enter client_id.
  4. For Value column name, enter quarter_hour_consumption.

In the Recipe pane, you can see the action that unpivots the columns. This action can be revisited later and changed. The new columns may not be visible immediately.

  1. To see them and narrow down the visible data to only the relevant columns, choose the arrow next to Viewing.
  2. Deselect all items and select only _c0 and our two new columns, client_id and quarter_hour_consumption.

Applying the window function

To apply the window function to average the 15-minute data points into 1-hour data points, complete the following steps:

  1. Choose the quarter_hour_consumption column.
  2. Choose Functions.
  3. Choose Window functions.
  4. Choose Rolling average.

  1. In the Create column pane, for Number of rows before, enter 0.
  2. For Number of rows after, enter 3.
  3. For Name of column to order by with, choose client_id.
  4. For Destination column, enter hourly_consumption_raw.
  5. Choose Apply.

Filtering to keep only values at each hour

In this step, you rename the date/time column, convert it to string type so that you can do simple filtering, and filter the dataset on the string column for times ending in :00:00.

  1. For the _c0 column, choose the ellipsis icon (…) and choose Rename.

  1. Rename the column to timestamp.
  2. Choose the clock icon and choose string.

  1. With the column selected, choose Filter.
  2. Choose By condition.
  3. Choose Ends with.
  4. Enter the value :00:00.
  5. Choose Apply.

Filtering the column for only values that end with :00:00 leaves you with hourly averages of power consumption per client for every hour.

Multiplying and flooring the results

In this step, you multiply the data by 100 to increase precision and floor the data so that it can be accepted by the RCF algorithm, which only accepts integers.

  1. Choose Functions.
  2. Choose Math functions.
  3. Choose Multiply.
  4. For Value using, choose Source columns and value.
  5. For Source column, choose hourly_consumption_raw.
  6. For Destination column, enter hourly_consumption_raw_times_a_hundred.
  7. Choose Apply.

  1. Choose Functions.
  2. Choose Math functions.
  3. Choose Floor.
  4. For Source column, choose hourly_consumption_raw_times_a_hundred.
  5. For Destination column, enter hourly_consumption.
  6. Choose Apply.

This column contains the final, normalized data.

Running the job to transform the data

You’re now ready to transform the data.

  1. Choose Create job.

  1. Enter a job name and choose the dataset we created.
  2. Specify the S3 bucket you provided in the CloudFormation template.
  3. Choose the IAM role that AWS CloudFormation created (which we used earlier to create the project).
  4. Choose Create and run job.

The job may take up to 5 minutes to complete.

On the Job run history page for the job, view the output on the Amazon S3 console by choosing the link in the table.

That’s it! The data is now ready to use for training and deploying our ML model.

Training and deploying the ML model using prepared data

The data was already prepared using DataBrew via the plugin, so the next step is to train an ML model using that data. We provided a sample anomaly detection notebook that you can download.

In this sample notebook, you need to specify the S3 location where you stored the output data from DataBrew. The notebook uses the IAM role attached to the EC2 instance profile that AWS CloudFormation created. You can follow along in the notebook; after you provide the right S3 paths, the first step is to filter for the specific columns we’re interested in and visualize the time series power consumption data.

The next step is to train a sample anomaly detection model using the SageMaker Random Cut Forest algorithm. We pick one of the time series available in the input Pandas DataFrame and train the anomaly detection model with the hyperparameter feature_dim set to 1, leaving the default values for other hyperparameters. We then create an estimator for Random Cut Forest and fit the model. In a few minutes, the training should be complete. In the next step, we create a predictor and deploy the model to a SageMaker endpoint.
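
Condensed to its essentials, the training and deployment code looks roughly like the sketch below, based on the SageMaker Python SDK. The role ARN, instance types, and hyperparameter values here are illustrative, and hourly_series is a stand-in for the chosen client’s data; the notebook in the repo is the authoritative version.

import numpy as np
from sagemaker import RandomCutForest

role = "arn:aws:iam::123456789012:role/YourSageMakerRole"  # placeholder; use the role from the stack

# hourly_series: 1-D numpy array of hourly consumption for the chosen client.
# Reshaping to (num_records, 1) corresponds to feature_dim = 1.
train_data = hourly_series.reshape(-1, 1)

rcf = RandomCutForest(
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    num_samples_per_tree=512,
    num_trees=50,
)
rcf.fit(rcf.record_set(train_data))

# Deploy to a real-time endpoint and score every point in the series.
predictor = rcf.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
results = predictor.predict(train_data)
scores = np.array([record.label["score"].float32_tensor.values[0] for record in results])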

Using the prepared data, we run the prediction and plot the results.

We use the anomaly detection baseline that is two standard deviations away from the mean score. The data shows an anomaly towards the end of the time series. With this information, that timeframe can be further investigated.
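
In code, that baseline is simply a cutoff computed from the scores returned by the endpoint. The sketch below assumes scores is the 1-D array of anomaly scores from the previous step and flags points more than two standard deviations above the mean; the notebook may define the exact cutoff slightly differently.

import numpy as np

threshold = scores.mean() + 2 * scores.std()

# Positions (and therefore timestamps) whose anomaly score exceeds the baseline.
anomalous_indices = np.where(scores > threshold)[0]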

Finally, we clean up by deleting the SageMaker endpoint to prevent any ongoing charges.

Conclusion

We’ve walked you through the process of setting up the AWS Glue DataBrew Jupyter plugin in a Jupyter notebook environment. We used the plugin to prepare data, then trained and deployed an ML model in the same Jupyter environment using SageMaker.

Although we used a DLAMI Jupyter environment in this post, the DataBrew Jupyter extension also works on SageMaker notebooks. For installation instructions, see the GitHub repo.

DataBrew makes it easy to iterate through data preparation workflows. The resultant recipes and jobs are duplicable and can be run over discrete, large datasets. The DataBrew Jupyter plugin allows you to prepare your data seamlessly, in context, within your Jupyter notebook.


About the Authors

Zayd Simjee is a software engineer at Amazon. He’s interested in distributed systems, big data, and simplifying developer experience on the Cloud. He’s worked on a few Big Data services at AWS, and most recently completed work on the AWS Glue DataBrew release.


As a Principal Solutions Architect at Amazon Web Services, Karthik Sonti works with GSI partners to help enterprises realize transformational business outcomes using artificial intelligence, machine learning, and data analytics.

Polling Is Too Hard—for Humans

Post Syndicated from Steven Cherry original https://spectrum.ieee.org/podcast/artificial-intelligence/machine-learning/polling-is-too-hardfor-humans

Steven Cherry Hi, this is Steven Cherry, for Radio Spectrum.

The Literary Digest, a now-defunct magazine, was founded in 1890. It offered—despite what you’d expect from its name—condensed versions of news-analysis and opinion pieces. By the mid-1920s, it had over a million subscribers. Some measure of its fame and popularity stemmed from accurately predicting every presidential election from 1916 to 1932, based on polls it conducted of its ever-growing readership.

Then came 1936. The Digest predicted that Kansas Governor Alf Landon would win in a landslide over the incumbent, Franklin Delano Roosevelt. Landon in fact captured only 38 percent of the vote. Roosevelt won 46 of the U.S.’s 48 states, the biggest landslide in presidential history. The magazine never recovered from its gaffe and folded two years later.

The Chicago Tribune did recover from its 1948 gaffe, one of the most famous newspaper headlines of all time, “Dewey Defeats Truman”—a headline that by the way was corrected in the second edition that election night to read “Democrats Make Sweep of State Offices,” and by the final edition, “Early Dewey Lead Narrow; Douglas, Stevenson Win,” referring to candidates that year for Senator and Governor. The Senator, Paul Douglas, by the way, was no relation to an earlier Senator from Illinois a century ago, Stephen Douglas.

The Literary Digest’s error was due, famously, to the way it conducted its polls— its readership, even though a million strong, was woefully unrepresentative of the nation’s voters as a whole.

The Tribune’s gaffe was in part due to a printer’s strike that forced the paper to settle on a first-edition banner headline hours earlier than it otherwise would have, but it made the guess with great confidence in part because of the unanimous consensus of the polling that year, which had Dewey ahead, despite his running one of the most lackluster, risk-averse campaigns of all time.

Polls have been making mistakes ever since, and it’s always, fundamentally, the same mistake. They’re based on representative samples of the electorate that aren’t sufficiently representative.

After the election of 2016, in which the polling was not only wrong but itself might have inspired decisions that affected the outcome—where the Clinton campaign shepherded its resources; whether James Comey would hold a press conference—pollsters looked inward, re-weighted various variables, assured us that the errors of 2016 had been identified and addressed, and then proceeded to systematically mis-predict the 2020 presidential election much as they had four years earlier.

After a century of often-wrong results, it would be reasonable to conclude that polling is just too difficult for humans to get right.

But what about software? Amazon, Netflix, and Google do a remarkable job of predicting consumer sentiment, preferences, and behavior. Could artificial intelligence predict voter sentiment, preferences, and behavior?

Well, it’s not as if they haven’t tried. And results in 2020 were mixed. One system predicted Biden’s lead in the popular vote to be large, but his electoral college margin small—not quite the actual outcome. Another system was even further from the mark, giving Biden wins in Florida, Texas, and Ohio—adding up to a wildly off-base electoral margin.

One system, though, did remarkably well. As a headline in Fortune magazine put it the morning of election day, “The polls are wrong. The U.S. presidential race is a near dead heat, this AI ‘sentiment analysis’ tool says.” The AI tool predicted a popular vote of 50.2 percent for Biden, only about one-sixth of one percent from the actual total, and 47.3 percent for Trump, off by a mere one-tenth of one percent.

The AI company that Fortune magazine referred to is called Expert.ai, and its Chief Technology Officer, Marco Varone, is my guest today.

Marco, welcome to the podcast.

Marco Varone Hi everybody.

Steven Cherry Marco, AI-based speech recognition has been pretty good for 20 years, AI has been getting better and better at fraud detection for 25 years. AI beat the reigning chess champion back in 1997. Why has it taken so long to apply AI to polling, which is, after all … well, even in 2017, was a $20.1 billion dollar industry, which is about $20 billion more than chess.

Marco Varone Well, there are two reasons for this. The first one, that if you wanted to apply artificial intelligence to this kind of problem, you need to have the capability of understanding language in a pretty specific, deep, and nuanced way. And it is something that, frankly, for many, many years was very difficult and required a lot of investment and a lot of work in trying to go deeper than the traditional shallow understanding of text. So this was one element.

The second element is that, as you have seen in this particular case, polls, on average, are working still pretty well. But there are particular events, in particular a situation where there is a clear gap between what has been predicted and the final result. And there is a tendency to say, okay, on average, the results are not so bad. So don’t change too much because we can make it with good results, without the big changes that are always requiring investment, modification, and a complex process.

I would say that it’s a combination of the technology that needed to become better and better in understanding the capability of really extracting insights and small nuances from any kind of communication and the fact that for other types of polls, the current situation is not so bad.

The fact [is] that now there is a growing amount of information that you can easily analyze because it is everywhere in every social network, every communication in every blog and comments, made it a bit easier to say, okay, now we have a better technology, even in specific situations we can have access to a huge amount of data. So let’s try it. And this is what we did. And I believe this will become a major trend in the future.

Steven Cherry Every AI system needs data; Expert.ai uses social posts. How does it work?

Marco Varone Well, the social posts are, I would say, the most valuable kind of content that you can analyze in a situation like this, because on one side, it is a type of content that we know. When I say we know, it means that we have a used this type of content for many other projects. It is normally the kind of content that we analyze for our traditional customers, looking for the actual comments and opinions about products, services, and particular events. Social content is easy to get—up to a point; with the recent scandals, it’s becoming a bit more difficult to have access to. A huge amount of social data in the past was a bit simpler—and also it is something where you can find really every kind of person, every kind of expression, and every kind of discussion.

So it’s easier to analyze this content, to extract a big amount of insight, a big amount of information, and trying to tune … to create reasonably solid models that can be tested in a sort of realtime—there is a continuous stream of social content. There is an infinite number of topics that are discussed. And so you have the opportunity to have something that is plentiful, that is cheap, but has a big [?] expression and where you can really tune your models and tune your algorithms in it much faster and more cost-effective way than with the other type of content.

Steven Cherry So that sort of thing requires something to count as a ground truth. What is your ground truth here?

Marco Varone Very, very, very good point … a very good question. The key point is that from the start, we have decided to invest a lot of money and a lot of efforts in creating a sort of representation of knowledge that we have stored in a big knowledge graph that has been crafted manually, initially.

So we created this knowledge representation that is a sort of representation of the world knowledge, in a reduced form, and the language and the way that you express this knowledge. And we created this solid foundation, manually, so we have been able to build on a very solid and very structured foundation. On top of this foundation, it was possible, as I mentioned, to add the new knowledge, working, analyzing a big amount of data, social data is an example, but there are many other types of data that we use to enrich our knowledge. And so we are not influenced, like many other approaches, from a bias that you can take from extracting knowledge only from data.

So it’s the start of a two-tier system where we have this solid ground-truth foundation—the knowledge and information that that expert linguists and the people that have a huge understanding of things that’s created. On top of that, we can add all the information that we can extract more or less automatically from a different type of data. We believe that this was a huge investment that we did during the years, but is paying big dividends and also giving us the possibility of understanding the language and the communication at a deeper level than with other approaches.

Steven Cherry And are you using only data from Twitter or from other social media as well?

Marco Varone No, no, we try to use as much social media as possible, the limitation sometimes is that Twitter is much easier and faster to have access to a bigger amount of information. For other social sources sometimes is not that easy because you can have issues in accessing the content or you have a very limited amount of information that you can download, or that is expensive—or some sources, you cannot really control them automatically. So Twitter becomes his first choice for the reason that it is easier to get a big volume. And if you are ready to pay, you can have access to the full Twitter firehose.

Steven Cherry The universe of people who post on social networks would seem to be skewed in any number of ways. Some people post more than the average. Some people don’t post much at all. People are sometimes more extreme in their views. How do you go from social media sentiment to voter sentiment? How do you avoid the Literary Digest problem?

Marco Varone Probably the most relevant element is our huge experience. Somehow we have started to analyze the big amount of data, textual data, many, many years ago, and we were forced to really find a way of managing and balancing and avoiding this kind of noise or duplicated information or extra spurious information—[it] can really impact on the capability of our solution to extract the real insights.

So I think that experience—a lot of experience in doing this for many, many years—is the second secret element of our recipe in being able to do this kind of analysis. And I would add that also you should consider that if you do it several times, we started to analyze political content, things that link it to political elections a few years ago. So we also had this generic experience and a specific experience in finding how to tune the different parameters, how to set the different algorithms to try to minimize these kinds of noisy elements. You can’t remove them completely. It is impossible.

But for example, when we analyzed the social content for the Brexit referendum, in the UK, and we were able to guess—one of the few able to do this—the real result of it, we learned a lot of lessons and we were able to improve our capability. Clearly, this means that there is not that a formula that is good for every kind of analysis.

Steven Cherry It’s sort of a commonplace that people express more extreme views on the Internet than they do in face-to-face encounters. The results from 2016 and 2020—and the Brexit result as well—suggests that the opposite may be the case. People’s voting reflects truly-held extreme views, while the polling reflects a sort of face-to-face façade.

Marco Varone Yes, I must admit that we had a small advantage in this—compared with many other companies and probably many other players that tried to guess the result of this election or the Brexit—being based where our technology is. Here in Italy, we saw this kind of situation happening much sooner than we have seen happening in other countries. So in Italy, we had, even many years ago, the strange situation where people, when they were polled, for an interview, were saying, “Oh, no, I think that is too extreme. I will never vote for this. I will vote for this other candidate or the other party.” But in the end, when that the elections were over, you saw that, oh, this is not what really happened in the secret of the vote.

So I would say that this is a small secret, a small advantage that we have against many other people that try to guess this result, creating this kind of technology and implementation in Italy, where these small splits or exaggerated positioning decided the vote for the election was happening before then we have seen. Now it’s very common. This is happening not only in the U.S., but also in other countries. It was happening before … So we have been able to understand it sooner and try to adjust and balance our parameters accordingly.

Steven Cherry That’s so interesting. People have, of course, compared the Trump administration to the Berlusconi administration, but I didn’t realize that the comparison went back all the way to their initial candidacies. So in effect, the shy voter theory—especially the shy-Trump voter theory—is basically correct and people express themselves more authentically online.

Marco Varone Correct. This is what we are seeing, again and again. And it is something that I believe is not only happening in the political environment, but there it’s somehow stronger than in other places. As I told you, we are applying our artificial intelligence solution in many different fields, analyzing the feedback from customers of telco companies, banks, insurance companies. And you see that when you look at, for example, the content of the e-mails, or let me say official communication that they are exchanging between the customer and the company, everything is a bit smoother, more natural. The tone is under control. And then that when you see the same kind of problem that is discussed in a social content, everything is stronger. People are really trying to give a much stronger opinion, saying, I’ll never buy this kind of service or I had big problems with this company.

And so, again, this is something that we have seen also in other spaces. In the political situation, I believe it is even stronger because they are really not buying something like when you are interacting with a company, but you are trying to give your small contribution to the future of your country or your state or your local government. So probably there are even stronger sentiments and feelings for people. And in the social situation, they are really free because you are not really identified—normally you can be recognized, but in many cases you are not linked to the specific person doing that. So I believe that that is the strongest place where there is this, “Okay, I really wanted to say what I think, and this is the only place where I will tell this, because the risk of having a sort of a negative result is smaller.”

Steven Cherry Yeah. So not to belabor the point, but it does seem important. It’s commonly thought that the Internet goads people into holding more extreme positions than they really do, but the reality is that it instead frees them to express themselves more honestly.

A 2015 article in Nature argued that public opinion has become more extreme over time, and then the article looks at some of the possible causes. I’m wondering if you have seen that in your work and is it possible that standard polling techniques simply have not caught up with that?

Marco Varone Yes, I think that we can confirm we have seen that this kind of change we have … We are applying our solution to social content for a good number of years. I would say not exactly from the start because you need to have the sort of a minimum amount of data but it’s been a big number of years. And I can confirm, yes, it’s something that we have seen that it is happening. I don’t know exactly if it is also something that is linked to the fact that people that are more vocal on such a content are also part of the new generation of people that are younger, that have been able to use these kinds of channels of communication actively more or less from the start. I think that there are different element on this, but for sure I can confirm this.

And in different countries, we have seen some significant variation. For example, you should expect that here in Italy it’s super-strong because the Italian people, for example, are considered very … They don’t fear to express their opinion, but I will say that in the U.S. and also in the U.K., we are seeing [it] even stronger. Okay, so it’s happening in all the countries where we are operating and there are some countries where it’s even stronger than another one. You will not be surprised that, for example, when you analyze there, the content in Germany, such a content, everything is somehow more under control, exactly as you expect. So sometimes there are surprises. In other situations that are things that are more or less as you expect.

Steven Cherry I mentioned earlier Amazon, Netflix and Google. Are there similarities between what you’re doing here and what recommendation engines do?

Marco Varone There are elements in common and there are significant differences. The elements in common that they are also using the capability that they have in analyzing the textual content to extract elements for the recommendation, but they are also using a lot of other information. For us that when you analyze something more or less, the only information that we can get access to is really that the tweets, the posts, the articles, and other similar things. But for Amazon, they have access—or for Netflix—to get a lot of other information. So on Amazon, you have the clicks, you have the story of the customer, you have the different path that has been followed in navigating the site. They have historical information. So they have a much richer set of data and the text part is only somehow a complement of it. So there are elements in common and differences. And the other difference is that all these companies have a very shallow capability of understanding what is really written—in a comment, in a post, in a tweet—they tend to work more or less on a keyword level. Okay, this is a negative keyword; this is a positive keyword. With our AI intelligence, we can go deeper than that. So we can get the emotion, the feeling—we can disambiguate much better small differences in the expression of the person because we can go to a deeper level of understanding. It is not like a person. A person is still better than understanding all the nuances, but it’s something that can add more value and allows us to compensate—up to a point—to the fact that we don’t have access to this huge set of other data that these big companies easily have because they track and they log everything.

Steven Cherry I’m not sure humans always do better. You know, one of my complaints about the movie rating site Rotten Tomatoes is they take reviews by film reviewers and assess whether the review was a generally positive or generally negative review. It’s incredibly simplistic. And yet, in my opinion, they often get it wrong. I’d love to see a sentiment analysis software attack the movie-rating problem. Speaking of which, polling is more of a way to show off your company’s capabilities, yes? Your main business involves applications in industries like insurance and banking and publishing?

Marco Varone Correct. Absolutely. We decided that we would do it from time to time, as is said, to apply our technology and our solutions to this specific problem, not because we want to become a competitor of the companies doing these polls, but because we think it is a very good way to show the capability and the power of our technology and our solution, applied to a problem that is easily understood by everybody.

Normally what we do is to apply this kind of approach, for example, in analyzing the customer interaction between the customers and our clients or analyzing big amounts of social content to identify trends, patterns, emerging elements that can be emerging technologies or managing challenges.

Part of our customers are also in the intelligence space. So public police forces, national security, intelligence agencies … and use our AI platform to try to recognize possible threats, to help investigators and analysts to find the information that they want to find in a much faster and more structured way. Finally, I will say that our historical market is in publishing. Publishers are always searching for a way to enrich the content that they publish with the additional metadata so that the people reading and navigating inside the knowledge can really slice and dice the information across many dimensions or can then focus on specific topics, a specific place, or specific type of event.

Steven Cherry Returning to polling, the Pew Research Center is just one of many polling organizations that looked inward after 2020 and as far as I can tell, concluded that it needed to do still better sampling and weighting of voters. In other words, they just need to do a better job of what they had been doing. Do you think they could ever succeed at that or are they just on a failed path and they really need to start doing something more like what you’re doing?

Marco Varone I think that they are on a failed path and they need to really merge the two approaches. I believe that for the future, they really need to keep the good part of what they did for many, many years, because there is still a lot of value in that. But they are obliged to add this additional dimension because only working together with these two approaches, you can really find something that can give a good result. And I would say good prediction in the majority of the situations, even in these extreme events that are becoming more and more common. And this is sort of a part of how the world is changing.

So we think that they need to look at the kind of artificial technology, artificial intelligence technologies that we and other companies are making available because you cannot continue. This is not a problem of tuning the existing formulas. They should not discard it. It would be a big mistake, but for sure, in my opinion, they need a tool to blend the two things and spend the time to balance this combined model, because, again, if you just then merge the two approaches without spending time on balancing, the result would be even worse than what they have now.

Steven Cherry Well, Marco, I think that’s a very natural human need to predict the future, to help us plan accordingly, and a very natural cultural need to understand where our fellow citizens stand and feel and think about the important issues that face us. Polling tries to meet those needs. And if it’s been on the wrong path these many years, I hope there’s a right path and hopefully you’re pointing the way to it. Thanks for your work and for joining us today.

Marco Varone Thank you. It was a pleasure.

Steven Cherry We’ve been speaking with Marco Varone, CTO of Expert.ai, about polling, prediction, social media, and natural language processing.

Radio Spectrum is brought to you by IEEE Spectrum, the member magazine of the Institute of Electrical and Electronics Engineers, a professional organization dedicated to advancing technology for the benefit of humanity.

This interview was recorded November 24, 2020. Our theme music is by Chad Crouch.

You can subscribe to Radio Spectrum on the Spectrum website, Spotify, Apple Podcasts, or wherever you get your podcasts. You can sign up for alerts or for our upcoming newsletter. And we welcome your feedback on the web or in social media.

For Radio Spectrum, I’m Steven Cherry.

Note: Transcripts are created for the convenience of our readers and listeners. The authoritative record of IEEE Spectrum’s audio programming is the audio version.

We welcome your comments on Twitter (@RadioSpectrum1 and @IEEESpectrum) and Facebook.

Noisy and Stressful? Or Noisy and Fun? Your Phone Can Tell the Difference

Post Syndicated from Tekla S. Perry original https://spectrum.ieee.org/view-from-the-valley/artificial-intelligence/embedded-ai/noisy-and-stressful-or-noisy-and-fun-your-phone-can-tell-the-difference

Smartphones for several years now have had the ability to listen non-stop for wake words, like “Hey Siri” and “OK Google,” without excessive battery usage. These wake-up systems run in special, low-power processors embedded within a phone’s larger chip set. They rely on algorithms trained on a neural network to recognize a broad spectrum of voices, accents, and speech patterns. But they only recognize their wake words; more generalized speech recognition algorithms require the involvement of a phone’s more powerful processors.

Today, Qualcomm announced that the Snapdragon 888 5G, its latest chipset for mobile devices, will incorporate an extra piece of software in the bit of semiconductor real estate that houses the wake word recognition engine. Created by Cambridge, U.K. startup Audio Analytic, the ai3-nano will use the Snapdragon’s low-power AI processor to listen for sounds beyond speech. Depending on the applications made available by smartphone manufacturers, the phones will be able to react to such sounds as a doorbell, water boiling, a baby’s cry, and fingers tapping on a keyboard—a library of some 50 sounds that is expected to grow to 150 to 200 in the near future.

The first application available for this sound recognition system will be what Audio Analytic calls Acoustic Scene Recognition AI. Instead of listening for just one sound, the scene recognition technology listens for the characteristics of all the ambient sounds to classify an environment as chaotic, lively, boring, or calm. Audio Analytic CEO and founder Chris Mitchell explains.

“There are two aspects to an environment,” he says, “eventfulness, which refers to how many individual sounds are going on, and how pleasant we find it. Say I went for a run, and there were lots of bird sounds. I would likely find that pleasant, so that would be categorized as ‘lively.’ You could also have an environment with a lot of sounds that are not pleasant. That would be ‘chaotic.’”

Mitchell’s team selected those four categories after reviewing studies about perceptions of sound. The team then used its custom-created dataset of 30 million audio recordings to train the neural network.

What a mobile device will do with its newfound awareness of ambient sounds will be up to the manufacturers that use the Qualcomm platform. But Mitchell has a few ideas.

“A train, for example, is boring,” he says. “So you might want to increase the active noise cancellation on your headphones to remove the typical low hum.  But when you get off the tube, you want more transparency—so you can hear bike messengers, so noise cancellation should be reduced. On a smartphone you could also adjust notifications based on the type of environment, whether it vibrates or rings, or what sort of ring tone is used.”

I first met Mitchell two years ago, when the company was demonstrating prototypes of how its audio analysis technology would work in smart speakers. Since then, Mitchell reports, products using the company’s technology are available in some 150 countries. Most are security and safety systems, recognizing the sound of breaking glass, a smoke alarm, or a baby’s cry.

Audio Analytic’s approach, Mitchell explained to me, involves using deep learning to break sounds into standard components. He uses the word “ideophones” to refer to these components. The term also refers to the representation of a sound in speech, like “quack.” Once sounds are coded as ideophones, each can be recognized just as digital assistants’ systems recognize their wake words. This approach allows the ai3-nano engine to take up just 40 KB and run completely on the phone without connecting to a cloud-based processor.

Once the technology is established in smartphones, Mitchell expects its applications will grow beyond security and scene recognition. Early instances, he expects, will include media tagging, games, and accessibility.

For media tagging, he says, the system can search phone-captured video by sound. So, for example, a parent can easily find a clip of a child laughing. Or children could use this technology in a game that has them make the sounds of an animal, say a duck or a pig. Then, upon completing the task, the display could put a virtual costume on them.

As for accessibility, Mitchell sees the technology as a boon to the hard of hearing, who already rely on mobile phones as assistive devices. “This can allow them to detect [and specifically identify] a knock on the door, a dog barking or a smoke alarm,” he says.

After rolling out additional sound recognition capabilities, the company expects to work next on identifying context beyond specific events or scenes. “We have started doing early stage research in that area,” he says. “So our system can say ‘It sounds like you are making breakfast’ or ‘It sounds like you are getting ready to leave the house.’” That would allow apps to take advantage of the information when arming a security system or adjusting lights or heat.

Incorporating security in code-reviews using Amazon CodeGuru Reviewer

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/incorporating-security-in-code-reviews-using-amazon-codeguru-reviewer/

Today, software development practices are constantly evolving to empower developers with tools to maintain a high bar of code quality. Amazon CodeGuru Reviewer offers this capability by carrying out automated code reviews for developers, based on trained machine learning models that can detect complex defects and provide intelligent, actionable recommendations to mitigate them. A quick overview of CodeGuru is covered in this blog post.

Security analysis is a critical part of a code review, and CodeGuru Reviewer offers this capability with a new set of security detectors. These security detectors are geared towards identifying security risks from the top 10 OWASP categories and ensure that your code follows best practices for AWS Key Management Service (AWS KMS), the Amazon Elastic Compute Cloud (Amazon EC2) API, and common Java crypto and TLS/SSL libraries. As of today, CodeGuru security analysis supports the Java language, so we use a Java application as our example.

In this post, we walk through the onboarding workflow to carry out the security analysis of a code repository and generate recommendations for a Java application.

 

Security workflow overview:

The new security workflow, introduced for CodeGuru Reviewer, utilizes the source code and build artifacts to generate recommendations. The security detector evaluates build artifacts to generate security-related recommendations, whereas other detectors continue to scan the source code. With the use of build artifacts for evaluation, the detector can carry out a whole-program, inter-procedural analysis to discover issues that span your code (for example, hardcoded credentials in one file that are passed to an API in another) and can reduce false positives by checking whether an execution path is valid. You must provide the source code .zip file as well as the build artifact .zip file for a complete analysis.

Customers can run a security scan when they create a repository analysis. CodeGuru Reviewer provides an additional option to get both code and security recommendations. As explained in the following sections, CodeGuru Reviewer will create an Amazon Simple Storage Service (Amazon S3) bucket in your AWS account for that region to upload or copy your source code and build artifacts for the analysis. This repository analysis option can be run on Java code from any repository.

 

Prerequisites

Prepare the source code and artifact zip files: If you do not have your Java code locally, download the source code that you want to evaluate for security and zip it. Similarly, if needed, download the build artifact .jar file for your source code and zip it. You must upload the source code and build artifact as separate .zip files, per the instructions in the following sections; even a single file (for example, a lone .jar file) must be zipped. If the .zip file includes multiple files, CodeGuru will discover and analyze the right ones. For our sample test, we will use src.zip and jar.zip, saved locally.
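If you prepare these archives from a script rather than by hand, a minimal Python sketch such as the following can produce src.zip and jar.zip from a local source directory and a built .jar file. The paths below are placeholders for illustration only:

import shutil
import zipfile

# Zip the source tree into src.zip (directory path is a placeholder).
shutil.make_archive("src", "zip", root_dir="my-java-app/src")

# Zip the single build artifact into jar.zip; even a lone .jar must be zipped.
with zipfile.ZipFile("jar.zip", "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.write("my-java-app/target/my-java-app.jar", arcname="my-java-app.jar")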

Creating an S3 bucket repository association:

This section summarizes the high-level steps to create the association of your S3 bucket repository.

1. On the CodeGuru console, choose Code reviews.

2. On the Repository analysis tab, choose Create repository analysis.

Screenshot of initiating the repository analysis

Figure: Screenshot of initiating the repository analysis

 

3. For the source code analysis, select Code and security recommendations.

4. For Repository name, enter a name for your repository.

5. Under Additional settings, for Code review name, enter a name for trackability purposes.

6. Choose Create S3 bucket and associate.

Screenshot to show selection of Security Code Analysis

Figure: Screenshot to show selection of Security Code Analysis

It takes a few seconds to create a new S3 bucket in the current Region. When it completes, you will see the following screen.

Screenshot for Create repository analysis showing S3 bucket created

Figure: Screenshot for Create repository analysis showing S3 bucket created

 

7. Choose the Upload to the S3 bucket option, then choose Upload source code zip file and select the zip file (src.zip) from your local machine to upload.

Screenshot of popup to upload code and artifacts from S3 bucket

Figure: Screenshot of popup to upload code and artifacts from S3 bucket

 

8. Similarly, choose Upload build artifacts zip file, select the zip file (jar.zip) from your local machine, and upload it.

 

Screenshot for Create repository analysis showing S3 paths populated

Figure: Screenshot for Create repository analysis showing S3 paths populated

 

Alternatively, you can upload the source code and build artifacts as zip files from any of your existing S3 buckets, as shown below.

9. Choose Browse S3 buckets for existing artifacts and upload from there as shown below:

 

Screenshot to upload code and artifacts from S3 bucket

Figure: Screenshot to upload code and artifacts from an existing S3 bucket
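If you prefer to stage these files with a script instead of the console upload, a minimal boto3 sketch like the one below can copy the two zip files into the bucket. The bucket name and object keys here are placeholders; use the bucket shown in the console or your own existing bucket:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket name; substitute the bucket created by CodeGuru Reviewer or an existing one.
bucket = "codeguru-reviewer-example-bucket"

# Upload the source code and build artifact archives prepared earlier.
s3.upload_file("src.zip", bucket, "src.zip")
s3.upload_file("jar.zip", bucket, "jar.zip")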

 

10. Now choose Create repository analysis to trigger the code review.

A new pending entry is created as shown below.

 

Screenshot of code review in Pending state

Figure: Screenshot of code review in Pending state

After a few minutes, you will see the generated recommendations, which include the security analysis. In the following case, 10 recommendations were generated.

Screenshot of repository analysis being completed

Figure: Screenshot of repository analysis being completed

 

For subsequent code reviews, you can use the same repository and upload new files, or create a new repository as shown below:

 

Screenshot of subsequent code review making repository selection

Figure: Screenshot of subsequent code review making repository selection

 

Recommendations

Apart from detecting security risks from the top 10 OWASP categories, the security detector finds deep security issues by analyzing data flow across multiple methods, procedures, and files.

The recommendations generated in the area of security are labeled as Security. In the example below, we see a recommendation to remove hard-coded credentials and a non-security-related recommendation about refactoring code for better maintainability.

Screenshot of Recommendations generated

Figure: Screenshot of Recommendations generated

 

Below is another example of recommendations pointing out a potential resource leak as well as a security issue highlighting the risk of a path traversal attack.

Screenshot of deep security recommendations

Figure: More examples of deep security recommendations
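The recommendations can also be retrieved programmatically after the code review completes. Below is a minimal sketch using the boto3 codeguru-reviewer client and its ListRecommendations API; the code review ARN is a placeholder, and the exact response fields you print may vary slightly:

import boto3

client = boto3.client("codeguru-reviewer")

# Placeholder ARN of the completed code review (visible in the console).
code_review_arn = "arn:aws:codeguru-reviewer:us-east-1:111122223333:code-review/example"

response = client.list_recommendations(CodeReviewArn=code_review_arn)
for rec in response["RecommendationSummaries"]:
    # Print the file, start line, and description for each recommendation.
    print(rec["FilePath"], rec["StartLine"], rec["Description"])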

 

As this blog is focused on onboarding aspects, we will cover the recommendations themselves in more detail in a separate blog.

Disassociation of Repository (optional):

The association of CodeGuru with the S3 bucket repository can be removed by following the steps below. Navigate to the Repositories page, select the repository, and choose Disassociate repository.

Screenshot of disassociating the S3 bucket repo with CodeGuru

Figure: Screenshot of disassociating the S3 bucket repo with CodeGuru
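The same disassociation can be done programmatically. Here is a minimal boto3 sketch using the DisassociateRepository API; the association ARN is a placeholder that you can find on the Repositories page or via ListRepositoryAssociations:

import boto3

client = boto3.client("codeguru-reviewer")

# Placeholder ARN of the repository association to remove.
association_arn = "arn:aws:codeguru-reviewer:us-east-1:111122223333:association/example"

client.disassociate_repository(AssociationArn=association_arn)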

 

Conclusion

This post reviewed the onboarding workflow for carrying out security analysis in CodeGuru Reviewer. We initiated a full repository analysis for Java code using a separate UI workflow and generated recommendations.

We hope this post was useful and enables you to conduct code analysis using Amazon CodeGuru Reviewer.

 

About the Author

Author's profile photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing on DevOps services. He builds technical content for field enablement and offers technical guidance to customers on AWS DevOps solutions and services that streamline the application development process, accelerate application delivery, and help maintain a high bar of software quality.

Tightening application security with Amazon CodeGuru

Post Syndicated from Brian Farnhill original https://aws.amazon.com/blogs/devops/tightening-application-security-with-amazon-codeguru/

Amazon CodeGuru is a developer tool powered by machine learning (ML) that provides intelligent recommendations for improving code quality and identifies an application’s most expensive lines of code. To help you find and remediate potential security issues in your code, Amazon CodeGuru Reviewer now includes an expanded set of security detectors. In this post, we discuss the new types of security issues CodeGuru Reviewer can detect.

Time to read: 9 minutes
Services used: Amazon CodeGuru

The new security detectors are now a feature in CodeGuru Reviewer for Java applications. These detectors focus on finding security issues in your code before you deploy it. They extend CodeGuru Reviewer by providing additional security-specific recommendations on top of the application improvements it already recommends. When an issue is detected, a remediation recommendation and explanation are generated, so you can find and remediate issues before the code is deployed. These findings can help in addressing the OWASP top 10 web application security risks, with many of the recommendations based on specific issues customers have had in this space.

You can run a security scan by creating a repository analysis. CodeGuru Reviewer now provides an additional option to get both code and security recommendations for Java codebases. Selecting this option enables you to find potential security vulnerabilities before they are promoted to production and helps keep users of your service secure.

Types of security issues CodeGuru Reviewer detects

Previously, CodeGuru Reviewer helped address security by detecting potential sensitive information leaks (such as personally identifiable information or credit card numbers). The additional CodeGuru Reviewer security detectors expand on this by addressing:

  • AWS API security best practices – Helps you follow security best practices when using AWS APIs, such as avoiding hard-coded credentials in API calls
  • Java crypto library best practices – Identifies when you’re not using best practices for common Java cryptography libraries, such as avoiding outdated cryptographic ciphers
  • Secure web applications – Inspects code for insecure handling of untrusted data, such as not sanitizing user-supplied input to protect against cross-site scripting, SQL injection, LDAP injection, path traversal injection, and more
  • AWS Security best practices – Developed in collaboration with AWS Security, these best practices help bring our internal expertise to customers

Examples of new security findings

The following are examples of findings that CodeGuru Reviewer security detectors can now help you identify and resolve.

AWS API security best practices

AWS API security best practice detectors inspect your code to identify issues that can be caused by not following best practices related to AWS SDKs and APIs. An example of a detected issue in this category is using hard-coded AWS credentials. Consider the following code:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;

static String myKeyId = "AKIAX742FUDUQXXXXXXX";
static String mySecretKey = "MySecretKey";

public static void main(String[] args) {
    AWSCredentials creds = getCreds(myKeyId, mySecretKey);
}

static AWSCredentials getCreds(String id, String key) {
    return new BasicAWSCredentials(id, key);
}

In this code, the variables myKeyId and mySecretKey are hard-coded in the application. This may have been done to move quickly, but it can also lead to these values being discovered and misused.

In this case, CodeGuru Reviewer recommends using environment variables or an AWS profile to store these values, because these can be retrieved at runtime and aren’t stored inside the application (or its source code). Here you can see an example of what this finding looks like in the console:

An example of the CodeGuru reviewer finding for IAM credentials in the AWS console

The recommendation suggests using environment variables or an AWS profile instead, and that after you delete or rotate the affected key you monitor it with CloudWatch for any attempted use. Following the learn more link, you’ll see additional detail and recommended approaches for remediation, such as using the DefaultAWSCredentialsProviderChain. An example of how to remediate this in the preceding code is to update the getCreds() function:

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;

static AWSCredentials getCreds() {
    DefaultAWSCredentialsProviderChain creds =
        new DefaultAWSCredentialsProviderChain();
    return creds.getCredentials();
}

Java crypto library best practices

When working with data that must be protected, cryptography provides mechanisms to encrypt and decrypt the information. However, to ensure the security of this data, the application must use a strong and modern cipher. Consider the following code:

import java.security.GeneralSecurityException;
import javax.crypto.Cipher;

static final String CIPHER = "DES";

public void run() throws GeneralSecurityException {
    Cipher cipher = Cipher.getInstance(CIPHER);
}

A cipher object is created with the DES algorithm. CodeGuru Reviewer recommends a stronger cipher to help protect your data. This is what the recommendation looks like in the console:

An example of the CodeGuru reviewer finding for encryption ciphers in the AWS console

Based on this, one example of how to address this is to substitute a different cipher:

static final String CIPHER = "RSA/ECB/OAEPPadding";

This is just one option for how it could be addressed. The CodeGuru Reviewer recommendation text suggests several options, and a link to documentation to help you choose the best cipher.

Secure web applications

When working with sensitive information in cookies, such as temporary session credentials, those values must be protected from interception. This is done by flagging the cookies as secure, which prevents them from being sent over an unsecured HTTP connection. Consider the following code:

import javax.servlet.http.Cookie;

public static void createCookie() {
    Cookie cookie = new Cookie("name", "value");
}

In this code, a new cookie is created that is not marked as secure. CodeGuru Reviewer notifies you that you could make a correction by adding:

cookie.setSecure(true);

This screenshot shows you an example of what the finding looks like.

An example CodeGuru finding that shows how to ensure cookies are secured.

AWS Security best practices

This category of detectors has been built in collaboration with AWS Security and assists in detecting many other issue types. Consider the following code, which illustrates how a string can be re-encrypted with a new key from AWS Key Management Service (AWS KMS):

import java.nio.ByteBuffer;
import com.amazonaws.services.kms.AWSKMS;
import com.amazonaws.services.kms.AWSKMSClientBuilder;
import com.amazonaws.services.kms.model.DecryptRequest;
import com.amazonaws.services.kms.model.EncryptRequest;

AWSKMS client = AWSKMSClientBuilder.standard().build();
ByteBuffer sourceCipherTextBlob = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 0});

DecryptRequest req = new DecryptRequest()
                         .withCiphertextBlob(sourceCipherTextBlob);
ByteBuffer plainText = client.decrypt(req).getPlaintext();

EncryptRequest res = new EncryptRequest()
                         .withKeyId("NewKeyId")
                         .withPlaintext(plainText);
ByteBuffer ciphertext = client.encrypt(res).getCiphertextBlob();

This approach puts the decrypted value at risk by decrypting and re-encrypting it locally. CodeGuru Reviewer recommends using the ReEncrypt method—performed on the server side within AWS KMS—to avoid exposing your plaintext outside AWS KMS. A solution that uses the ReEncrypt object looks like the following code:

import com.amazonaws.services.kms.model.ReEncryptRequest;

ReEncryptRequest req = new ReEncryptRequest()
                           .withCiphertextBlob(sourceCipherTextBlob)
                           .withDestinationKeyId("NewKeyId");

client.reEncrypt(req).getCiphertextBlob();

This screenshot shows you an example of what the finding looks like.

An example CodeGuru finding to show how to avoid decrypting and encrypting locally when it's not needed

Detecting issues deep in application code

Detecting security issues can be made more complex when the contributing code is spread across multiple methods, procedures, and files. This separation of code helps humans work in more manageable ways, but it obscures the end-to-end view of what is happening for a person reading the code. That obscurity makes it harder, or even impossible, to find complex security issues. CodeGuru Reviewer can see issues regardless of these boundaries, deeply assessing code and the flow of the application to find security issues throughout the application. An example of this depth exists in the code below:

import java.io.IOException;
import java.io.UnsupportedEncodingException;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

private String decode(final String val, final String enc) {
    try {
        return java.net.URLDecoder.decode(val, enc);
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return "";
}

public void pathTraversal(HttpServletRequest request) throws IOException {
    javax.servlet.http.Cookie[] theCookies = request.getCookies();
    String path = "";
    if (theCookies != null) {
        for (javax.servlet.http.Cookie theCookie : theCookies) {
            if (theCookie.getName().equals("thePath")) {
                path = decode(theCookie.getValue(), "UTF-8");
                break;
            }
        }
    }
    if (!path.equals("")) {
        String fileName = path + ".txt";
        String decStr = new String(org.apache.commons.codec.binary.Base64.decodeBase64(
            org.apache.commons.codec.binary.Base64.encodeBase64(fileName.getBytes())));
        java.io.FileOutputStream fileOutputStream = new java.io.FileOutputStream(decStr);
        java.io.FileDescriptor fd = fileOutputStream.getFD();
        System.out.println(fd.toString());
    }
}

This code presents an issue around path traversal, specifically relating to the Broken Access Control rule in the OWASP top 10 (specifically CWE 22). The issue is that a FileOutputStream is being created using an external input (in this case, a cookie) and the input is not being checked for invalid values that could traverse the file system. To add to the complexity of this sample, the input is encoded and decoded from Base64 so that the cookie value isn’t passed directly to the FileOutputStream constructor, and the parsing of the cookie happens in a different function. This is not something you would do in the real world, as it is needlessly complex, but it shows the need for tools that can deeply analyze the flow of data in an application. Here the value passed to the FileOutputStream isn’t an external value; it is the result of the encode/decode line and, as such, is a new object. However, CodeGuru Reviewer follows the flow of the application to understand that the input still came from a cookie, and as such it should be treated as an external value that needs to be validated. An example of a fix for the issue here would be to replace the pathTraversal function with the sample shown below:

static final String VALID_PATH1 = "./test/file1.txt";
static final String VALID_PATH2 = "./test/file2.txt";
static final String DEFAULT_VALID_PATH = "./test/file3.txt";

public void pathTraversal(HttpServletRequest request) throws IOException {
    javax.servlet.http.Cookie[] theCookies = request.getCookies();
    String path = "";
    if (theCookies != null) {
        for (javax.servlet.http.Cookie theCookie : theCookies) {
            if (theCookie.getName().equals("thePath")) {
                path = decode(theCookie.getValue(), "UTF-8");
                break;
            }
        }
    }
    String fileName = "";
    if (!path.equals("")) {
        if (path.equals(VALID_PATH1)) {
            fileName = VALID_PATH1;
        } else if (path.equals(VALID_PATH2)) {
            fileName = VALID_PATH2;
        } else {
            fileName = DEFAULT_VALID_PATH;
        }
        String decStr = new String(org.apache.commons.codec.binary.Base64.decodeBase64(
            org.apache.commons.codec.binary.Base64.encodeBase64(fileName.getBytes())));
        java.io.FileOutputStream fileOutputStream = new java.io.FileOutputStream(decStr);
        java.io.FileDescriptor fd = fileOutputStream.getFD();
        System.out.println(fd.toString());
    }
}

The main difference in this sample is that the path variable is tested against known good values that would prevent path traversal, and if one of the two valid path options isn’t provided, the third default option is used. In all cases the externally provided path is validated to ensure that there isn’t a path through the code that allows for path traversal to occur in the subsequent call. As with the first sample, the path is still encoded/decoded to make it more complicated to follow the flow through, but the deep analysis performed by CodeGuru Reviewer can follow this and provide meaningful insights to help ensure the security of your applications.

Extending the value of CodeGuru Reviewer

CodeGuru Reviewer already recommends different types of fixes for your Java code, such as concurrency and resource leaks. With these new categories, CodeGuru Reviewer can let you know about security issues as well, bringing further improvements to your applications’ code. The new security detectors operate in the same way that the existing detectors do, using static code analysis and ML to provide high confidence results. This can help avoid signaling non-issue findings to developers, which can waste time and erode trust in the tool.

You can provide feedback on recommendations in the CodeGuru Reviewer console or by commenting on the code in a pull request. This feedback helps improve the performance of the reviewer, so the recommendations you see get better over time.

Conclusion

Security issues can be difficult to identify and can impact your applications significantly. CodeGuru Reviewer security detectors help make sure you’re following security best practices while you build applications.

CodeGuru Reviewer is available for you to try. For full repository analysis, the first 30,000 lines of code analyzed each month per payer account are free. For pull request analysis, we offer a 90 day free trial for new customers. Please check the pricing page for more details. For more information, see Getting started with CodeGuru Reviewer.

About the author

Brian Farnhill

Brian Farnhill is a Developer Specialist Solutions Architect in the Australian Public Sector team. His background is building solutions and helping customers improve DevOps tools and processes. When he isn’t working, you’ll find him either coding for fun or playing online games.

Amazon Lookout for Vision – New ML Service Simplifies Defect Detection for Manufacturing

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/amazon-lookout-for-vision-new-machine-learning-service-that-simplifies-defect-detection-for-manufacturing/

Today, I’m excited to announce Amazon Lookout for Vision, a new machine learning (ML) service that helps customers in industrial environments to detect visual defects on production units and equipment in an easy and cost-effective way.

Can you spot the circuit board with the defect in these images?

Image of 3 circuit boards - one is faulty

Maybe you can if you are familiar with circuit boards, but I have to say that it took me a while to discover the error. Humans, when properly trained and well rested, are great at finding anomalies in a set of objects. However, when they are tired or not properly trained – like me in this example – they can be slow, prone to errors, and inconsistent.

That’s why many companies use machine vision technologies to detect anomalies. However, these technologies need to be calibrated with controlled lighting and camera viewpoints. In addition, you need to specify hard-coded rules that define what is a defect and what is not, making the technologies very specialized and complex to build.

Lookout for Vision is a new machine learning service that helps increase industrial product quality and reduce operational costs by automating visual inspection of product defects across production processes. Lookout for Vision uses deep learning models to replace hard-coded rules and handles the differences in camera angle, lighting and other challenges that arise from the operational environment. With Lookout for Vision, you can reduce the need for carefully controlled environments.

Using Lookout for Vision, you can detect damages to manufactured parts, identify missing components or parts, and uncover underlying process-related issues in your manufacturing lines.

How to Get Started With Lookout for Vision
The first thing I want to mention is that to use Lookout for Vision, you don’t need to be a machine learning expert. Lookout for Vision is a fully managed service and comes with anomaly detection models that can be optimized for your use case and your data.

There are several steps for using Lookout for Vision. The first is preparing the dataset, which includes creating a dataset of images and adding labels to the images. Then, Lookout for Vision uses this dataset to automatically train the ML model that learns to detect anomalies in your product. The final part is using the model in production. You can keep evaluating the performance of your trained model and improve it at any time using tools that Lookout for Vision provides.

Service console tutorial for getting started

Preparing the Data
To get started with the model, you first need a set of images of your product. For better results, include images with normal content (no defects) and anomalous content (with defects). To get started with training, you will need at least 20 normal images and 10 anomalous images.

There are many ways of importing images into Lookout for Vision from the AWS Management Console: You can provide manifests for annotated images using the Amazon SageMaker Ground Truth service, provide images from an S3 bucket or upload directly from your computer.

Different ways to import your images.

After you upload the images, you need to add labels to classify the images in your dataset as normal or anomalous. Labeling is a very important step, as this is the key information that Lookout for Vision uses to train the model for your use case.

For this demo, I import the images from an S3 bucket. If you’ve organized the images in your S3 bucket by folder name (/anomaly/01.jpeg), Lookout for Vision will automatically import the folder structure into corresponding labels.
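If your images are still on a local machine, a small script can stage them under normal/ and anomaly/ prefixes so that the labels are picked up automatically on import. This is a minimal boto3 sketch; the bucket name and local folders are placeholders:

import pathlib
import boto3

s3 = boto3.client("s3")
bucket = "lookout-vision-example-bucket"  # placeholder bucket name

# Upload local images under prefixes that match the intended labels.
for label in ("normal", "anomaly"):
    for image in pathlib.Path(f"circuit-boards/{label}").glob("*.jpeg"):
        s3.upload_file(str(image), bucket, f"{label}/{image.name}")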

Training the Model
When our dataset is ready, we need to train our model with it. The training button is enabled once you have the minimum number of labeled images: 20 normal and 10 anomalous.

Depending on the size of the dataset, training may take a while to complete: for me, it took around an hour to train the model with 100 images. Note that you will begin incurring costs when Lookout for Vision starts to actually train the model. After training is complete, your model is ready to detect anomalies in new images.

Screenshot of a model in training.

Evaluating the Model
There are a couple of ways to evaluate whether your model is ready to be deployed to production. The first is to review the performance metrics of the model, and the second is to run some production-like tests that help you verify whether the model is ready to be deployed.

There are three main performance metrics: precision, recall, and the F1 score. Precision measures the percentage of the model’s anomaly predictions that are correct, recall measures the percentage of true defects the model identified, and the F1 score combines precision and recall into a single measure of overall model performance.

Screenshot of model performance metrics

If you want to run some production-like tests to verify if your model is ready, use the run trial detection feature. This will enable you to run your Lookout for Vision model and predict anomalies on new images. You can further improve the model by manually verifying the results and adding new training images.

Create a new job to predict anomalies.

I used the three images that appear at the beginning of this post for my trial detection. The trial detection job ran for 15-20 minutes, and after that Lookout for Vision used the trained model to classify the images into “Normal” and “Anomaly.” When Lookout for Vision finalizes the trial detection job, you can verify the results as correct or incorrect and add these images to the dataset.

Screenshot verifying the results of the trial

Using the Model in Production
To use Lookout for Vision, you need to integrate the AWS SDKs or CLI in the systems that are processing the images of the products in the manufacturing line, and internet connectivity is required for this to work. The first thing you need to do is to start the model. When using Lookout for Vision, you are billed for the time your model is running and making inferences. For example, if you start your model at 8 a.m. and stop it at 5 p.m., you will be billed for 9 hours.

# Example CLI
aws lookoutvision start-model \
    --project-name circuitBoard \
    --model-version 1 \
    --additional-output-config "Bucket=<OUTPUT_BUCKET>,Prefix=<PREFIX_KEY>" \
    --min-anomaly-detection-units 10

# Example response
{ "status" : "STARTING_HOSTING" }

When your model is ready, you can call the detect-anomalies API from Lookout for Vision.

# Example CLI
aws lookoutvision detect-anomalies \
    --project-name circuitBoard \
    --model-version 1

This API returns a JSON response that shows whether the image is an anomaly, along with the confidence level of that prediction.

{
    "DetectAnomalyResult": {
        "Source": {
            "Type": "direct"
        },
        "IsAnomalous": true,
        "Confidence": 0.97
    }
}
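From application code, the same call can be made through an AWS SDK instead of the CLI. Here is a minimal boto3 sketch that mirrors the CLI example above; the image path is a placeholder:

import boto3

lookoutvision = boto3.client("lookoutvision")

# Placeholder path to an image captured on the production line.
with open("captures/board-001.jpeg", "rb") as image:
    response = lookoutvision.detect_anomalies(
        ProjectName="circuitBoard",
        ModelVersion="1",
        ContentType="image/jpeg",
        Body=image.read(),
    )

result = response["DetectAnomalyResult"]
print("Anomalous:", result["IsAnomalous"], "confidence:", result["Confidence"])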

When you are done with detecting anomalies for the day, use the stop-model API. In the Lookout for Vision service console you can find code snippets on how to use these APIs.

When you are using Lookout for Vision in production, you’ll find a dashboard that helps you sort and track production lines by the most defective line, the line with the most recent defects, and the line with the highest anomaly ratio.

Available Today
Lookout for Vision is available in all AWS Regions.

To get started with Amazon Lookout for Vision, visit the service page today.

Marcia

New – Amazon Lookout for Equipment Analyzes Sensor Data to Help Detect Equipment Failure

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-amazon-lookout-for-equipment-analyzes-sensor-data-to-help-detect-equipment-failure/

Companies that operate industrial equipment are constantly working to improve operational efficiency and avoid unplanned downtime due to component failure. They invest heavily and repeatedly in physical sensors (tags), data connectivity, data storage, and dashboards to monitor the condition of their equipment and get real-time alerts. The primary data analysis methods are single-variable threshold and physics-based modeling approaches, and while these methods are effective in detecting specific failure types and operating conditions, they can often miss important information that could be revealed only by deriving multivariate relationships for each piece of equipment.

With machine learning, more powerful technologies have become available that can provide data-driven models that learn from an equipment’s historical data. However, implementing such machine learning solutions is time-consuming and expensive owing to capital investment and training of engineers.

Today, we are happy to announce Amazon Lookout for Equipment, an API-based machine learning (ML) service that detects abnormal equipment behavior. With Lookout for Equipment, customers can bring in historical time series data and past maintenance events generated from industrial equipment that can have up to 300 data tags from components such as sensors and actuators per model. Lookout for Equipment automatically tests the possible combinations and builds an optimal machine learning model to learn the normal behavior of the equipment. Engineers don’t need machine learning expertise and can easily deploy models for real-time processing in the cloud.

Customers can then easily perform ML inference to detect abnormal behavior of the equipment. The results can be integrated into existing monitoring software or AWS IoT SiteWise Monitor to visualize the real-time output or to receive alerts if an asset tends toward anomalous conditions.

How Lookout for Equipment Works
Lookout for Equipment reads directly from Amazon S3 buckets. Customers can publish their industrial data in S3 and leverage Lookout for Equipment for model development. A user determines the value or time period to be used for training and assigns an appropriate label. Given this information, Lookout for Equipment launches a task to learn and creates the best ML model for each customer.

Because Lookout for Equipment is an automated machine learning tool, it gets smarter over time as users retrain their models with new data. This is useful for re-creating a model when new, previously unseen failures occur, or when the model drifts over time. Once the model is complete and ready for inference, Lookout for Equipment provides real-time analysis.

With the equipment data published to S3, the user can schedule inference at a frequency ranging from 5 minutes to one hour. When new data arrives in S3, Lookout for Equipment fetches it on the desired schedule, performs inference, and stores the results in another S3 bucket.
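Because the inference results are written back to S3, downstream systems can simply poll the output bucket. A minimal boto3 sketch of that pattern follows; the bucket name and prefix are placeholders and depend on how you configure the output location in step 5 below:

import boto3

s3 = boto3.client("s3")

# Placeholder output location configured for the inference scheduler.
bucket = "lookout-equipment-example-output"
prefix = "inference-results/"

# List the result objects written by Lookout for Equipment.
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["LastModified"])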

Set up Lookout for Equipment with these simple steps:

  1. Upload data to S3 buckets
  2. Create datasets
  3. Ingest data
  4. Create a model
  5. Schedule inference (if you need real-time analysis)

1. Upload data
You need to upload tag data from equipment to any S3 bucket.

2. Create Datasets

Select Create dataset, set the Dataset name, and set the Data Schema. The data schema is like a data design document that defines the data to be fed in later. Then select Create.

creating datasets console

3. Ingest data
After a dataset is created, the next step is to ingest data. If you are familiar with Amazon Personalize or Amazon Forecast, doesn’t this screen feel familiar? Yes, Lookout for Equipment is as easy to use as those are.

Select Ingest data.

Ingesting data console

Specify the S3 bucket location where you uploaded your data, and an IAM role. The IAM role has to have a trust relationship with "lookoutequipment.amazonaws.com". You can use the following policy file for the test.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lookoutequipment.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
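If you prefer to create such a role from a script rather than the IAM console, a minimal boto3 sketch is shown below. The role name is a placeholder, and you would still attach a permissions policy granting access to your S3 bucket:

import json
import boto3

iam = boto3.client("iam")

# Trust policy from above, allowing Lookout for Equipment to assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lookoutequipment.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

role = iam.create_role(
    RoleName="LookoutEquipmentIngestionRole",  # placeholder role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
print(role["Role"]["Arn"])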

The data format in the S3 bucket has to match the Data Schema you set up in step 2. Please check our technical documents for more detail. Ingesting data takes a few minutes to tens of minutes depending on your data volume.

4. Create a model
After data ingestion is complete, you can train your own ML model. Select Create new model. Fields shows a list of fields in the ingested data. By default, no field is selected. You can select the fields you want Lookout for Equipment to learn from. Lookout for Equipment automatically finds and trains on correlations across the specified fields and creates a model.

Image illustrates setting up fields.

If you know that your data includes some unusual periods, you can optionally set windows to exclude that data from training.

setting up maintenance window

Optionally, you can divide the ingested data into a training period and an evaluation period. The data in the evaluation period is checked against the trained model.

setting up evaluation window

Once you select Create, Lookout for Equipment starts to train your model. This process takes minutes to hours depending on your data volume. After training is finished, you can evaluate your model with the evaluation period data.

model performance console

5. Schedule Inference
Now it is time to analyze your real-time data. Select Schedule Inference, and set up your S3 buckets for input.

setting up input S3 bucket

You can also set the Data upload frequency, which is effectively the inference frequency, and the Offset delay time. Then, you need to set up Output data, where Lookout for Equipment writes the results of inference.

setting up inferenced output S3 bucket

Amazon Lookout for Equipment is In Preview Today
Amazon Lookout for Equipment is in preview today in US East (N. Virginia), Asia Pacific (Seoul), and Europe (Ireland), and you can see the documentation here.

– Kame

Amazon Monitron, a Simple and Cost-Effective Service Enabling Predictive Maintenance

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-monitron-a-simple-cost-effective-service-enabling-predictive-maintenance/

Today, I’m extremely happy to announce Amazon Monitron, a condition monitoring service that detects potential failures and allows users to track developing faults, enabling you to implement predictive maintenance and reduce unplanned downtime.

True story: A few months ago, I bought a new washing machine. As the delivery man was installing it in my basement, we were chatting about how unreliable these things seemed to be nowadays; never lasting more than a few years. As the gentleman made his way out, I pointed to my aging and poorly maintained water heater, telling him that I had decided to replace it in the coming weeks and that he’d be back soon to install a new one. Believe it or not, it broke down the next day. You can laugh at me, it’s OK. I deserve it for not planning ahead.

As annoying as this minor domestic episode was, it’s absolutely nothing compared to the tremendous loss of time and money caused by the unexpected failure of machines located in industrial environments, such as manufacturing production lines and warehouses. Any proverbial grain of sand can cause unplanned outages, and Murphy’s Law has taught us that they’re likely to happen in the worst possible configuration and at the worst possible time, resulting in severe business impacts.

To avoid breakdowns, reliability managers and maintenance technicians often combine four strategies:

  1. Run to failure: where equipment is operated without maintenance until it no longer operates reliably. When the repair is completed, equipment is returned to service; however, the condition of the equipment is unknown and failure is uncontrolled.
  2. Planned maintenance: where predefined maintenance activities are performed on a periodic or meter basis, regardless of condition. The effectiveness of planned maintenance activities is dependent on the quality of the maintenance instructions and planned cycle. It risks equipment being both over- and under-maintained, incurring unnecessary cost or still experiencing breakdowns.
  3. Condition-based maintenance: where maintenance is completed when the condition of a monitored component breaches a defined threshold. Monitoring physical characteristics such as tolerance, vibration or temperature is a more optimal strategy, requiring less maintenance and reducing maintenance costs.
  4. Predictive maintenance: where the condition of components is monitored, potential failures detected and developing faults tracked. Maintenance is planned at a time in the future prior to expected failure and when the total cost of maintenance is most cost-effective.

Condition-based maintenance and predictive maintenance require sensors to be installed on critical equipment. These sensors measure and capture physical quantities such as temperature and vibration, whose change is a leading indicator of a potential failure or a deteriorating condition.

As you can guess, building and deploying such maintenance systems can be a long, complex, and costly project involving bespoke hardware, software, infrastructure, and processes. Our customers asked us for help, and we got to work.

Introducing Amazon Monitron
Amazon Monitron is an easy and cost-effective condition monitoring service that allows you to monitor the condition of equipment in your facilities, enabling the implementation of a predictive maintenance program.

Illustration

Setting up Amazon Monitron is extremely simple. You first install Monitron sensors that capture vibration and temperature data from rotating machines, such as bearings, gearboxes, motors, pumps, compressors, and fans. Sensors send vibration and temperature measurements hourly to a nearby Monitron gateway, using Bluetooth Low Energy (BLE) technology that allows the sensors to run for at least three years. The Monitron gateway is itself connected to your WiFi network, and sends sensor data to AWS, where it is stored and analyzed using machine learning and ISO 20816 vibration standards.

As communication is infrequent, up to 20 sensors can be connected to a single gateway, which can be located up to 30 meters away (depending on potential interference). Thanks to the scalability and cost efficiency of Amazon Monitron, you can deploy as many sensors as you need, including on pieces of equipment that until now weren’t deemed critical enough to justify the cost of traditional sensors. As with any data-driven application, security is our No. 1 priority. The Monitron service authenticates the gateway and the sensors to make sure that they’re legitimate. Data is also encrypted end-to-end, without any decryption taking place on the gateway.

Setting up your gateways and sensors only requires installing the Monitron mobile application on an Android mobile device with Bluetooth support for gateway setup, and NFC support for sensor setup. This is an extremely simple process, and you’ll be monitoring in minutes. Technicians will also use the mobile application to receive alerts indicating abnormal machine conditions. They can acknowledge these alerts and provide feedback to improve their accuracy (say, to minimize false alerts and missed anomalies).

Customers are already using Amazon Monitron today, and here are a couple of examples.

Fender Musical Instruments Corporation is an iconic brand and a leading manufacturer of stringed instruments and amplifiers. Here’s what Bill Holmes, Global Director of Facilities at Fender, told us: “Over the past year we have partnered with AWS to help develop a critical but sometimes overlooked part of running a successful manufacturing business which is knowing the condition of your equipment. For manufacturers worldwide, uptime of equipment is the only way we can remain competitive with a global market. Ensuring equipment is up and running and not being surprised by sudden breakdowns helps get the most out of our equipment. Unplanned downtime is costly both in loss of production and labor due to the firefighting nature of the breakdown. The Amazon Monitron condition monitoring system has the potential of giving both large industry as well as small ‘mom and pop shops’ the ability to predict failures of their equipment before a catastrophic breakdown shuts them down. This will allow for a scheduled repair of failing equipment before it breaks down.

GE Gas Power is a leading provider of power generation equipment, solutions and services. It operates many manufacturing sites around the world, in which much of the manufacturing equipment is not connected nor monitored for health. Magnus Akesson, CIO at GE Gas Power Manufacturing, says: “Naturally, we can reduce both maintenance costs and downtime, if we can easily and cheaply connect and monitor these assets at scale. Additionally, we want to take advantage of advanced algorithms to look forward, to know not just the current state but also predict future health and to detect abnormal behaviors. This will allow us to transition from time-based to predictive and prescriptive maintenance practices. Using Amazon Monitron, we are now able to quickly retrofit our assets with sensors and connecting them to real-time analytics in the AWS cloud. We can do this without having to require deep technical skills or having to configure our own IT and OT networks. From our initial work on vibration-prone tumblers, we are seeing this vision come to life at an amazing speed: the ease-of-use for the operators and maintenance team, the simplicity, and the ability to implement at scale is extremely attractive to GE. During our pilot, we were also delighted to see one-click capabilities for updating the sensors via remote Over the Air (OTA) firmware upgrades, without having to physically touch the sensors. As we grow in scale, this is a critical capability in order to be able to support and maintain the fleet of sensors.”

Now, let me show you how to get started with Amazon Monitron.

Setting up Amazon Monitron
First, I open the Monitron console. In just a few clicks, I create a project, and an administrative user allowed to manage it. Using a link provided in the console, I download and install the Monitron mobile application on my Android phone. Opening the app, I log in using my administrative credentials.

The first step is to create a site describing assets, sensors, and gateways. I name it “my-thor-project.”

Application screenshot

Let’s add a gateway. Enabling Bluetooth on my phone, I press the pairing button on the gateway.

Application screenshot

The name of the gateway appears immediately.

Application screenshot

I select the gateway, and I configure it with my WiFi credentials to let it connect to AWS. A few seconds later, the gateway is online.

Application screenshot

My next step is to create an asset that I’d like to monitor, say a process water pump set with a motor and a pump. I first create the asset itself, simply defining its name and the appropriate ISO 20816 class (a standard for measurement and evaluation of machine vibration).

Application screenshot

Then, I add a sensor for the motor.

Application screenshot

I start by physically attaching the sensor to the motor using the suggested adhesive. Next, I specify a sensor position, enable the NFC on my smartphone, and tap the Monitron sensor that I attached to the motor with my phone. Within seconds, the sensor is commissioned.

Application screenshot

I repeat the same operation for the pump. Looking at my asset, I see that both sensors are operational.

Application screenshot

They are now capturing temperature and vibration information. Although there isn’t much to see for the moment, graphs are available in the mobile app.

Application screenshot

Over time, the gateway will keep sending this data securely to AWS, where it will be analyzed for early signs of failure. Should either of my assets exhibit these, I would receive an alert in the mobile application, where I could visualize historical data, and decide what the best course of action would be.

Getting Started
As you can see, Monitron makes it easy to deploy sensors enabling predictive maintenance applications. The service is available today in the US East (N. Virginia) region, and using it costs $50 per sensor per year.

If you’d like to evaluate the service, the Monitron Starter Kit includes everything you need (a gateway with a mounting kit, five sensors, and a power supply), and it’s available for $715. Then, you can scale your deployment with additional sensors, which you can buy in 5-packs for $575.

Starter kit picture

Give Amazon Monitron a try, and let us know what you think. We’re always looking forward to your feedback, either through your usual AWS support contacts, or on the AWS Forum for Monitron.

– Julien

Special thanks to my colleague Dave Manley for taking the time to educate me on industrial maintenance operations.

New- Amazon DevOps Guru Helps Identify Application Errors and Fixes

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/amazon-devops-guru-machine-learning-powered-service-identifies-application-errors-and-fixes/

Today, we are announcing Amazon DevOps Guru, a fully managed operations service that makes it easy for developers and operators to improve application availability by automatically detecting operational issues and recommending fixes. DevOps Guru applies machine learning informed by years of operational excellence from Amazon.com and Amazon Web Services (AWS) to automatically collect and analyze data such as application metrics, logs, and events to identify behavior that deviates from normal operational patterns.

Once a behavior is identified as an operational problem or risk, DevOps Guru alerts developers and operators to the details of the problem so they can quickly understand the scope of the problem and possible causes. DevOps Guru provides intelligent recommendations for fixing problems, saving you time resolving them. With DevOps Guru, there is no hardware or software to deploy, and you only pay for the data analyzed; there is no upfront cost or commitment.

Distributed/Complex Architecture and Operational Excellence
As applications become more distributed and complex, operators need more automated practices to maintain application availability and reduce the time and effort spent on detecting, debugging, and resolving operational issues. Application downtime, for example, as caused by misconfiguration, unbalanced container clusters, or resource depletion, can result in significant revenue loss to an enterprise.

In many cases, companies must invest developer time in deploying and managing multiple monitoring tools, such as metrics, logs, traces, and events, and storing them in various locations for analysis. Developers or operators also spend time developing and maintaining custom alarms to alert them to issues such as sudden spikes in load balancer errors or unusual drops in application request rates. When a problem occurs, operators receive multiple alerts related to the same issue and spend time combining alerts to prioritize those that need immediate attention.

How DevOps Guru Works
The DevOps Guru machine learning models leverage AWS expertise in running highly available applications for the world’s largest e-commerce business for the past 20 years. DevOps Guru automatically detects operational problems, details the possible causes, and recommends remediation actions. DevOps Guru provides customers with a single console experience to search and visualize operational data by integrating data across multiple sources, including Amazon CloudWatch, AWS Config, AWS CloudTrail, AWS CloudFormation, and AWS X-Ray, and reduces the need to use multiple tools.

Getting Started with DevOps Guru
Activating DevOps Guru is as easy as accessing the AWS Management Console and clicking Enable. When enabling DevOps Guru, you can select the IAM role. You’ll then choose the AWS resources to analyze, which may include all resources in your AWS account or just specified CloudFormation StackSets. Finally, you can set an Amazon SNS topic if you want to send notifications from DevOps Guru via SNS.

DevOps Guru starts to accumulate logs and analyze your environment; it can take up to several hours. Let’s assume we have a simple serverless architecture as shown in this illustration.

When the system has an error, the operator needs to investigate if the error came from Amazon API Gateway, AWS Lambda, or AWS DynamoDB. They must then determine the root cause and how to fix the issue. With DevOps Guru, the process is now easy and simple.

When a developer accesses the management console of DevOps Guru, they will see a list of insights, which are collections of anomalies created during the analysis of the AWS resources configured within your application; in this case, Amazon API Gateway, AWS Lambda, and Amazon DynamoDB. Each insight contains observations, recommendations, and contextual data you can use to better understand and resolve the operational problem.

The list below shows the insight name, the status (closed or ongoing), severity, and when the insight was created. Without checking any logs, you can immediately see that in the most recent issue (line 1), a problem with a Lambda function within your stack was the cause of the issue, and that it was related to duration. If the issue were still occurring, the status would be listed as Ongoing. Since this issue was temporary, the status is showing Closed.

Insights

Let’s look deeper at the most recent anomaly by clicking through the first insight link. There are two tabs: Aggregated metrics and Graphed anomalies.

Aggregated metrics display metrics that are related to the insight. Operators can see which AWS CloudFormation stack created the resource that emitted the metric, the name of the resource, and its type. The red lines on a timeline indicate spans of time when a metric emitted unusual values. In this case, the operator can see the specific time of day on Nov 24 when the anomaly occurred for each metric.

Graphed anomalies display detailed graphs for each of the insight’s anomalies. Operators can investigate and look at an anomaly at the resource level and per statistic. The graphs are grouped by metric name.

Screenshot: anomaly metrics

By reviewing aggregated and graphed anomalies, an operator can see when the issue occurred, whether it is still ongoing, and which resources were impacted. It appears the increased Lambda duration had a corresponding impact on API Gateway, causing timeouts and resulting in 5XX errors in API Gateway.

DevOps Guru also provides relevant events, which relate to activities that changed your application’s configuration, as illustrated below.

Screenshot: relevant events timeline

We can now see that a configuration change happened 2 hours before this issue occurred. If we click the point on the graph at 20:30 on 11/24, we can learn more and see the details of that change.

If you click through to the Ops event, the AWS CloudTrail logs would show that the configuration change was twofold: 1) a change to the provisioned concurrency on a Lambda function, and 2) a reduction in the integration timeout on an API Gateway integration.

Screenshot: recommendations to fix the issue

The recommendations tell the operator to evaluate the provisioned concurrency for Lambda and how to troubleshoot errors in API Gateway. After further evaluation, the operator will discover this is exactly correct. The root cause is a mismatch between the Lambda provisioned concurrency setting and the API Gateway integration latency timeout. When the Lambda configuration was updated in the last deployment, it altered how this application responded to burst traffic, and it no longer fit within the API Gateway timeout window. This error is unlikely to have been found in unit testing and will occur repeatedly if the configurations are not updated.

DevOps Guru can send alerts of anomalies to operators via Amazon SNS, and it is integrated with AWS Systems Manager OpsCenter, enabling customers to receive insights directly within OpsCenter and quickly diagnose and remediate issues.

Available for Preview Today
Amazon DevOps Guru is available for preview in US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo). To learn more about DevOps Guru, please visit our web site and technical documentation, and get started today.

– Kame

Does AI in Healthcare Need More Emotion?

Post Syndicated from Megan Scudellari original https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/does-healthcare-ai-need-more-emotion


For the last 25 years, researchers have sought to teach computers to measure, understand, and react to human emotion. Technologies developed in this wave of emotion AI—alternately called affective computing or artificial emotional intelligence—have been applied in a variety of ways: capturing consumer reactions to advertisements; measuring student engagement in class; and detecting customer moods over the phone at call centers, among others.

Emotion AI has been less widely applied in healthcare, as can be seen in a recent literature review. In a meticulous analysis of 156 papers on AI in pregnancy health, a team in Spain found only two papers in which emotions were used as inputs. Their review, published in the journal IEEE Access, concluded that expanded use of affective computing could help improve health outcomes for pregnant women and their infants.

“There is a lot of evidence that stress, anxiety and negative feelings can make [outcomes] worse for pregnant women,” says study co-author Andreea Oprescu, a PhD student at the University of Seville. An app or wearable that takes these feelings into account could better help detect and monitor certain conditions, she notes.

Deep Learning Has Reinvented Quality Control in Manufacturing—but It Hasn’t Gone Far Enough

Post Syndicated from Anatoli Gorchet original https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/deep-learning-has-reinvented-quality-control-in-manufacturingbut-it-hasnt-gone-far-enough

This is a guest post. The views expressed here are solely those of the author and do not represent positions of IEEE Spectrum or the IEEE.

In 2020, we’ve seen the accelerated adoption of deep learning as a part of the so-called Industry 4.0 revolution, in which digitization is remaking the manufacturing industry. This latest wave of initiatives is marked by the introduction of smart and autonomous systems, fueled by data and deep learning—a powerful breed of artificial intelligence (AI) that can improve quality inspection on the factory floor.

The benefit? By pairing smart cameras with software on the production line, manufacturers are seeing improved quality inspection at high speeds and low costs that human inspectors can’t match. And given the mandated restrictions on human labor as a result of COVID-19, such as social distancing on the factory floor, these benefits are even more critical to keeping production lines running.

While manufacturers have used machine vision for decades, deep learning-enabled quality control software represents a new frontier. So, how do these approaches differ from traditional machine vision systems? And what happens when you press the “RUN” button for one of these AI-powered quality control systems?

Before and After the Introduction of Deep Learning in Manufacturing

To understand what happens in a deep learning software package that’s running quality control, let’s take a look at the previous standard. The traditional machine vision approach to quality control relies on a simple but powerful two-step process:

Step 1: An expert decides which features (such as edges, curves, corners, color patches, etc.) in the images collected by each camera are important for a given problem.

Step 2: The expert creates a hand-tuned rule-based system, with several branching points; for example, how much “yellow” and “curvature” classify an object as a “ripe banana” in a packaging line. That system then automatically decides if the product is what it’s supposed to be.
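A toy sketch of such a hand-tuned rule system, using the banana example above, might look like the following; the feature names and thresholds are invented purely for illustration.

def classify(features):
    # Toy, hand-tuned rule-based classifier for the "ripe banana" example.
    # A real system would compute these features from camera images.
    yellow = features["yellow_fraction"]   # share of pixels in the yellow hue range
    curvature = features["curvature"]      # curvature estimated from the object contour

    # Branching points tuned by an expert.
    if yellow > 0.70 and 0.15 < curvature < 0.45:
        return "ripe banana"
    if yellow <= 0.40:
        return "unripe or wrong item"
    return "send for manual inspection"

print(classify({"yellow_fraction": 0.82, "curvature": 0.30}))  # -> ripe banana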

This method was simple and effective enough. But manufacturers’ needs for quality control have rapidly evolved over the years, pushing demand to the next level. There aren’t enough human experts to support manufacturers’ increased appetite for automation. And while traditional machine vision works well in some cases, it is often ineffective in situations where the difference between good and bad products is hard to detect. Take bottle caps, for example: there are many variations depending on the beverage, and if one has even the slightest defect, you run the risk of having the whole drink spill out during the manufacturing process.

The new breed of deep learning-powered software for quality inspections is based on a key feature: learning from the data. Unlike their older machine vision cousins, these models learn which features are important by themselves, rather than relying on the experts’ rules. In the process of this learning, they create their own implicit rules that determine the combinations of features that define quality products. No human expert is required, and the burden is shifted to the machine itself! Users simply collect the data and use it to train the deep learning model—there’s no need to manually configure a machine vision model for every production scenario. 

Using a Conventional Deep Learning Model for Quality Control

Data is the key to deep learning’s effectiveness. Systems such as deep neural networks (DNNs) are trained in a supervised fashion to recognize specific classes of things. In a typical inspection task, a DNN might be trained to visually recognize a certain number of classes, say pictures of good or bad ventilator valves. Assuming it was fed a good amount of quality data, the DNN will come up with precise, low-error, confident classifications.
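As a rough illustration of this conventional supervised approach, a minimal sketch in Keras might look like the following; the directory layout, image size, and network architecture are assumptions for illustration only, not a production-ready inspection model.

import tensorflow as tf

# Assumed directory layout: valves/train/good and valves/train/defective.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "valves/train", image_size=(128, 128), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # good vs. defective
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)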

Let’s look at the example of spotting good and bad ventilator valves. As long as the valve stays the same, all manufacturers have to do is hit the “RUN” button and inspection of the production line can begin. But if the line switches to a new type of valve, the data collection, training, and deployment must be performed anew.

For conventional deep learning to be successful, the data used for training must be “balanced.” A balanced data set has as many images of good valves as it has images of defective valves, including every possible type of imperfection. While collecting the images of good valves is easy, modern day manufacturing has very low defect rates. This situation makes collecting defective images time consuming, especially when you need to collect hundreds of images of each type of defect. To make things more complex, it’s entirely possible that a new type of defect will pop up after the system is trained and deployed—which would require that the system be taken down, retrained, and redeployed. With wildly fluctuating consumer demands for products brought on by the pandemic, manufacturers risk being crippled by this production downtime.

A Different Kind of “RUN” Button

There may yet be a lesson to be learned from the traditional machine vision process for quality control that we described earlier. Its two-step process had an advantage: The product features change much more slowly than the rules. This setup meshes well with the realities of manufacturing, as the features of ventilator valves persist across different production types, but new rules must be introduced with each new defect discovered.

Conventionally, a deep learning model has to be retrained every time a new rule must be included. And to do that retraining, the new defect must be represented by the same number of images as all the previous defects. And all the images must be put together in a database to retrain the system, so that it learns all the old rules plus the new one.

To solve this conundrum, a different category of DNNs is gaining traction. These new DNNs learn rules in a much more flexible way, to the point that new rules can be learned without even stopping the operating system and taking it off the floor.

These so-called continual or lifelong learning systems, and in particular lifelong deep neural networks (L-DNN), were inspired by brain neurophysiology. These deep learning algorithms separate feature training and rule training and are able to add new rule information on the fly.

While they still learn features slowly using a large and balanced data set, L-DNNs don’t learn rules at this stage. And they don’t need images of all known valve defects—the dataset can be relatively generic as long as the objects possess similar features (such as curves, edges, surface properties). With L-DNNs, this part of model creation can be done once, and without the help of the manufacturers.

What our hypothetical valve manufacturer needs to know is this: After the first step of feature learning is completed, they need only provide a small set of images of good valves for the system to learn a set of rules that define a good valve. There’s no need to provide any images of defective valves. L-DNNs will learn on a single presentation of a small dataset using only “good” data (in other words, data about good ventilator valves), and then advise the user when an atypical product is encountered. This method is akin to the process humans use to spot differences in objects they encounter every day—an effortless task for us, but a very hard one for deep learning models until L-DNN systems came along.

Rather than needing thousands of varied images, L-DNNs only require a handful of images to train and build a prototypical understanding of the object. The system can be deployed in seconds, and the handful of images can even be collected after the L-DNN has been deployed and the “RUN” button has been pressed, as long as an operator ensures none of these images actually shows a product with defects. Changes to the rules that define a prototypical object can also be made in real time, to keep up with any changes in the production line.
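Neurala’s L-DNN itself is proprietary, but the general idea of building a prototype from a handful of “good” images on top of a frozen feature extractor can be sketched as follows. The choice of extractor, image size, and threshold are illustrative assumptions, not the company’s method.

import numpy as np
import tensorflow as tf

# Frozen, pre-trained feature extractor (stands in for the slowly learned features).
extractor = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg", input_shape=(128, 128, 3))

def embed(images):
    images = tf.keras.applications.mobilenet_v2.preprocess_input(images)
    return extractor.predict(images, verbose=0)

# "Train" from a handful of good-valve images only (random placeholder data here).
good_images = np.random.rand(20, 128, 128, 3).astype("float32") * 255
good_embeddings = embed(good_images)
prototype = good_embeddings.mean(axis=0)
threshold = np.percentile(np.linalg.norm(good_embeddings - prototype, axis=1), 99)

def is_atypical(image):
    # Flag anything whose features sit far from the "good" prototype.
    distance = np.linalg.norm(embed(image[None]) - prototype)
    return distance > threshold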

In today’s manufacturing environment, machines are able to produce extremely variable products at rates that can easily surpass 60 items per minute. New items are constantly introduced, and previously unseen defects show up on the line. Traditional machine vision could not tackle this task—there are too many specialized features and thresholds for each product.

When pressing the “RUN” button on quality control software that’s powered by L-DNN systems, machine operators can bring down the cost and time of optimizing quality inspection, giving the manufacturing industry a fighting chance of keeping up with the pace of innovation. Today, global manufacturers such as IMA Group and Antares Vision have already begun implementing such technologies to help with quality control, and I expect that we’ll see many others begin to follow suit in order to stay competitive on the global stage. 

About the Author: Anatoli Gorchet is the CTO and co-founder of vision AI company Neurala. With over 20 years of experience developing massively parallel software for neural computation, he is a pioneer in applying general-purpose computing on graphics processing units to neural modeling. Anatoli holds several patents, has authored over 30 publications on neural networks, and advises Fortune 500 companies on how to use AI to improve operational efficiencies.

How Facebook’s AI Tools Tackle Misinformation

Post Syndicated from Tekla S. Perry original https://spectrum.ieee.org/view-from-the-valley/artificial-intelligence/machine-learning/how-facebooks-ai-tools-tackle-misinformation

Facebook today released its quarterly Community Standards Enforcement Report, in which it reports actions taken to remove content that violates its policies, along with how much of this content was identified and removed before users brought it to Facebook’s attention. That second category relies heavily on automated systems developed through machine learning.

In recent years, these AI tools have been focused on hate speech. According to Facebook CTO Mike Schroepfer, the company’s automated systems identified and removed three times as many posts containing hate speech in the third quarter of 2020 as in the third quarter of 2019. Part of the credit for that improvement, he indicated, goes to a new machine learning approach that uses live, online data instead of just offline data sets to continuously improve. The technology, tagged RIO, for Reinforced Integrity Optimizer, looks at a number tracking the overall prevalence of hate speech on the platform, and tunes its algorithms to try to push that number down.

“The idea of moving from a handcrafted off-line system to an online system is a pretty big deal,” Schroepfer said. “I think that technology is going to be interesting for us over the next few years.”

During 2020, Facebook’s policies toward misinformation became increasingly tight, though many would say not tight enough. The company in April announced that it would be directly warning users exposed to COVID-19 misinformation. In September it announced expanded efforts to remove content that would suppress voting and a plan to label claims of election victory before the results were final. In October it restricted the spread of a questionable story about Hunter Biden. And throughout the year it applied increasingly explicit tags on content identified as misinformation, including a screen that blocks access to the post until the user clicks on it. Guy Rosen, Facebook vice president of integrity, reported that only five percent of users take that extra step.

That’s the policy. Enforcing that policy takes both manpower and technology, Schroepfer pointed out in a Zoom call with journalists on Wednesday. At this point, he indicated, AI isn’t used to determine if the content of an original post falls into the categories of misinformation that violates its standards—that is a job for human fact-checkers. But after a fact-checker identifies a problem post, the company’s similarity matching system hunts down permutations of that post and removes those automatically.
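Facebook hasn’t published this pipeline’s code, but the core of similarity matching can be sketched in a few lines: embed the flagged post and candidate posts with any image or text encoder, then flag near-duplicates by cosine similarity. The embedding dimensionality and threshold below are illustrative assumptions, and an upstream embedding model is assumed.

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_variants(flagged_embedding, candidate_embeddings, threshold=0.9):
    # Return indices of candidate posts that look like variants of the flagged one.
    return [
        i for i, emb in enumerate(candidate_embeddings)
        if cosine_similarity(flagged_embedding, emb) >= threshold
    ]

# Toy usage with random placeholder embeddings.
rng = np.random.default_rng(0)
flagged = rng.normal(size=256)
candidates = [flagged + rng.normal(scale=0.05, size=256), rng.normal(size=256)]
print(find_variants(flagged, candidates))  # -> [0]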

Facebook wants to automatically catch a post, says Schroepfer, even if  “someone blurs a photo or crops it… but we don’t want to take something down incorrectly.”

“Subtle changes to the text—a no, or not, or just kidding—can completely change the meaning,” he said. “We rely on third-party fact checkers to identify it, then we use AI to find the flavors and variants.”

The company reported in a blog post that a new tool, SimSearchNet++, is helping this effort. Developed through self-supervised learning, it looks for variations of an image, adding optical character recognition when text is involved.

As an example, Schroepfer pointed to two posts about face masks identified as misinformation (above).

Thanks to these efforts, Rosen indicated, Facebook directly removed 12 million posts with dangerous COVID misinformation between March and October, and put warnings on 167 million more COVID related posts debunked by fact checkers. It took action on similar numbers of posts related to the U.S. election, he reported.

Schroepfer also reported that Facebook has deployed weapons to fight deepfakes. Thanks to the company’s Deepfake Detection Challenge, launched in 2019, Facebook does have a deepfake detector in operation. “Luckily,” he said, “that hasn’t been a top problem” to date.

“We are not done in terms of where we want to be,” he said. “But we are nowhere near out of ideas for how to improve the capability of our systems.”

Majority of Alexa Now Running on Faster, More Cost-Effective Amazon EC2 Inf1 Instances

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/majority-of-alexa-now-running-on-faster-more-cost-effective-amazon-ec2-inf1-instances/

Today, we are announcing that the Amazon Alexa team has migrated the vast majority of their GPU-based machine learning inference workloads to Amazon Elastic Compute Cloud (EC2) Inf1 instances, powered by AWS Inferentia. This resulted in 25% lower end-to-end latency, and 30% lower cost compared to GPU-based instances for Alexa’s text-to-speech workloads. The lower latency allows Alexa engineers to innovate with more complex algorithms and to improve the overall Alexa experience for our customers.

AWS built AWS Inferentia chips from the ground up to provide the lowest-cost machine learning (ML) inference in the cloud. They power the Inf1 instances that we launched at AWS re:Invent 2019. Inf1 instances provide up to 30% higher throughput and up to 45% lower cost per inference compared to GPU-based G4 instances, which were, before Inf1, the lowest-cost instances in the cloud for ML inference.

Alexa is Amazon’s cloud-based voice service that powers Amazon Echo devices and more than 140,000 models of smart speakers, lights, plugs, smart TVs, and cameras. Today customers have connected more than 100 million devices to Alexa. And every month, tens of millions of customers interact with Alexa to control their home devices (“Alexa, increase temperature in living room,” “Alexa, turn off bedroom”), to listen to radio and music (“Alexa, start Maxi 80 on bathroom,” “Alexa, play Van Halen from Spotify”), to be informed (“Alexa, what is the news?” “Alexa, is it going to rain today?”), or to be educated or entertained with 100,000+ Alexa Skills.

If you ask Alexa where she lives, she’ll tell you she is right here, but her head is in the cloud. Indeed, Alexa’s brain is deployed on AWS, where she benefits from the same agility, large-scale infrastructure, and global network we built for our customers.

How Alexa Works
When I’m in my living room and ask Alexa about the weather, I trigger a complex system. First, the on-device chip detects the wake word (Alexa). Once detected, the microphones record what I’m saying and stream the sound for analysis in the cloud. At a high level, there are two phases to analyze the sound of my voice. First, Alexa converts the sound to text. This is known as Automatic Speech Recognition (ASR). Once the text is known, the second phase is to understand what I mean. This is Natural Language Understanding (NLU). The output of NLU is an Intent (what does the customer want) and associated parameters. In this example (“Alexa, what’s the weather today ?”), the intent might be “GetWeatherForecast” and the parameter can be my postcode, inferred from my profile.

This whole process uses Artificial Intelligence heavily to transform the sound of my voice to phonemes, phonemes to words, words to phrases, phrases to intents. Based on the NLU output, Alexa routes the intent to a service to fulfill it. The service might be internal to Alexa or external, like one of the skills activated on my Alexa account. The fulfillment service processes the intent and returns a response as a JSON document. The document contains the text of the response Alexa must say.
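Amazon hasn’t published Alexa’s internals, but the hand-off from NLU to fulfillment is easy to sketch: an intent name plus parameters is routed to a handler that returns a JSON-style response containing the text Alexa should speak. The intent name below matches the example in the text; the handler itself and the parameter values are hypothetical.

def get_weather_forecast(params):
    # Hypothetical fulfillment logic; a real service would call a weather API.
    return {"response_text": f"The weather today in {params['postcode']} will be partly cloudy."}

INTENT_HANDLERS = {
    "GetWeatherForecast": get_weather_forecast,
}

def fulfill(nlu_output):
    handler = INTENT_HANDLERS[nlu_output["intent"]]
    return handler(nlu_output["parameters"])

nlu_output = {"intent": "GetWeatherForecast", "parameters": {"postcode": "1000"}}
print(fulfill(nlu_output))  # {'response_text': 'The weather today in 1000 will be partly cloudy.'}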

The last step of the process is to generate the voice of Alexa from the text. This is known as Text-To-Speech (TTS). As soon as the TTS starts to produce sound data, it is streamed back to my Amazon Echo device: “The weather today will be partly cloudy with highs of 16 degrees and lows of 8 degrees.” (I live in Europe, these are Celsius degrees 🙂 ). This Text-To-Speech process also heavily involves machine learning models to build a phrase that sounds natural in terms of pronunciations, rhythm, connection between words, intonation etc.

Alexa is one of the most popular hyperscale machine learning services in the world, with billions of inference requests every week. Of Alexa’s three main inference workloads (ASR, NLU, and TTS), TTS workloads initially ran on GPU-based instances. But the Alexa team decided to move to the Inf1 instances as fast as possible to improve the customer experience and reduce the service compute cost.

What is AWS Inferentia?
AWS Inferentia is a custom chip, built by AWS, to accelerate machine learning inference workloads and optimize their cost. Each AWS Inferentia chip contains four NeuronCores. Each NeuronCore implements a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps cut down on external memory accesses, dramatically reducing latency and increasing throughput.

AWS Inferentia can be used natively from popular machine-learning frameworks like TensorFlow, PyTorch, and MXNet, with AWS Neuron. AWS Neuron is a software development kit (SDK) for running machine learning inference using AWS Inferentia chips. It consists of a compiler, run-time, and profiling tools that enable you to run high-performance and low latency inference.
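To give a sense of what targeting Inferentia looks like from a framework, the following is a hedged sketch patterned on the PyTorch Neuron workflow: compile (trace) a model once, save it, then load and run it on an Inf1 instance. The model choice, input shape, and file name are placeholder assumptions.

import torch
import torch_neuron  # AWS Neuron extension for PyTorch; registers torch.neuron
from torchvision import models

# Placeholder model and input shape; substitute your own inference model.
model = models.resnet50(pretrained=True).eval()
example_input = torch.zeros(1, 3, 224, 224)

# Compile the model for AWS Inferentia NeuronCores.
neuron_model = torch.neuron.trace(model, example_inputs=[example_input])
neuron_model.save("resnet50_neuron.pt")

# On an Inf1 instance, load and run the compiled model like a normal TorchScript module.
loaded = torch.jit.load("resnet50_neuron.pt")
output = loaded(example_input)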

Who Else is Using Amazon EC2 Inf1?
In addition to Alexa, Amazon Rekognition is also adopting AWS Inferentia. Running models such as object classification on Inf1 instances resulted in 8x lower latency and doubled throughput compared to running these models on GPU instances.

Customers, from Fortune 500 companies to startups, are using Inf1 instances for machine learning inference. For example, Snap Inc. incorporates machine learning (ML) into many aspects of Snapchat, and exploring innovation in this field is a key priority for them. Once they heard about AWS Inferentia, they collaborated with AWS to adopt Inf1 instances to help with ML deployment, including its performance and cost. They started with inference for their recommendation models, and are now looking forward to deploying more models on Inf1 instances in the future.

Conde Nast, one of the world’s leading media companies, saw a 72% reduction in cost of inference compared to GPU-based instances for its recommendation engine. And Anthem, one of the leading healthcare companies in the US, observed 2x higher throughput compared to GPU-based instances for its customer sentiment machine learning workload.

How to Get Started with Amazon EC2 Inf1
You can start using Inf1 instances today.

If you prefer to manage your own machine learning application development platforms, you can get started by either launching Inf1 instances with AWS Deep Learning AMIs, which include the Neuron SDK, or you can use Inf1 instances via Amazon Elastic Kubernetes Service or Amazon ECS for containerized machine learning applications. To learn more about running containers on Inf1 instances, read this blog to get started on ECS and this blog to get started on EKS.

The easiest and quickest way to get started with Inf1 instances is via Amazon SageMaker, a fully managed service that enables developers to build, train, and deploy machine learning models quickly.

Get started with Inf1 on Amazon SageMaker today.

— seb

PS: The team just released this video, check it out!

AI Recognizes COVID-19 in the Sound of a Cough

Post Syndicated from Megan Scudellari original https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/ai-recognizes-covid-19-in-the-sound-of-a-cough


Again and again, experts have pleaded that we need more and faster testing to control the coronavirus pandemic—and many have suggested that artificial intelligence (AI) can help. Numerous COVID-19 diagnostics in development use AI to quickly analyze X-ray or CT scans, but these techniques require a chest scan at a medical facility.

Since the spring, research teams have been working toward anytime, anywhere apps that could detect coronavirus in the bark of a cough. In June, a team at the University of Oklahoma showed it was possible to distinguish a COVID-19 cough from coughs due to other infections, and now a paper out of MIT, using the largest cough dataset yet, identifies asymptomatic people with a remarkable 100 percent detection rate.

If approved by the FDA and other regulators, COVID-19 cough apps, in which a person records themselves coughing on command, could eventually be used for free, large-scale screening of the population.

Understanding Causality Is the Next Challenge for Machine Learning

Post Syndicated from Payal Dhar original https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/understanding-causality-is-the-next-challenge-for-machine-learning

“Causality is very important for the next steps of progress of machine learning,” said Yoshua Bengio, a Turing Award-winning scientist known for his work in deep learning, in an interview with IEEE Spectrum in 2019. So far, deep learning has consisted largely of learning from static datasets, which makes AI very good at tasks involving correlations and associations. However, neural nets do not interpret cause and effect, or why these associations and correlations exist. Nor are they particularly good at tasks that involve imagination, reasoning, and planning. This, in turn, limits AI systems’ ability to generalize their learning and transfer their skills to other related environments.

The lack of generalization is a big problem, says Ossama Ahmed, a master’s student at ETH Zurich who has worked with Bengio’s team to develop a robotic benchmarking tool for causality and transfer learning. “Robots are [often] trained in simulation, and then when you try to deploy [them] in the real world…they usually fail to transfer their learned skills. One of the reasons is that the physical properties of the simulation are quite different from the real world,” says Ahmed. The group’s tool, called CausalWorld, demonstrates that with some of the methods currently available, the generalization capabilities of robots aren’t good enough—at least not to the extent that “we can deploy [them] safely in any arbitrary situation in the real world,” says Ahmed.

The paper on CausalWorld, available as a preprint, describes benchmarks in a simulated robotics manipulation environment using the open-source TriFinger robotics platform. The main purpose of CausalWorld is to accelerate research in causal structure and transfer learning using this simulated environment, where learned skills could potentially be transferred to the real world. Robotic agents can be given tasks that comprise pushing, stacking, placing, and so on, informed by how children have been observed to play with blocks and learn to build complex structures. There is a large set of parameters, such as weight, shape, and appearance of the blocks and the robot itself, on which the user can intervene at any point to evaluate the robot’s generalization capabilities.
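A rough sketch of interacting with such a benchmark is shown below. The import paths, task id, and intervention dictionary are assumptions patterned on CausalWorld’s public examples and may differ between versions of the package.

import numpy as np
from causal_world.envs import CausalWorld
from causal_world.task_generators import generate_task

# Create a block-pushing task and wrap it in the environment.
task = generate_task(task_generator_id="pushing")
env = CausalWorld(task=task)

obs = env.reset()
for _ in range(50):
    obs, reward, done, info = env.step(env.action_space.sample())

# Intervene on an environment variable (here, the stage color) to probe how well
# a trained agent generalizes to the changed conditions.
env.do_intervention({"stage_color": np.random.uniform(0, 1, [3])})
obs = env.reset()
env.close()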

In their study, the researchers gave the robots a number of tasks ranging from simple to extremely challenging, based on three different curricula. The first involved no environment changes; the second had changes to a single variable; and the third allowed full randomization of all variables in the environment. They observed that as the curricula got more complex, the agents showed less ability to transfer their skills to the new conditions.

“If we continue scaling up training and network architectures beyond the experiments we report, current methods could potentially solve more of the block stacking environments we propose with CausalWorld,” points out Frederik Träuble, one of the contributors to the study. Träuble adds that “What’s actually interesting is that we humans can generalize much, much quicker [and] we don’t need such a vast amount of experience… We can learn from the underlying shared rules of [certain] environments…[and] use this to generalize better to yet other environments that we haven’t seen.”

A standard neural network, on the other hand, would require insane amounts of experience with myriad environments in order to do the same. “Having a model architecture or method that can learn these underlying rules or causal mechanisms, and utilize them could [help] overcome these challenges,” Träuble says.

CausalWorld’s evaluation protocols, say Ahmed and Träuble, are more versatile than those in previous studies because of the possibility of “disentangling” generalization abilities. In other words, users are free to intervene on a large number of variables in the environment, and thus draw systemic conclusions about what the agent generalizes to—or doesn’t. The next challenge, they say, is to actually use the tools available in CausalWorld to build more generalizable systems.

Despite how dazzled we are by AI’s ability to perform certain tasks, Yoshua Bengio, in 2019, estimated that present-day deep learning is less intelligent than a two-year-old child. Though the ability of neural networks to parallel-process on a large scale has given us breakthroughs in computer vision, translation, and memory, research is now shifting to developing novel deep architectures and training frameworks for addressing tasks like reasoning, planning, capturing causality, and obtaining systematic generalization. “I believe it’s just the beginning of a different style of brain-inspired computation,” Bengio said, adding, “I think we have a lot of the tools to get started.”

Improving customer experience and reducing cost with CodeGuru Profiler

Post Syndicated from Rajesh original https://aws.amazon.com/blogs/devops/improving-customer-experience-and-reducing-cost-with-codeguru-profiler/

Amazon CodeGuru is a set of developer tools powered by machine learning that provides intelligent recommendations for improving code quality and identifying an application’s most expensive lines of code. Amazon CodeGuru Profiler allows you to profile your applications in a low-impact, always-on manner. It helps you improve your application’s performance, reduce cost, and diagnose application issues through rich data visualization and proactive recommendations. CodeGuru Profiler was a very successful and widely used service within Amazon before it was offered as a public service. This post discusses a few ways in which internal Amazon teams have used and benefited from continuous profiling of their production applications. These use cases can provide you with better insight into how to reap similar benefits for your applications using CodeGuru Profiler.
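As a quick orientation before the internal examples, getting an application profiled takes very little code. For a Python application, starting the agent looks roughly like the following hedged sketch based on the codeguru_profiler_agent package; the profiling group name is a placeholder and the group must already exist in your account.

# pip install codeguru_profiler_agent
from codeguru_profiler_agent import Profiler

# Placeholder profiling group name; create the group in the CodeGuru Profiler console first.
Profiler(profiling_group_name="MyApplication-Profiling").start()

# ... run your application as usual; the agent samples stack traces in the background ...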

Inside Amazon, over 100,000 applications currently use CodeGuru Profiler across various environments globally. Over the last few years, CodeGuru Profiler has served as an indispensable tool for resolving issues in the following three categories:

  1. Performance bottlenecks, high latency and CPU utilization
  2. Cost and Infrastructure utilization
  3. Diagnosis of an application impacting event

API latency improvement for CodeGuru Profiler

What could be a better example than CodeGuru Profiler using itself to improve its own performance?
CodeGuru Profiler offers an API called BatchGetFrameMetricData, which allows you to fetch time series data for a set of frames or methods. We noticed that the 99th percentile latency (i.e., the slowest 1 percent of requests over a 5-minute period) metric for this API was approximately 5 seconds, higher than we wanted for our customers.

Solution

CodeGuru Profiler is built on a microservice architecture, with the BatchGetFrameMetricData API implemented as a set of AWS Lambda functions. It also leverages other AWS services such as Amazon DynamoDB to store data and Amazon CloudWatch to record performance metrics.

When investigating the latency issue, the team found that the 5-second latency spikes were happening during certain time intervals rather than continuously, which made it difficult to reproduce and determine the root cause of the issue in a pre-production environment. The new Lambda profiling feature in CodeGuru came in handy, so the team decided to enable profiling for all its Lambda functions. The low-impact, continuous profiling capability of CodeGuru Profiler allowed the team to capture comprehensive profiles over a period of time, including when the latency spikes occurred, enabling the team to better understand the issue.
After capturing the profiles, the team went through the flame graphs of one of the Lambda functions (TimeSeriesMetricsGeneratorLambda) and learned that all of its CPU time was spent by the thread responsible for publishing metrics to CloudWatch. The following screenshot shows a flame graph during one of these spikes.

TimeSeriesMetricsGeneratorLambda taking 100% CPU

As seen, there is a single call stack visible in the above flame graph, indicating that all the CPU time was taken by the thread invoking that code path. This helped the team immediately understand what was happening: the code path belonged to the thread responsible for publishing the CloudWatch metrics. This thread was publishing the metrics in a synchronized block, and because it took most of the CPU, it caused all other threads to wait and the latency to spike. To fix the issue, the team simply changed the TimeSeriesMetricsGeneratorLambda code to publish CloudWatch metrics at the end of the function, which eliminated this thread’s contention with all other threads.
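To make the pattern concrete, here is a minimal Python sketch of the same idea (the actual service code is not public): buffer metric data while the handler does its work, then publish to CloudWatch once at the end of the function instead of from a shared, lock-protected publisher on the hot path. The event shape and the process_frame helper are hypothetical.

import boto3

cloudwatch = boto3.client("cloudwatch")

def process_frame(frame):
    # Hypothetical stand-in for the real per-frame work.
    return {"latency_ms": 12.0}

def handler(event, context):
    metric_buffer = []  # collect metric data locally during processing

    for frame in event.get("frames", []):  # hypothetical input shape
        result = process_frame(frame)
        metric_buffer.append({
            "MetricName": "FrameProcessed",
            "Value": result["latency_ms"],
            "Unit": "Milliseconds",
        })

    # Publish once, at the end of the function, so metric publication no longer
    # contends with the threads doing the actual work.
    if metric_buffer:
        cloudwatch.put_metric_data(
            Namespace="TimeSeriesMetricsGenerator",  # placeholder namespace
            MetricData=metric_buffer,
        )
    return {"processed": len(metric_buffer)}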

Improvement

After the fix was deployed, the 5-second latency spikes were gone, as seen in the following graph.

Latency reduction for BatchGetFrameMetricData API

Cost, infrastructure and other improvements for CAGE

CAGE is an internal Amazon retail service that does royalty aggregation for digital products, such as Kindle eBooks, MP3 songs and albums, and more. Like many other Amazon services, CAGE is also a customer of CodeGuru Profiler.

CAGE was experiencing increased latency and growing infrastructure cost, and wanted to reduce both. Thanks to CodeGuru Profiler’s always-on profiling capabilities, rich visualizations, and recommendations, the team was able to successfully diagnose the issues, determine the root causes, and fix them.

Solution

With the help of CodeGuru Profiler, the CAGE team identified several reasons for their degraded service performance and increased hardware utilization:

  • Excessive garbage collection activity – The team reviewed the service flame graphs (see the following screenshot) and identified that a lot of CPU time was spent on garbage collection activities: 65.07% of the total service CPU.

Excessive garbage collection activities for CAGE

  • Metadata overhead – The team followed a CodeGuru Profiler recommendation and identified that handling the service’s DynamoDB response metadata was consuming CPU: 2.86% of total CPU time. This was due to response metadata caching in the AWS SDK v1.x HTTP client, which is turned on by default and causes higher CPU overhead for high-throughput applications such as CAGE. The following screenshot shows the relevant recommendation.

Response metadata recommendation for CAGE

  • Excessive logging – The team also identified excessive logging of its internal Amazon ION structures. The team initially added this logging for debugging purposes, but was unaware of its CPU cost: it consumed 2.28% of the overall service CPU. The following screenshot is part of the flame graph that helped identify the logging impact.

Excessive logging in CAGE service

The team used these flame graphs and the recommendations provided by CodeGuru Profiler to determine the root cause of the issues and systematically resolve them by doing the following:

  • Switching to a more efficient garbage collector
  • Removing excessive logging
  • Disabling metadata caching for DynamoDB responses

Improvements

After making these changes, the team was able to reduce their infrastructure cost by 25%, saving close to $2600 per month. Service latency also improved, with a reduction in the service’s 99th percentile latency from approximately 2,500 milliseconds to 250 milliseconds in their North America (NA) region, as shown below.

CAGE Latency Reduction

The team also realized a side benefit of having reduced log verbosity and saw a reduction in log size by 55%.

Event Analysis of increased checkout latency for Amazon.com

During one of the high-traffic periods, Amazon retail customers experienced higher than normal latency on their checkout page. The issue was due to a downstream service’s API experiencing high latency and CPU utilization. While the team quickly mitigated the issue by adding more servers to the service, the always-on CodeGuru Profiler came to the rescue to help diagnose and fix the issue permanently.

Solution

The team analyzed the flame graphs from CodeGuru Profiler at the time of the event and noticed excessive CPU consumption (69.47%) when logging exceptions using Log4j2. See the following screenshot, taken from an earlier version of the CodeGuru Profiler user interface.

Excessive CPU consumption when logging exceptions using Log4j2

With the CodeGuru Profiler flame graph and other metrics, the team quickly confirmed that the issue was due to excessive exception logging using Log4j2. This downstream service had recently upgraded to Log4j2 version 2.8, in which exception logging could be expensive due to the way Log4j2 handles class loading of certain stack frames. Log4j 2.x versions enable this class loading by default, whereas it was disabled in 1.x versions, causing the increased latency and CPU utilization. The team was not able to detect this issue in a pre-production environment, as the impact was observable only in high-traffic situations.

Improvement

After they understood the issue, the team successfully rolled out a fix that removed the unnecessary exception trace logging. Such performance issues, and many others, are surfaced as CodeGuru Profiler recommendations, so you can proactively learn about them in your applications and quickly resolve them.

Conclusion

I hope this post provided a glimpse into various ways CodeGuru Profiler can benefit your business and applications. To get started using CodeGuru Profiler, see Setting up CodeGuru Profiler.
For more information about CodeGuru Profiler, see the following:

Investigating performance issues with Amazon CodeGuru Profiler

Optimizing application performance with Amazon CodeGuru Profiler

Find Your Application’s Most Expensive Lines of Code and Improve Code Quality with Amazon CodeGuru

Use AI To Convert Ancient Maps Into Satellite-Like Images

Post Syndicated from Michelle Hampson original https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/ai-ancient-maps-satellite-images

Ancient maps give us a glimpse of how landscapes looked centuries ago. But what would we see if we looked at these older maps with a modern lens?

Henrique Andrade is a student at Escola Politécnica da Universidade de Pernambuco who has been studying maps of his hometown, Recife, Brazil, for several years now. “I gathered all these digital copies of maps, and I ended up discovering things about my hometown that aren’t so widely known,” he says. “I feel that in Recife people were denied access to their own past, which makes it difficult for them to understand who they are, and consequently what they can do about their own future.”

Andrade approached a professor at his university, Bruno Fernandes, with an idea: to develop a machine learning algorithm that could transform old maps into Google satellite images. Such an approach, he believes, could inform people of how land use has changed over time, including the social and economic impacts of urbanization.

To see the project realized, they used an existing AI tool called Pix2pix, which relies on two neural networks. The first one creates images based on the input set, while the second decides whether the generated image is fake or not. The networks are then trained to fool each other, and ultimately create realistic-looking images based on the historical data provided.
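Pix2pix itself is open source; the sketch below is not the authors’ code, just a heavily simplified PyTorch illustration of the adversarial setup described above, with tiny networks and random tensors standing in for paired map and satellite images.

import torch
from torch import nn

# Generator: map -> fake satellite image. Discriminator: (map, image) -> real/fake score.
G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
D = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

old_map = torch.rand(1, 3, 64, 64)         # placeholder for a scanned historical map
real_satellite = torch.rand(1, 3, 64, 64)  # placeholder for the paired modern image

# Discriminator step: learn to tell real pairs from generated pairs.
fake = G(old_map).detach()
d_loss = (bce(D(torch.cat([old_map, real_satellite], 1)), torch.ones(1, 1)) +
          bce(D(torch.cat([old_map, fake], 1)), torch.zeros(1, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to fool the discriminator into scoring the fake as real.
g_loss = bce(D(torch.cat([old_map, G(old_map)], 1)), torch.ones(1, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()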

Andrade and Fernandes describe their approach in a study published 24 September 2020 in IEEE Geoscience and Remote Sensing Letters. In this study, they took a map of Recife from 1808 and generated modern day images of the area.

“When you look at the images, you get a better grasp of how the city has changed in 200 years,” explains Andrade. “The city’s geography has drastically changed—landfills have reduced the water bodies and green areas were all removed by human activity.”

He says an advantage of this AI approach is that it requires relatively little input volume; however, the input requires some historical context, and the resolution of the generated images is lower than what the researchers would like.

“Moving forward, we are working on improving the resolution of the images, and experimenting on different inputs,” says Andrade. He sees this approach to generate modern images of the past as widely applicable, noting that it could be applied to various locations and could be used by urban planners, anthropologists, and historians.