All posts by Patrick Gryczka

Image background removal using Amazon SageMaker semantic segmentation

Post Syndicated from Patrick Gryczka original https://aws.amazon.com/blogs/architecture/image-background-removal-using-amazon-sagemaker-semantic-segmentation/

Many individuals are creating their own ecommerce and online stores in order to sell their products and services. This simplifies and speeds the process of getting products out to your selected markets. This is a critical key indicator for the success of your business.

Artificial Intelligence/Machine Learning (AI/ML) and automation can offer you an improved and seamless process for image manipulation. You can take a picture identifying your products. You can then remove the background in order to publish high quality and clean product images. These images can be added to your online stores for consumers to view and purchase. This automated process will drastically decrease the manual effort required, though some manual quality review will be necessary. It will increase your time-to-market (TTM) and quickly get your products out to customers.

This blog post explains how you can automate the removal of image backgrounds by combining semantic segmentation inferences using Amazon SageMaker JumpStart. You can automate image processing using AWS Lambda. We will walk you through how you can set up an Amazon SageMaker JumpStart semantic segmentation inference endpoint using curated training data.

Amazon SageMaker JumpStart solution overview

Solution architecture for automatically processing new images and outputting isolated labels identified through semantic segmentation.

Figure 1. Architecture for automatically processing new images and outputting isolated labels identified through semantic segmentation.

The example architecture in Figure 1 shows a serverless architecture that uses SageMaker to perform semantic segmentation on images. Image processing takes place within a Lambda function, which extracts the identified (product) content from the background content in the image.

In this event driven architecture, Amazon Simple Storage Service (Amazon S3) invokes a Lambda function each time a new product image lands in the Uploaded Image Bucket. That Lambda function calls out to a semantic segmentation endpoint in Amazon SageMaker. The function then receives a segmentation mask that identifies the pixels that are part of the segment we are identifying. Then, the Lambda function processes the image to isolate the identified segment from the rest of the image, outputting the result to our Processed Image Bucket.

Semantic segmentation model

The semantic segmentation algorithm provides a fine-grained, pixel-level approach to developing computer vision applications. It tags every pixel in an image with a class label from a predefined set of classes. Because the semantic segmentation algorithm classifies every pixel in an image, it also provides information about the shapes of the objects contained in the image. The segmentation output is represented as a grayscale image, called a segmentation mask. A segmentation mask is a grayscale image with the same shape as the input image.

You can use the segmentation mask and replace the pixels that correspond to the class that is identified with the pixels from the original image. You can use the Python library PIL to do pixel manipulation on the image. The following images show how the image in Figure 1 will result in the image shown in Figure 2, when passed through semantic segmentation. When you use the Figure 2 mask and replace it with pixels from Figure 1, the end result is the image from Figure 3. Due to minor quality issues of the final image, you will need to do manual cleanup after automation.

Car image with background

Figure 2. Car image with background

Car mask image

Figure 3. Car mask image

Final image, background removed

Figure 4. Final image, background removed

SageMaker JumpStart streamlines the deployment of the prebuilt model on SageMaker, which supports the semantic segmentation algorithm. You can test this using the sample Jupyter notebook available at Extract Image using Semantic Segmentation, which demonstrates how to extract an individual form from the surrounding background.

Learn more about SageMaker JumpStart

SageMaker JumpStart is a quick way to learn about SageMaker features and capabilities through curated one-step solutions, example notebooks, and deployable pre-trained models. You can also fine-tune the models and then deploy them. You can access JumpStart using Amazon SageMaker Studio or programmatically through the SageMaker APIs.

SageMaker JumpStart provides lot of different semantic segmentation models that are pre-trained with class of objects it can identify. These models are fine-tuned for a sample dataset. You can tune the model with your dataset to get an effective mask for the class of object you want to retrieve from the image. When you fine-tune a model, you can use the default dataset or choose your own data, which is located in an Amazon S3 bucket. You can customize the hyperparameters of the training job that are used to fine-tune the model.

When the fine-tuning process is complete, JumpStart provides information about the model: parent model, training job name, training job Amazon Resource Name (ARN), training time, and output path. We retrieve the deploy_image_uri, deploy_source_uri, and base_model_uri for the pre-trained model. You can host the pre-trained base-model by creating an instance of sagemaker.model.Model and deploying it.

Conclusion

In this blog, we review the steps to use Amazon SageMaker JumpStart and AWS Lambda for automation and processing of images. It uses pre-trained machine learning models and inference. The solution ingests the product images, identifies your products, and then removes the image background. After some review and QA, you can then publish your products to your ecommerce store or other medium.

Further resources:

How fEMR Delivers Cryptographically Secure and Verifiable Medical Data with Amazon QLDB

Post Syndicated from Patrick Gryczka original https://aws.amazon.com/blogs/architecture/how-femr-delivers-cryptographically-secure-and-verifiable-emr-medical-data-with-amazon-qldb/

This post was co-written by Team fEMR’s President & Co-founder, Sarah Draugelis; CTO, Andy Mastie; Core Team Engineer & Fennel Labs Co-founder, Sean Batzel; Patrick Gryczka, AWS Solutions Architect; Mithil Prasad, AWS Senior Customer Solutions Manager.

Team fEMR is a non-profit organization that created a free, open-source Electronic Medical Records system for transient medical teams.  Their system has helped bring aid and drive data driven communication in disaster relief scenarios and low resource settings around the world since 2014.  In the past year, Team fEMR integrated Amazon Quantum Ledger Database (QLDB) as their HIPAA compliant database solution to address their need for data integrity and verifiability.

When delivering aid to at risk communities and within challenging social and political environments, data veracity is a critical issue.  Patients need to trust that their data is confidential and following an appropriate chain of ownership.  Aid organizations meanwhile need to trust the demographic data provided to them to appropriately understand the scope of disasters and verify the usage of funds. Amazon QLDB is backed by an immutable append-only journal.  This journal is verifiable, making it easier for Team fEMR to engage in external audits and deliver a trusted and verifiable EMR solution. The teams that use the new fEMR system are typically working or volunteering in post-disaster environments, refugee camps, or in other mobile clinics that offer pre-hospital care.

In this blog post, we explore how Team fEMR leveraged Amazon QLDB and other AWS Managed Services to enable their relief efforts.

Background

Before the use of an electronic record keeping system, these teams often had to use paper records, which would easily be lost or destroyed. The new fEMR system allows the clinician to look up the patient’s history of illness, in order to provide the patient with a seamless continuity of care between clinic visits. Additionally, the collection of health data on a more macroscopic level allows researchers and data scientists to monitor for disease. This is- an especially important aspect of mobile medicine in a pandemic world.

The fEMR system has been deployed worldwide since 2014. In the original design, the system functioned solely in an on-premises environment. Clinicians were able to attach their own devices to the system, and have access to the EMR functionality and medical records. While the need for a standalone solution continues to exist for environments without data coverage and connectivity, demand for fEMR has increased rapidly and outpaced deployment capabilities as well as hardware availability. To solve for real-time deployment and scalability needs, Team fEMR migrated fEMR to the cloud and developed a HIPAA-compliant and secure architecture using Amazon QLDB.

As part of their cloud adoption strategy, Team fEMR decided to procure more managed services, to automate operational tasks and to optimize their resources.

Architecture showing how How fEMR Delivers Cryptographically Secure and Verifiable EMR Medical Data with Amazon QLDB

Figure 1 – Architecture showing How fEMR Delivers Cryptographically Secure and Verifiable EMR Medical Data with Amazon QLDB

The team built the preceding architecture using a combination of the following AWS managed services:

1. Amazon Quantum Ledger Database (QLDB) – By using QLDB, Team fEMR were able to build an unfalsifiable, HIPAA compliant, and cryptographically verifiable record of medical histories, as well as for clinic pharmacy usage and stocking.

2. AWS Elastic Beanstalk – Team fEMR uses Elastic Beanstalk to deploy and run their Django front end and application logic.  It allows their agile development team to focus on development and feature delivery, by offloading the operational work of deployment, patching, and scaling.

3. Amazon Relational Database Service (RDS) – Team fEMR uses RDS for ad-hoc search, reporting, and analytics, whereas QLDB is not optimized for the specific requirement.

4. Amazon ElastiCache – Team fEMR uses Amazon ElastiCache to cache user session data to provide near real-time response times.

Data Security considerations

Data Security was a key consideration when building the fEMR EHR solution. End users of the solution are often working in environments with at-risk populations. That is, patients who may be fleeing persecution from their home country, or may be at risk due to discriminatory treatment. It is therefore imperative to secure their data. QLDB as a data storage mechanism provides a cryptographically secure history of all changes to data. This has the benefit of improved visibility and auditability and is invaluable in situations where medical data needs to be as tamper-evident as possible.

Using Auto Scaling to minimize operational effort

When Team fEMR engages with disaster relief, they deal with ambiguity around both when events may occur and at what scale their effects may be felt.  By leveraging managed services like QLDB, RDS, and Elastic Beanstalk, Team fEMR was able to minimize the time their technical team spent on systems operations.  Instead, they can focus optimizing and improving their technology architectures.

Use of Infrastructure as Code to enable fast global deployment

With Infrastructure as Code, Team fEMR was able to create repeatable deployments. They utilized AWS CloudFormation to deploy their Elasticache, RDS, QLDB, and Elastic Beanstalk environment.  Elastic Beanstalk was used to further automate the deployment of infrastructure for their Django stack. Repeatability of deployment enables the team to have the flexibility they need to deploy in certain regions due to geographic and data sovereignty requirements.

Optimizing the architecture

The team found that simultaneous writing into two databases could cause inconsistencies if a write succeeds in one database and fails in the other. In addition, it puts the burden of identifying errors and rolling back updates on the application. Therefore, an improvement planned for this architecture is to stream successful transactions from Amazon QLDB to Amazon RDS using Amazon Kinesis Data Streams. This service provides them a way to replicate data from Amazon QLDB to Amazon RDS, and any other databases or dashboards. QLDB remains as as their System of Record.

Figure 2 - Optimized Architecture using Amazon Kinesis Data Streams

Figure 2 – Optimized Architecture using Amazon Kinesis Data Streams

Conclusion

As a result of migrating their system to the cloud, Team fEMR was able to deliver their EMR system with less operational overhead and instead focus on bringing their solution to the communities that need it.  By using Amazon QLDB, Team fEMR was able to make their solution easier to audit and enabled more trust in their work with at-risk populations.

To learn more about Team fEMR, you can read about their efforts on their organization’s website, and explore their Open Source contributions on GitHub.

For hands on experience with Amazon QLDB you can reference our QLDB Workshops and explore our QLDB Developer Guide.