Since our last System and Organization Control (SOC) audit, our service and compliance teams have been working to increase the number of AWS Services in scope prioritized based on customer requests. Today, we’re happy to report 11 services are newly SOC compliant, which is a 21 percent increase in the last six months.
Our latest SOC 1, 2, and 3 reports covering the period from October 1, 2017 to March 31, 2018 are now available. The SOC 1 and 2 reports are available on-demand through AWS Artifact by logging into the AWS Management Console. The SOC 3 report can be downloaded here.
Finally, prospective customers can read our SOC 1 and 2 reports by reaching out to AWS Compliance.
Want more AWS Security news? Follow us on Twitter.
This blog post was co-authored by Ujjwal Ratan, a senior AI/ML solutions architect on the global life sciences team.
Healthcare data is generated at an ever-increasing rate and is predicted to reach 35 zettabytes by 2020. Being able to cost-effectively and securely manage this data whether for patient care, research or legal reasons is increasingly important for healthcare providers.
Healthcare providers must have the ability to ingest, store and protect large volumes of data including clinical, genomic, device, financial, supply chain, and claims. AWS is well-suited to this data deluge with a wide variety of ingestion, storage and security services (e.g. AWS Direct Connect, Amazon Kinesis Streams, Amazon S3, Amazon Macie) for customers to handle their healthcare data. In a recent Healthcare IT News article, healthcare thought-leader, John Halamka, noted, “I predict that five years from now none of us will have datacenters. We’re going to go out to the cloud to find EHRs, clinical decision support, analytics.”
I realize simply storing this data is challenging enough. Magnifying the problem is the fact that healthcare data is increasingly attractive to cyber attackers, making security a top priority. According to Mariya Yao in her Forbes column, it is estimated that individual medical records can be worth hundreds or even thousands of dollars on the black market.
In this first of a 2-part post, I will address the value that AWS can bring to customers for ingesting, storing and protecting provider’s healthcare data. I will describe key components of any cloud-based healthcare workload and the services AWS provides to meet these requirements. In part 2 of this post we will dive deep into the AWS services used for advanced analytics, artificial intelligence and machine learning.
The data tsunami is upon us
So where is this data coming from? In addition to the ubiquitous electronic health record (EHR), the sources of this data include:
devices such as MRIs, x-rays and ultrasounds
sensors and wearables for patients
medical equipment telemetry
Additional sources of data come from non-clinical, operational systems such as:
claims and billing
Data from these sources can be structured (e.g., claims data) as well as unstructured (e.g., clinician notes). Some data comes across in streams such as that taken from patient monitors, while some comes in batch form. Still other data comes in near-real time such as HL7 messages. All of this data has retention policies dictating how long it must be stored. Much of this data is stored in perpetuity as many systems in use today have no purge mechanism. AWS has services to manage all these data types as well as their retention, security and access policies.
Imaging is a significant contributor to this data tsunami. Increasing demand for early-stage diagnoses along with aging populations drive increasing demand for images from CT, PET, MRI, ultrasound, digital pathology, X-ray and fluoroscopy. For example, a thin-slice CT image can be hundreds of megabytes. Increasing demand and strict retention policies make storage costly.
Due to the plummeting cost of gene sequencing, molecular diagnostics (including liquid biopsy) is a large contributor to this data deluge. Many predict that as the value of molecular testing becomes more identifiable, the reimbursement models will change and it will increasingly become the standard of care. According to the Washington Post article “Sequencing the Genome Creates so Much Data We Don’t Know What to do with It,”
“Some researchers predict that up to one billion people will have their genome sequenced by 2025 generating up to 40 exabytes of data per year.”
Although genomics is primarily used for oncology diagnostics today, it’s also used for other purposes, pharmacogenomics — used to understand how an individual will metabolize a medication.
It is increasingly challenging for the typical hospital, clinic or physician practice to securely store, process and manage this data without cloud adoption.
Amazon has a variety of ingestion techniques depending on the nature of the data including size, frequency and structure. AWS Snowball and AWS Snowmachine are appropriate for extremely-large, secure data transfers whether one time or episodic. AWS Glue is a fully-managed ETL service for securely moving data from on-premise to AWS and Amazon Kinesis can be used for ingesting streaming data.
Amazon S3, Amazon S3 IA, and Amazon Glacier are economical, data-storage services with a pay-as-you-go pricing model that expand (or shrink) with the customer’s requirements.
The above architecture has four distinct components – ingestion, storage, security, and analytics. In this post I will dive deeper into the first three components, namely ingestion, storage and security. In part 2, I will look at how to use AWS’ analytics services to draw value on, and optimize, your healthcare data.
A typical provider data center will consist of many systems with varied datasets. AWS provides multiple tools and services to effectively and securely connect to these data sources and ingest data in various formats. The customers can choose from a range of services and use them in accordance with the use case.
For use cases involving one-time (or periodic), very large data migrations into AWS, customers can take advantage of AWS Snowball devices. These devices come in two sizes, 50 TB and 80 TB and can be combined together to create a petabyte scale data transfer solution.
The devices are easy to connect and load and they are shipped to AWS avoiding the network bottlenecks associated with such large-scale data migrations. The devices are extremely secure supporting 256-bit encryption and come in a tamper-resistant enclosure. AWS Snowball imports data in Amazon S3 which can then interface with other AWS compute services to process that data in a scalable manner.
For use cases involving a need to store a portion of datasets on premises for active use and offload the rest on AWS, the Amazon storage gateway service can be used. The service allows you to seamlessly integrate on premises applications via standard storage protocols like iSCSI or NFS mounted on a gateway appliance. It supports a file interface, a volume interface and a tape interface which can be utilized for a range of use cases like disaster recovery, backup and archiving, cloud bursting, storage tiering and migration.
By using the AWS proposed reference architecture for disaster recovery, healthcare providers can ensure their data assets are securely stored on the cloud and are easily accessible in the event of a disaster. The “AWS Disaster Recovery” whitepaper includes details on options available to customers based on their desired recovery time objective (RTO) and recovery point objective (RPO).
AWS is an ideal destination for offloading large volumes of less-frequently-accessed data. These datasets are rarely used in active compute operations but are exceedingly important to retain for reasons like compliance. By storing these datasets on AWS, customers can take advantage of the highly-durable platform to securely store their data and also retrieve them easily when they need to. For more details on how AWS enables customers to run back and archival use cases on AWS, please refer to the following set of whitepapers.
A healthcare provider may have a variety of databases spread throughout the hospital system supporting critical applications such as EHR, PACS, finance and many more. These datasets often need to be aggregated to derive information and calculate metrics to optimize business processes. AWS Glue is a fully-managed Extract, Transform and Load (ETL) service that can read data from a JDBC-enabled, on-premise database and transfer the datasets into AWS services like Amazon S3, Amazon Redshift and Amazon RDS. This allows customers to create transformation workflows that integrate smaller datasets from multiple sources and aggregates them on AWS.
Healthcare providers deal with a variety of streaming datasets which often have to be analyzed in near real time. These datasets come from a variety of sources such as sensors, messaging buses and social media, and often do not adhere to an industry standard. The Amazon Kinesis suite of services, that includes Amazon Kinesis Streams, Amazon Kinesis Firehose, and Amazon Kinesis Analytics, are the ideal set of services to accomplish the task of deriving value from streaming data.
Example: Using AWS Glue to de-identify and ingest healthcare data into S3 Let’s consider a scenario in which a provider maintains patient records in a database they want to ingest into S3. The provider also wants to de-identify the data by stripping personally- identifiable attributes and store the non-identifiable information in an S3 bucket. This bucket is different from the one that contains identifiable information. Doing this allows the healthcare provider to separate sensitive information with more restrictions set up via S3 bucket policies.
To ingest records into S3, we create a Glue job that reads from the source database using a Glue connection. The connection is also used by a Glue crawler to populate the Glue data catalog with the schema of the source database. We will use the Glue development endpoint and a zeppelin notebook server on EC2 to develop and execute the job.
Step 1: Import the necessary libraries and also set a glue context which is a wrapper on the spark context:
Step 2: Create a dataframe from the source data. I call the dataframe “readmissionsdata”. Here is what the schema would look like:
Step 3: Now select the columns that contains indentifiable information and store it in a new dataframe. Call the new dataframe “phi”.
Step 4: Non-PHI columns are stored in a separate dataframe. Call this dataframe “nonphi”.
Step 5: Write the two dataframes into two separate S3 buckets
Once successfully executed, the PHI and non-PHI attributes are stored in two separate files in two separate buckets that can be individually maintained.
In 2016, 327 healthcare providers reported a protected health information (PHI) breach, affecting 16.4m patient records. There have been 342 data breaches reported in 2017 — involving 3.2 million patient records.
To date, AWS has released 51 HIPAA-eligible services to help customers address security challenges and is in the process of making many more services HIPAA-eligible. These HIPAA-eligible services (along with all other AWS services) help customers build solutions that comply with HIPAA security and auditing requirements. A catalogue of HIPAA-enabled services can be found at AWS HIPAA-eligible services. It is important to note that AWS manages physical and logical access controls for the AWS boundary. However, the overall security of your workloads is a shared responsibility, where you are responsible for controlling user access to content on your AWS accounts.
AWS storage services allow you to store data efficiently while maintaining high durability and scalability. By using Amazon S3 as the central storage layer, you can take advantage of the Amazon S3 storage management features to get operational metrics on your data sets and transition them between various storage classes to save costs. By tagging objects on Amazon S3, you can build a governance layer on Amazon S3 to grant role based access to objects using Amazon IAM and Amazon S3 bucket policies.
To learn more about the Amazon S3 storage management features, see the following link.
In the example above, we are storing the PHI information in a bucket named “phi.” Now, we want to protect this information to make sure its encrypted, does not have unauthorized access, and all access requests to the data are logged.
Encryption: S3 provides settings to enable default encryption on a bucket. This ensures any object in the bucket is encrypted by default.
Logging: S3 provides object level logging that can be used to capture all API calls to the object. The API calls are logged in cloudtrail for easy access and consolidation. Moreover, it also supports events to proactively alert customers of read and write operations.
Access control: Customers can use S3 bucket policies and IAM policies to restrict access to the phi bucket. It can also put a restriction to enforce multi-factor authentication on the bucket. For example, the following policy enforces multi-factor authentication on the phi bucket:
In Part 1 of this blog, we detailed the ingestion, storage, security and management of healthcare data on AWS. Stay tuned for part two where we are going to dive deep into optimizing the data for analytics and machine learning.
AWS has updated its certifications against ISO 9001, ISO 27001, ISO 27017, and ISO 27018 standards, bringing the total to 67 services now under ISO compliance. We added the following 29 services this cycle:
AWS maintains certifications through extensive audits of its controls to ensure that information security risks that affect the confidentiality, integrity, and availability of company and customer information are appropriately managed.
You can download copies of the AWS ISO certificates that contain AWS’s in-scope services and Regions, and use these certificates to jump-start your own certification efforts:
The Paris Region will benefit from three AWS Direct Connect locations. Telehouse Voltaire is available today. AWS Direct Connect will also become available at Equinix Paris in early 2018, followed by Interxion Paris.
All AWS infrastructure regions around the world are designed, built, and regularly audited to meet the most rigorous compliance standards and to provide high levels of security for all AWS customers. These include ISO 27001, ISO 27017, ISO 27018, SOC 1 (Formerly SAS 70), SOC 2 and SOC 3 Security & Availability, PCI DSS Level 1, and many more. This means customers benefit from all the best practices of AWS policies, architecture, and operational processes built to satisfy the needs of even the most security sensitive customers.
AWS is certified under the EU-US Privacy Shield, and the AWS Data Processing Addendum (DPA) is GDPR-ready and available now to all AWS customers to help them prepare for May 25, 2018 when the GDPR becomes enforceable. The current AWS DPA, as well as the AWS GDPR DPA, allows customers to transfer personal data to countries outside the European Economic Area (EEA) in compliance with European Union (EU) data protection laws. AWS also adheres to the Cloud Infrastructure Service Providers in Europe (CISPE) Code of Conduct. The CISPE Code of Conduct helps customers ensure that AWS is using appropriate data protection standards to protect their data, consistent with the GDPR. In addition, AWS offers a wide range of services and features to help customers meet the requirements of the GDPR, including services for access controls, monitoring, logging, and encryption.
From Our Customers Many AWS customers are preparing to use this new Region. Here’s a small sample:
Societe Generale, one of the largest banks in France and the world, has accelerated their digital transformation while working with AWS. They developed SG Research, an application that makes reports from Societe Generale’s analysts available to corporate customers in order to improve the decision-making process for investments. The new AWS Region will reduce latency between applications running in the cloud and in their French data centers.
SNCF is the national railway company of France. Their mobile app, powered by AWS, delivers real-time traffic information to 14 million riders. Extreme weather, traffic events, holidays, and engineering works can cause usage to peak at hundreds of thousands of users per second. They are planning to use machine learning and big data to add predictive features to the app.
Radio France, the French public radio broadcaster, offers seven national networks, and uses AWS to accelerate its innovation and stay competitive.
Les Restos du Coeur, a French charity that provides assistance to the needy, delivering food packages and participating in their social and economic integration back into French society. Les Restos du Coeur is using AWS for its CRM system to track the assistance given to each of their beneficiaries and the impact this is having on their lives.
AlloResto by JustEat (a leader in the French FoodTech industry), is using AWS to to scale during traffic peaks and to accelerate their innovation process.
AWS Consulting and Technology Partners We are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in France. Here’s a partial list:
AWS in France We have been investing in Europe, with a focus on France, for the last 11 years. We have also been developing documentation and training programs to help our customers to improve their skills and to accelerate their journey to the AWS Cloud.
As part of our commitment to AWS customers in France, we plan to train more than 25,000 people in the coming years, helping them develop highly sought after cloud skills. They will have access to AWS training resources in France via AWS Academy, AWSome days, AWS Educate, and webinars, all delivered in French by AWS Technical Trainers and AWS Certified Trainers.
Use it Today The EU (Paris) Region is open for business now and you can start using it today!
Our Health Customer Stories page lists just a few of the many customers that are building and running healthcare and life sciences applications that run on AWS. Customers like Verge Health, Care Cloud, and Orion Health trust AWS with Protected Health Information (PHI) and Personally Identifying Information (PII) as part of their efforts to comply with HIPAA and HITECH.
Sixteen More Services In my last HIPAA Eligibility Update I shared the news that we added eight additional services to our list of HIPAA eligible services. Today I am happy to let you know that we have added another sixteen services to the list, bringing the total up to 46. Here are the newest additions, along with some short descriptions and links to some of my blog posts to jog your memory:
Last week we launched our 15th AWS Region and today we are launching our 16th. We have expanded the AWS footprint into the United Kingdom with a new Region in London, our third in Europe. AWS customers can use the new London Region to better serve end-users in the United Kingdom and can also use it to store data in the UK.
From Our Customers Many AWS customers are getting ready to use this new Region. Here’s a very small sample:
Trainline is Europe’s number one independent rail ticket retailer. Every day more than 100,000 people travel using tickets bought from Trainline. Here’s what Mark Holt (CTO of Trainline) shared with us:
We recently completed the migration of 100 percent of our eCommerce infrastructure to AWS and have seen awesome results: improved security, 60 percent less downtime, significant cost savings and incredible improvements in agility. From extensive testing, we know that 0.3s of latency is worth more than 8 million pounds and so, while AWS connectivity is already blazingly fast, we expect that serving our UK customers from UK datacenters should lead to significant top-line benefits.
Kainos Evolve Electronic Medical Records (EMR) automates the creation, capture and handling of medical case notes and operational documents and records, allowing healthcare providers to deliver better patient safety and quality of care for several leading NHS Foundation Trusts and market leading healthcare technology companies.
Travis Perkins, the largest supplier of building materials in the UK, is implementing the biggest systems and business change in its history including the migration of its datacenters to AWS.
Just Eat is the world’s leading marketplace for online food delivery. Using AWS, JustEat has been able to experiment faster and reduce the time to roll out new feature updates.
OakNorth, a new bank focused on lending between £1m-£20m to entrepreneurs and growth businesses, became the UK’s first cloud-based bank in May after several months of working with AWS to drive the development forward with the regulator.
Partners I’m happy to report that we are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in the United Kingdom. Here’s a partial list:
AWS Premier Consulting Partners – Accenture, Claranet, Cloudreach, CSC, Datapipe, KCOM, Rackspace, and Slalom.
AWS Consulting Partners – Attenda, Contino, Deloitte, KPMG, LayerV, Lemongrass, Perfect Image, and Version 1.
AWS Managed Service Partners – Claranet, Cloudreach, KCOM, and Rackspace.
AWS Direct Connect Partners – AT&T, BT, Hutchison Global Communications, Level 3, Redcentric, and Vodafone.
Here are a few examples of what our partners are working on:
KCOM is a professional services provider offering consultancy, architecture, project delivery and managed service capabilities to large UK-based enterprise businesses. The scalability and flexibility of AWS gives them a significant competitive advantage with their enterprise and public sector customers. The new Region will allow KCOM to build innovative solutions for their public sector clients while meeting local regulatory requirements.
Splunk is a member of the AWS Partner Network and a market leader in analyzing machine data to deliver operational intelligence for security, IT, and the business. They use cloud computing and big data analytics to help their customers to embrace digital transformation and continuous innovation. The new Region will provide even more companies with real-time visibility into the operation of their systems and infrastructure.
Redcentric is a NHS Digital-approved N3 Commercial Aggregator. Their work allows health and care providers such as NHS acute, emergency and mental trusts, clinical commissioning groups (CCGs), and the ISV community to connect securely to AWS. The London Region will allow health and care providers to deliver new digital services and to improve outcomes for citizens and patients.
Compliance & Connectivity Every AWS Region is designed and built to meet rigorous compliance standards including ISO 27001, ISO 9001, ISO 27017, ISO 27018, SOC 1, SOC 2, SOC3, PCI DSS Level 1, and many more. Our Cloud Compliance page includes information about these standards, along with those that are specific to the UK, including Cyber Essentials Plus.
The UK Government recognizes that local datacenters from hyper scale public cloud providers can deliver secure solutions for OFFICIAL workloads. In order to meet the special security needs of public sector organizations in the UK with respect to OFFICIAL workloads, we have worked with our Direct Connect Partners to make sure that obligations for connectivity to the Public Services Network (PSN) and N3 can be met.
Use it Today The London Region is open for business now and you can start using it today! If you need additional information about this Region, please feel free to contact our UK team at [email protected].
Moving large amounts of on-premises data to the cloud as part of a migration effort is still more challenging than it should be! Even with high-end connections, moving petabytes or exabytes of film vaults, financial records, satellite imagery, or scientific data across the Internet can take years or decades. On the business side, adding new networking or better connectivity to data centers that are scheduled to be decommissioned after a migration is expensive and hard to justify.
However, customers with exabyte-scale on-premises storage look at the 80 TB, do the math, and realize that an all-out data migration would still require lots of devices and some headache-inducing logistics.
Introducing AWS Snowmobile In order to meet the needs of these customers, we are launching Snowmobile today. This secure data truck stores up to 100 PB of data and can help you to move exabytes to AWS in a matter of weeks (you can get more than one if necessary). Designed to meet the needs of our customers in the financial services, media & entertainment, scientific, and other industries, Snowmobile attaches to your network and appears as a local, NFS-mounted volume. You can use your existing backup and archiving tools to fill it up with data destined for Amazon Simple Storage Service (S3) or Amazon Glacier.
Physically, Snowmobile is a ruggedized, tamper-resistant shipping container 45 feet long, 9.6 feet high, and 8 feet wide. It is water-proof, climate-controlled, and can be parked in a covered or uncovered area adjacent to your existing data center. Each Snowmobile consumes about 350 KW of AC power; if you don’t have sufficient capacity on site we can arrange for a generator.
On the security side, Snowmobile incorporates multiple layers of logical and physical protection including chain-of-custody tracking and video surveillance. Your data is encrypted with your AWS Key Management Service (KMS) keys before it is written. Each container includes GPS tracking, with cellular or satellite connectivity back to AWS. We will arrange for a security vehicle escort when the Snowmobile is in transit; we can also arrange for dedicated security guards while your Snowmobile is on-premises.
Each Snowmobile includes a network cable connected to a high-speed switch capable of supporting 1 Tb/second of data transfer spread across multiple 40 Gb/second connections. Assuming that your existing network can transfer data at that rate, you can fill a Snowmobile in about 10 days.
Snowmobile in Action I don’t happen to have an exabyte-scale data center and I certainly don’t have room next to my house for a 45 foot long container. In order to illustrate the process of arranging for and using a Snowmobile, I sat down at my LEGO table and (in the finest Doc Brown tradition) built a scale model. I hope that you enjoy this brick-based story telling!
Let’s start in your data center. It was built a while ago and is definitely showing its age. The racks are full of disk and tape drives of multiple vintages, each storing precious, mission-critical data. You and your colleagues spend too much time inside of the raised floor, tracking cables and trying to squeeze out just a bit more performance:
Your manager is getting frustrated and does not know what to do next:
Fortunately, one of your colleagues reads this blog every day and she knows just what to do:
A quick phone call to AWS and a meeting is set up:
Everyone gets together at a convenient AWS office to learn more about Snowmobile and to plan the migration:
Everyone gathers around to look at the scale model of the Snowmobile. Even the dog is intrigued, and your manager takes a picture:
A Snowmobile shows up at your data center:
AWS Professional Services helps you to get it connected and you initiate the data transfer:
The Snowmobile heads back to AWS and your data is imported as you specified!
Snowmobile at DigitalGlobe Our friends at DigitalGlobe are using a Snowmobile to move 100 PB of satellite imagery to AWS. Here’s what Jay Littlepage (former Amazonian and now VP of Infrastructure & Operations at DigitalGlobe) has to say about this effort:
Like many large enterprises, we are in the process of migrating IT operations from our data centers to AWS. Our geospatial big data platform, GBDX, has been based in AWS since inception. But our unmatchable 16-year archive of high-resolution satellite imagery, visualizing 6 billion square kilometers of the Earth’s surface, has been stored within our facilities. We have slowly been migrating our archive to AWS but that process has been slow and inefficient. Our constellation of satellites generate more earth imagery each year (10 PB) than we have been able to migrate by these methods.
We needed a solution that could move our 100 PB archive but could not find one until now with AWS Snowmobile. DigitalGlobe is currently migrating our entire raw imagery archive with one Snowmobile transfer directly into an Amazon Glacier Vault. AWS Snowmobile operators are providing an amazing customized service where they manage the configuration, monitoring, and logistics. Using Snowmobile’s data transfer abilities will get our time-lapse imagery archive to the cloud more quickly, allowing our customers and partners to have access to uniquely massive data sets. By using AWS’ elastic computing platform within GBDX, we will run distributed image analysis, revealing the pace and pattern of world-wide change on an extraordinary scale, with unprecedented speed, in a more cost-effective manner – prioritizing insights over infrastructure. Without Snowmobile, we would not have been able to transfer our extremely large volume of data in such a short time or create new business opportunities for our customers. Snowmobile is truly a game changer!
Things to Know Here are a couple of final things you should know about Snowmobile:
Data Export – The initial launch is aimed at data import (on-premises to AWS). We do know that some of our customers are interested in data export, with a particular focus on disaster recovery (DR) use cases.
Availability – Snowmobile is available in all AWS Regions. As you can see from reading the previous section, this is not a self-serve product. My AWS Sales colleagues are ready to discuss your data import needs with you.
Pricing – I don’t have pricing info to share. However, we intend to make sure that Snowmobile is both faster and less expensive than using a network-based data transfer model.
PPS – I will build and personally deliver (in exchange for a photo op and a bloggable story) Snowmobile models to the first 5 customers.
The collective thoughts of the interwebz
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.