Tag Archives: hospital

Ransomware Update: Viruses Targeting Business IT Servers

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/ransomware-update-viruses-targeting-business-it-servers/

Ransomware warning message on computer

As ransomware attacks have grown in number in recent months, the tactics and attack vectors also have evolved. While the primary method of attack used to be to target individual computer users within organizations with phishing emails and infected attachments, we’re increasingly seeing attacks that target weaknesses in businesses’ IT infrastructure.

How Ransomware Attacks Typically Work

In our previous posts on ransomware, we described the common vehicles used by hackers to infect organizations with ransomware viruses. Most often, trojan horse downloaders are distributed through malicious downloads and spam emails. The emails contain a variety of file attachments which, if opened, will download and run one of the many ransomware variants. Once a user’s computer is infected with a malicious downloader, it retrieves additional malware, which frequently includes crypto-ransomware. After the files have been encrypted, a ransom payment is demanded of the victim in order to decrypt the files.

What’s Changed With the Latest Ransomware Attacks?

In 2016, a customized ransomware strain called SamSam began attacking servers, primarily at health care institutions. SamSam, unlike more conventional ransomware, is not delivered through downloads or phishing emails. Instead, the attackers behind SamSam use tools to identify unpatched servers running Red Hat’s JBoss enterprise products. Once the attackers have successfully gained entry into one of these servers by exploiting vulnerabilities in JBoss, they use other freely available tools and scripts to collect credentials and gather information on networked computers. Then they deploy their ransomware to encrypt files on these systems before demanding a ransom. Gaining entry to an organization through its IT center rather than its endpoints makes this approach scalable and especially unsettling.

SamSam’s methodology is to scour the Internet searching for accessible and vulnerable JBoss application servers, especially ones used by hospitals. It’s not unlike a burglar rattling doorknobs in a neighborhood to find unlocked homes. When SamSam finds an unlocked home (unpatched server), the software infiltrates the system. It is then free to spread across the company’s network by stealing passwords. As it traverses the network and systems, it encrypts files, preventing access until the victims pay the hackers a ransom, typically between $10,000 and $15,000. The low ransom amount has encouraged some victimized organizations to pay the ransom rather than incur the downtime required to wipe and reinitialize their IT systems.

The success of SamSam is due to its effectiveness rather than its sophistication. SamSam can enter and traverse a network without human intervention. Some organizations are learning too late that securing internet-facing services in their data center from attack is just as important as securing endpoints.

The typical steps in a SamSam ransomware attack are:

  1. Attackers gain access to a vulnerable server: attackers exploit vulnerable software or weak/stolen credentials.
  2. Attack spreads via remote access tools: attackers harvest credentials, create SOCKS proxies to tunnel traffic, and abuse RDP to install SamSam on more computers in the network.
  3. Ransomware payload deployed: attackers run batch scripts to execute the ransomware on compromised machines.
  4. Ransom demand delivered, requiring payment to decrypt files: demand amounts vary from victim to victim, and relatively low ransom amounts appear to be designed to encourage quick payment decisions.

What all the organizations successfully exploited by SamSam have in common is that they were running unpatched servers that left them vulnerable to attack. Some organizations had their endpoints and servers backed up, while others did not. Some of those without backups they could use to recover their systems chose to pay the ransom.

Timeline of SamSam History and Exploits

Since its appearance in 2016, SamSam has been in the news with many successful incursions into healthcare, business, and government institutions.

March 2016
SamSam appears

SamSam campaign targets vulnerable JBoss servers
Attackers home in on healthcare organizations specifically, as they’re more likely to have unpatched JBoss machines.

April 2016
SamSam finds new targets

SamSam begins targeting schools and government.
After initial success targeting healthcare, attackers branch out to other sectors.

April 2017
New tactics include RDP

Attackers shift to targeting organizations with exposed RDP connections, and maintain focus on healthcare.
An attack on Erie County Medical Center costs the hospital $10 million over three months of recovery.
Erie County Medical Center attacked by SamSam ransomware virus

January 2018
Municipalities attacked

• Attack on Municipality of Farmington, NM.
• Attack on Hancock Health.
Hancock Regional Hospital notice following SamSam attack
• Attack on Adams Memorial Hospital
• Attack on Allscripts (Electronic Health Records), which includes 180,000 physicians, 2,500 hospitals, and 7.2 million patients’ health records.

February 2018
Attack volume increases

• Attack on Davidson County, NC.
• Attack on Colorado Department of Transportation.
SamSam virus notification

March 2018
SamSam shuts down Atlanta

• Second attack on Colorado Department of Transportation.
• City of Atlanta suffers a devastating attack by SamSam.
The attack has far-reaching impacts — crippling the court system, keeping residents from paying their water bills, limiting vital communications like sewer infrastructure requests, and pushing the Atlanta Police Department to file paper reports.
Atlanta Ransomware outage alert
• SamSam campaign nets $325,000 in 4 weeks.
Infections spike as attackers launch new campaigns. Healthcare and government organizations are once again the primary targets.

How to Defend Against SamSam and Other Ransomware Attacks

The best way to respond to a ransomware attack is to avoid having one in the first place. Beyond that, making sure your valuable data is backed up and unreachable by ransomware infection will keep your downtime and data loss to a minimum, or eliminate them entirely, if you are ever attacked.

In our previous post, How to Recover From Ransomware, we listed ten ways to protect your organization from ransomware.

  1. Use anti-virus and anti-malware software or other security policies to block known payloads from launching.
  2. Make frequent, comprehensive backups of all important files and isolate them from local and open networks. In a recent survey, 74% of cybersecurity professionals viewed data backup and recovery as by far the most effective way to respond to a successful ransomware attack.
  3. Keep offline backups of data stored in locations inaccessible from any potentially infected computer, such as disconnected external storage drives or the cloud, so that the ransomware cannot reach them.
  4. Install the latest security updates issued by software vendors of your OS and applications. Remember to patch early and patch often to close known vulnerabilities in operating systems, server software, browsers, and web plugins.
  5. Consider deploying security software to protect endpoints, email servers, and network systems from infection.
  6. Exercise cyber hygiene, such as using caution when opening email attachments and links.
  7. Segment your networks to keep critical computers isolated and to prevent the spread of malware in case of attack. Turn off unneeded network shares.
  8. Turn off admin rights for users who don’t require them. Give users the lowest system permissions they need to do their work.
  9. Restrict write permissions on file servers as much as possible.
  10. Educate yourself, your employees, and your family in best practices to keep malware out of your systems. Update everyone on the latest email phishing scams and social engineering aimed at turning victims into abettors.

Please Tell Us About Your Experiences with Ransomware

Have you endured a ransomware attack or have a strategy to avoid becoming a victim? Please tell us of your experiences in the comments.


Raspberry Jam Big Birthday Weekend 2018 roundup

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/big-birthday-weekend-2018-roundup/

A couple of weekends ago, we celebrated our sixth birthday by coordinating more than 100 simultaneous Raspberry Jam events around the world. The Big Birthday Weekend was a huge success: our fantastic community organised Jams in 40 countries, covering six continents!

We sent the Jams special birthday kits to help them celebrate in style, and a video message featuring a thank you from Philip and Eben:

Raspberry Jam Big Birthday Weekend 2018

To celebrate the Raspberry Pi’s sixth birthday, we coordinated Raspberry Jams all over the world to take place over the Raspberry Jam Big Birthday Weekend, 3-4 March 2018. A massive thank you to everyone who ran an event and attended.

The Raspberry Jam photo booth

I put together code for a Pi-powered photo booth which overlaid the Big Birthday Weekend logo onto photos and (optionally) tweeted them. We included an arcade button in the Jam kits so they could build one — and it seemed to be quite popular. Some Jams put great effort into housing their photo booth:



Here are some of my favourite photo booth tweets:

RGVSA on Twitter

PiParty photo booth @RGVSA & @Nerdvana_io #Rjam

Denis Stretton on Twitter

The @SouthendRPIJams #PiParty photo booth

rpijamtokyo on Twitter

PiParty photo booth

Preston Raspberry Jam on Twitter

Preston Raspberry Jam Photobooth #RJam #PiParty

If you want to try out the photo booth software yourself, find the code on GitHub.
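The real photo booth code lives in the GitHub repository linked above. Purely as an illustration of the idea, here is a minimal, hypothetical sketch of a button-triggered booth that stamps a logo onto each capture. It assumes the picamera, Pillow, and gpiozero libraries, a logo file called overlay.png, and an arcade button on GPIO 17; all of these names and pin numbers are placeholder choices for this example, not details of the actual project.

from gpiozero import Button      # arcade button wired to a GPIO pin
from picamera import PiCamera    # Raspberry Pi camera module
from PIL import Image            # Pillow, used to composite the logo

button = Button(17)              # placeholder GPIO pin
camera = PiCamera()

def take_photo():
    # Capture a frame, paste the logo into the bottom-left corner, and save the result.
    camera.capture("photo.jpg")
    photo = Image.open("photo.jpg").convert("RGBA")
    logo = Image.open("overlay.png").convert("RGBA")   # placeholder logo file
    photo_w, photo_h = photo.size
    logo_w, logo_h = logo.size
    photo.paste(logo, (0, photo_h - logo_h), logo)     # use the logo's alpha channel as the mask
    photo.save("photo_with_logo.png")
    # Tweeting the image (for example with a library such as tweepy) is omitted here.

while True:
    button.wait_for_press()      # block until someone hits the arcade button
    take_photo()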

The great Raspberry Jam bake-off

Traditionally, in the UK, people have a cake on their birthday. And we had a few! We saw (and tasted) a great selection of Pi-themed cakes and other baked goods throughout the weekend:






Raspberry Jams everywhere

We always say that every Jam is different, but there’s a common and recognisable theme amongst them. It was great to see so many different venues around the world filling up with like-minded Pi enthusiasts, Raspberry Jam–branded banners, and Raspberry Pi balloons!

Europe

Sergio Martinez on Twitter

Thank you so much to all the attendees of the Ikana Jam in Krakow past Saturday! We shared fun experiences, some of them… also painful 😉 A big thank you to @Raspberry_Pi for these global celebrations! And a big thank you to @hubraum for their hospitality! #PiParty #rjam

NI Raspberry Jam on Twitter

We also had a super successful set of wearables workshops using @adafruit Circuit Playground Express boards and conductive thread at today’s @Raspberry_Pi Jam! Very popular! #PiParty

Suzystar on Twitter

My SenseHAT workshop, going well! @SouthendRPiJams #PiParty

Worksop College Raspberry Jam on Twitter

Learning how to scare the zombies in case of an apocalypse- it worked on our young learners #PiParty @worksopcollege @Raspberry_Pi https://t.co/pntEm57TJl

Africa

Rita on Twitter

Being one of the two places in Kenya where the #PiParty took place, it was an amazing time spending the day with this team and getting to learn and have fun. @TaitaTavetaUni and @Raspberry_Pi thank you for your support. @TTUTechlady @mictecttu ch

GABRIEL ONIFADE on Twitter

@TheMagP1

GABRIEL ONIFADE on Twitter

@GABONIAVERACITY #PiParty Lagos Raspberry Jam 2018 Special International Celebration – 6th Raspberry-Pi Big Birthday! Lagos Nigeria @Raspberry_Pi @ben_nuttall #RJam #RaspberryJam #raspberrypi #physicalcomputing #robotics #edtech #coding #programming #edTechAfrica #veracityhouse https://t.co/V7yLxaYGNx

North America

Heidi Baynes on Twitter

The Riverside Raspberry Jam @Vocademy is underway! #piparty

Brad Derstine on Twitter

The Philly & Pi #PiParty event with @Bresslergroup and @TechGirlzorg was awesome! The Scratch and Pi workshop was amazing! It was overall a great day of fun and tech!!! Thank you everyone who came out!

Houston Raspi on Twitter

Thanks everyone who came out to the @Raspberry_Pi Big Birthday Jam! Special thanks to @PBFerrell @estefanniegg @pcsforme @pandafulmanda @colnels @bquentin3 couldn’t’ve put on this amazing community event without you guys!

Merge Robotics 2706 on Twitter

We are back at @SciTechMuseum for the second day of @OttawaPiJam! Our robot Mergius loves playing catch with the kids! #pijam #piparty #omgrobots

South America

Javier Garzón on Twitter

That’s how we wrapped up the #Raspberry Jam Big Birthday Weekend #Bogota 2018 #PiParty of #RaspberryJamBogota 2018 @Raspberry_Pi. See you on 7 March at #ArduinoDayBogota 2018 and #RaspberryJamBogota 2018

Asia

Fablab UP Cebu on Twitter

Happy 6th birthday, @Raspberry_Pi! Greetings all the way from CEBU,PH! #PiParty #IoTCebu Thanks @CebuXGeeks X Ramos for these awesome pics. #Fablab #UPCebu

福野泰介 on Twitter

Raspberry Pi’s 6th birthday party is kicking off in Tokyo! At the PCN booth we’re running various exhibits and a kids’ IoT hackathon mini hands-on session connecting https://t.co/L6E7KgyNHF with IchigoJam, near Kamata Station in Tokyo https://t.co/yHEuqXHvqe #piparty #pipartytokyo #rjam #opendataday

Ren Camp on Twitter

Happy birthday @Raspberry_Pi! #piparty #iotcebu @coolnumber9 https://t.co/2ESVjfRJ2d

Oceania

Glenunga Raspberry Pi Club on Twitter

PiParty photo booth

Personally, I managed to get to three Jams over the weekend: two run by the same people who put on the first two Jams to ever take place, and also one brand-new one! The Preston Raspberry Jam team, who usually run their event on a Monday evening, wanted to do something extra special for the birthday, so they came up with the idea of putting on a Raspberry Jam Sandwich — on the Friday and Monday around the weekend! This meant I was able to visit them on Friday, then attend the Manchester Raspberry Jam on Saturday, and finally drop by the new Jam at Worksop College on my way home on Sunday.

Ben Nuttall on Twitter

I’m at my first Raspberry Jam #PiParty event of the big birthday weekend! @PrestonRJam has been running for nearly 6 years and is a great place to start the celebrations!

Ben Nuttall on Twitter

Back at @McrRaspJam at @DigInnMMU for #PiParty

Ben Nuttall on Twitter

Great to see mine & @Frans_facts Balloon Pi-Tay popper project in action at @worksopjam #rjam #PiParty https://t.co/GswFm0UuPg

Various members of the Foundation team attended Jams around the UK and US, and James from the Code Club International team visited AmsterJam.

hackerfemo on Twitter

Thanks to everyone who came to our Jam and everyone who helped out. @phoenixtogether thanks for amazing cake & hosting. Ademir you’re so cool. It was awesome to meet Craig Morley from @Raspberry_Pi too. #PiParty

Stuart Fox on Twitter

Great #PiParty today at the @cotswoldjam with bloody delicious cake and lots of raspberry goodness. Great to see @ClareSutcliffe @martinohanlon playing on my new pi powered arcade build:-)

Clare Sutcliffe on Twitter

Happy 6th Birthday @Raspberry_Pi from everyone at the #PiParty at #cotswoldjam in Cheltenham!

Code Club on Twitter

It’s @Raspberry_Pi 6th birthday and we’re celebrating by taking part in @amsterjam__! Happy Birthday Raspberry Pi, we’re so happy to be a part of the family! #PiParty

For more Jammy birthday goodness, check out the PiParty hashtag on Twitter!

The Jam makers!

A lot of preparation went into each Jam, and we really appreciate all the hard work the Jam makers put in to making these events happen, on the Big Birthday Weekend and all year round. Thanks also to all the teams that sent us a group photo:

Lots of the Jams that took place were brand-new events, so we hope to see them continue throughout 2018 and beyond, growing the Raspberry Pi community around the world and giving more people, particularly youths, the opportunity to learn digital making skills.

Philip Colligan on Twitter

So many wonderful people in the @Raspberry_Pi community. Thanks to everyone at #PottonPiAndPints for a great afternoon and for everything you do to help young people learn digital making. #PiParty

Special thanks to ModMyPi for shipping the special Raspberry Jam kits all over the world!

Don’t forget to check out our Jam page to find an event near you! This is also where you can find free resources to help you get a new Jam started, and download free starter projects made especially for Jam activities. These projects are available in English, Français, Français Canadien, Nederlands, Deutsch, Italiano, and 日本語. If you’d like to help us translate more content into these and other languages, please get in touch!

PS Some of the UK Jams were postponed due to heavy snowfall, so you may find there’s a belated sixth-birthday Jam coming up where you live!

S Organ on Twitter

@TheMagP1 Ours was rescheduled until later in the Spring due to the snow but here is Babbage enjoying the snow!


Poor Security at the UK National Health Service

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/02/poor_security_a.html

The Guardian is reporting that “every NHS trust assessed for cyber security vulnerabilities has failed to meet the standard required.”

This is the same NHS that was debilitated by WannaCry.

EDITED TO ADD (2/13): More news.

And don’t think that US hospitals are much better.

What is HAMR and How Does It Enable the High-Capacity Needs of the Future?

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hamr-hard-drives/

HAMR drive illustration

During Q4, Backblaze deployed 100 petabytes worth of Seagate hard drives to our data centers. The newly deployed Seagate 10 and 12 TB drives are doing well and will help us meet our near term storage needs, but we know we’re going to need more drives — with higher capacities. That’s why the success of new hard drive technologies like Heat-Assisted Magnetic Recording (HAMR) from Seagate is very relevant to us here at Backblaze and to the storage industry in general. In today’s guest post we are pleased to have Mark Re, CTO at Seagate, give us an insider’s look behind the hard drive curtain to tell us how Seagate engineers are developing the HAMR technology and making it market ready starting in late 2018.

What is HAMR and How Does It Enable the High-Capacity Needs of the Future?

Guest Blog Post by Mark Re, Seagate Senior Vice President and Chief Technology Officer

Earlier this year Seagate announced plans to make the first hard drives using Heat-Assisted Magnetic Recording, or HAMR, available by the end of 2018 in pilot volumes. Even as today’s market has embraced 10TB+ drives, the need for 20TB+ drives remains imperative in the relative near term. HAMR is the Seagate research team’s next major advance in hard drive technology.

HAMR is a technology that over time will enable a big increase in the amount of data that can be stored on a disk. A small laser is attached to a recording head, designed to heat a tiny spot on the disk where the data will be written. This allows a smaller bit cell to be written as either a 0 or a 1. The smaller bit cell size enables more bits to be crammed into a given surface area — increasing the areal density of data, and increasing drive capacity.

It sounds almost simple, but the science and engineering expertise required to perfect this technology, spanning research, experimentation, lab development, and product development, has been enormous. Below is an overview of the HAMR technology, and you can dig into the details in our technical brief, which provides a point-by-point rundown of several key advances enabling the HAMR design.

As much time and resources as have been committed to developing HAMR, the need for its increased data density is indisputable. Demand for data storage keeps increasing. Businesses’ ability to manage and leverage more capacity is a competitive necessity, and IT spending on capacity continues to increase.

History of Increasing Storage Capacity

For the last 50 years, areal density in the hard disk drive has been growing faster than Moore’s law, which is a very good thing. After all, customers from data centers and cloud service providers to creative professionals and game enthusiasts rarely go shopping looking for a hard drive just like the one they bought two years ago. The demand for storage capacity inevitably grows along with the data, so the technology constantly evolves.

According to the Advanced Storage Technology Consortium, HAMR will be the next significant storage technology innovation to increase the amount of storage in the area available to store data, also called the disk’s “areal density.” We believe this boost in areal density will help fuel hard drive product development and growth through the next decade.

Why do we Need to Develop Higher-Capacity Hard Drives? Can’t Current Technologies do the Job?

Why is HAMR’s increased data density so important?

Data has become critical to all aspects of human life, changing how we’re educated and entertained. It affects and informs the ways we experience each other and interact with businesses and the wider world. IDC research shows the datasphere — all the data generated by the world’s businesses and billions of consumer endpoints — will continue to double in size every two years. IDC forecasts that by 2025 the global datasphere will grow to 163 zettabytes (that is a trillion gigabytes). That’s ten times the 16.1 ZB of data generated in 2016. IDC cites five key trends intensifying the role of data in changing our world: embedded systems and the Internet of Things (IoT), instantly available mobile and real-time data, cognitive artificial intelligence (AI) systems, increased security data requirements, and critically, the evolution of data from playing a background role in business to playing a life-critical role.

Consumers use the cloud to manage everything from family photos and videos to data about their health and exercise routines. Real-time data created by connected devices — everything from Fitbit, Alexa and smart phones to home security systems, solar systems and autonomous cars — are fueling the emerging Data Age. On top of the obvious business and consumer data growth, our critical infrastructure like power grids, water systems, hospitals, road infrastructure and public transportation all demand and add to the growth of real-time data. Data is now a vital element in the smooth operation of all aspects of daily life.

All of this comes with a significant behind-the-scenes infrastructure cost driven by the insatiable global appetite for data storage. While a variety of storage technologies will continue to advance in data density (Seagate announced the first 60TB 3.5-inch SSD unit, for example), high-capacity hard drives serve as the primary foundational core of our interconnected, cloud and IoT-based dependence on data.

HAMR Hard Drive Technology

Seagate has been working on heat assisted magnetic recording (HAMR) in one form or another since the late 1990s. During this time we’ve made many breakthroughs in making reliable near field transducers, special high capacity HAMR media, and figuring out a way to put a laser on each and every head that is no larger than a grain of salt.

The development of HAMR has required Seagate to consider and overcome a myriad of scientific and technical challenges including new kinds of magnetic media, nano-plasmonic device design and fabrication, laser integration, high-temperature head-disk interactions, and thermal regulation.

A typical hard drive inside any computer or server contains one or more rigid disks coated with a magnetically sensitive film consisting of tiny magnetic grains. Data is recorded when a magnetic write-head flies just above the spinning disk; the write head rapidly flips the magnetization of one magnetic region of grains so that its magnetic pole points up or down, to encode a 1 or a 0 in binary code.

Increasing the amount of data you can store on a disk requires cramming magnetic regions closer together, which means the grains need to be smaller so they won’t interfere with each other.

Heat Assisted Magnetic Recording (HAMR) is the next step to enable us to increase the density of grains — or bit density. Current projections are that HAMR can achieve 5 Tbpsi (Terabits per square inch) on conventional HAMR media, and in the future will be able to achieve 10 Tbpsi or higher with bit patterned media (in which discrete dots are predefined on the media in regular, efficient, very dense patterns). These technologies will enable hard drives with capacities higher than 100 TB before 2030.
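As a rough, back-of-the-envelope illustration of what those areal densities imply, the sketch below estimates drive capacity from the 5 Tbpsi figure quoted above. The platter geometry and platter count are illustrative assumptions made for this example, not Seagate specifications.

import math

# Back-of-the-envelope HAMR capacity estimate.
# The areal density comes from the projection quoted above; the platter
# geometry and platter count are illustrative assumptions only.

areal_density_tbpsi = 5.0    # projected terabits per square inch on conventional HAMR media

outer_radius_in = 1.8        # assumed usable outer radius of a 3.5"-class platter
inner_radius_in = 0.8        # assumed inner (hub) radius
usable_area_sq_in = math.pi * (outer_radius_in ** 2 - inner_radius_in ** 2)

surfaces = 16                # assumed 8 platters with 2 recording surfaces each

capacity_tbits = areal_density_tbpsi * usable_area_sq_in * surfaces
capacity_tb = capacity_tbits / 8   # terabits -> terabytes

print(f"Usable area per surface: {usable_area_sq_in:.1f} sq in")
print(f"Estimated drive capacity: {capacity_tb:.0f} TB")

With these made-up dimensions the estimate lands around 80 TB at 5 Tbpsi, and moving to the 10 Tbpsi projected for bit patterned media puts the same rough geometry above 100 TB, in line with the projection above.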

The major problem with packing bits so closely together is that if you do that on conventional magnetic media, the bits (and the data they represent) become thermally unstable, and may flip. So, to make the grains maintain their stability — their ability to store bits over a long period of time — we need to develop a recording media that has higher coercivity. That means it’s magnetically more stable during storage, but it is more difficult to change the magnetic characteristics of the media when writing (harder to flip a grain from a 0 to a 1 or vice versa).

That’s why HAMR’s first key hardware advance required developing a new recording media that keeps bits stable — using high anisotropy (or “hard”) magnetic materials such as iron-platinum alloy (FePt), which resist magnetic change at normal temperatures. Over years of HAMR development, Seagate researchers have tested and proven out a variety of FePt granular media films, with varying alloy composition and chemical ordering.

In fact the new media is so “hard” that conventional recording heads won’t be able to flip the bits, or write new data, under normal temperatures. If you add heat to the tiny spot on which you want to write data, you can make the media’s coercive field lower than the magnetic field provided by the recording head — in other words, enable the write head to flip that bit.

So, a challenge with HAMR has been to replace conventional perpendicular magnetic recording (PMR), in which the write head operates at room temperature, with a write technology that heats the thin film recording medium on the disk platter to temperatures above 400 °C. The basic principle is to heat a tiny region of several magnetic grains for a very short time (~1 nanosecond) to a temperature high enough to make the media’s coercive field lower than the write head’s magnetic field. Immediately after the heat pulse, the region quickly cools down and the bit’s magnetic orientation is frozen in place.

Applying this dynamic nano-heating is where HAMR’s famous “laser” comes in. A plasmonic near-field transducer (NFT) has been integrated into the recording head to heat the media and enable magnetic change at a specific point. Plasmonic NFTs are used to focus and confine light energy to regions smaller than the wavelength of light. This enables us to heat an extremely small region, measured in nanometers, on the disk media to reduce its magnetic coercivity.

Moving HAMR Forward

HAMR write head

As always in advanced engineering, the devil — or many devils — is in the details. As noted earlier, our technical brief provides a point-by-point short illustrated summary of HAMR’s key changes.

Although hard work remains, we believe this technology is nearly ready for commercialization. Seagate has the best engineers in the world working towards a goal of a 20 Terabyte drive by 2019. We hope we’ve given you a glimpse into the amount of engineering that goes into a hard drive. Keeping up with the world’s insatiable appetite to create, capture, store, secure, manage, analyze, rapidly access and share data is a challenge we work on every day.

With thousands of HAMR drives already being made in our manufacturing facilities, our internal and external supply chain is solidly in place, and volume manufacturing tools are online. This year we began shipping initial units for customer tests, and production units will ship to key customers by the end of 2018. Prepare for breakthrough capacities.


AWS IoT Update – Better Value with New Pricing Model

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-update-better-value-with-new-pricing-model/

Our customers are using AWS IoT to make their connected devices more intelligent. These devices collect & measure data in the field (below the ground, in the air, in the water, on factory floors and in hospital rooms) and use AWS IoT as their gateway to the AWS Cloud. Once connected to the cloud, customers can write device data to Amazon Simple Storage Service (S3) and Amazon DynamoDB, process data using Amazon Kinesis and AWS Lambda functions, initiate Amazon Simple Notification Service (SNS) push notifications, and much more.

New Pricing Model (20-40% Reduction)
Today we are making a change to the AWS IoT pricing model that will make it an even better value for you. Most customers will see a price reduction of 20-40%, with some receiving a significantly larger discount depending on their workload.

The original model was based on a charge for the number of messages that were sent to or from the service. This all-inclusive model was a good starting point, but also meant that some customers were effectively paying for parts of AWS IoT that they did not actually use. For example, some customers have devices that ping AWS IoT very frequently, with sparse rule sets that fire infrequently. Our new model is more fine-grained, with independent charges for each component (all prices are for devices that connect to the US East (Northern Virginia) Region):

Connectivity – Metered in 1 minute increments and based on the total time your devices are connected to AWS IoT. Priced at $0.08 per million minutes of connection (equivalent to $0.042 per device per year for 24/7 connectivity). Your devices can send keep-alive pings at 30 second to 20 minute intervals at no additional cost.

Messaging – Metered by the number of messages transmitted between your devices and AWS IoT. Pricing starts at $1 per million messages, with volume pricing falling as low as $0.70 per million. You may send and receive messages up to 128 kilobytes in size. Messages are metered in 5 kilobyte increments (up from 512 bytes previously). For example, an 8 kilobyte message is metered as two messages.

Rules Engine – Metered for each time a rule is triggered, and for the number of actions executed within a rule, with a minimum of one action per rule. Priced at $0.15 per million rules-triggered and $0.15 per million actions-executed. Rules that process a message in excess of 5 kilobytes are metered at the next multiple of the 5 kilobyte size. For example, a rule that processes an 8 kilobyte message is metered as two rules.

Device Shadow & Registry Updates – Metered on the number of operations to access or modify Device Shadow or Registry data, priced at $1.25 per million operations. Device Shadow and Registry operations are metered in 1 kilobyte increments of the Device Shadow or Registry record size. For example, an update to a 1.5 kilobyte Shadow record is metered as two operations.
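To see how these components combine, here is a rough, hypothetical back-of-the-envelope estimate for a small fleet. The prices are the US East (Northern Virginia) figures quoted above; the fleet size, message rate, and message size are made-up assumptions, and the sketch ignores the free tier, volume discounts, and Device Shadow and Registry usage.

# Hypothetical monthly AWS IoT cost sketch using the US East prices quoted above.
# Fleet size, message rate, and message size are made-up assumptions; the free
# tier, volume discounts, and Device Shadow/Registry charges are ignored.

DEVICES = 100
MINUTES_PER_MONTH = 30 * 24 * 60          # 43,200 minutes in a 30-day month
MESSAGES_PER_DEVICE = MINUTES_PER_MONTH   # assume one message per device per minute
MESSAGE_SIZE_KB = 8                       # metered in 5 KB increments, so it counts as 2

def metered_units(size_kb, increment_kb=5):
    # Round the payload size up to the next metering increment (ceiling division).
    return -(-size_kb // increment_kb)

connectivity_cost = DEVICES * MINUTES_PER_MONTH / 1_000_000 * 0.08

metered_messages = DEVICES * MESSAGES_PER_DEVICE * metered_units(MESSAGE_SIZE_KB)
messaging_cost = metered_messages / 1_000_000 * 1.00      # first pricing tier

# Assume each message triggers one rule with a single action; rules processing
# an 8 KB message are metered as two rules, per the description above.
rules_cost = metered_messages / 1_000_000 * 0.15
actions_cost = DEVICES * MESSAGES_PER_DEVICE / 1_000_000 * 0.15

total = connectivity_cost + messaging_cost + rules_cost + actions_cost
print(f"Connectivity:    ${connectivity_cost:.2f}")
print(f"Messaging:       ${messaging_cost:.2f}")
print(f"Rules + actions: ${rules_cost + actions_cost:.2f}")
print(f"Total per month: ${total:.2f}")

For this made-up fleet, messaging dominates the bill and connectivity is almost negligible, which is the kind of detail the more fine-grained model is meant to expose.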

The AWS Free Tier now offers a generous allocation of connection minutes, messages, triggered rules, rules actions, Shadow, and Registry usage, enough to operate a fleet of up to 50 devices. The new prices will take effect on January 1, 2018 with no effort on your part. At that time, the updated prices will be published on the AWS IoT Pricing page.

AWS IoT at re:Invent
We have an entire IoT track at this year’s AWS re:Invent. Here is a sampling:

We also have customer-led sessions from Philips, Panasonic, Enel, and Salesforce.

Jeff;

Hot Startups on AWS – October 2017

Post Syndicated from Tina Barr original https://aws.amazon.com/blogs/aws/hot-startups-on-aws-october-2017/

In 2015, the Centers for Medicare and Medicaid Services (CMS) reported that healthcare spending made up 17.8% of the U.S. GDP – that’s almost $3.2 trillion or $9,990 per person. By 2025, the CMS estimates this number will increase to nearly 20%. As cloud technology evolves in the healthcare and life science industries, we are seeing how companies of all sizes are using AWS to provide powerful and innovative solutions to customers across the globe. This month we are excited to feature the following startups:

  • ClearCare – helping home care agencies operate efficiently and grow their business.
  • DNAnexus – providing a cloud-based global network for sharing and managing genomic data.

ClearCare (San Francisco, CA)

ClearCare envisions a future where home care is the only choice for aging in place. Home care agencies play a critical role in the economy and their communities by significantly lowering the overall cost of care, reducing the number of hospital admissions, and bending the cost curve of aging. Patients receiving home care typically have multiple chronic conditions and functional limitations, driving over $190 billion in healthcare spending in the U.S. each year. To offset these costs, health insurance payers are developing in-home care management programs for patients. ClearCare’s goal is to help home care agencies leverage technology to improve costs, outcomes, and quality of life for the aging population. The company’s powerful software platform is specifically designed for use by non-medical, in-home care agencies to manage their businesses.

Founder and CEO Geoff Nudd created ClearCare because of his own grandmother’s need for care. Keeping family members and caregivers up to date on a loved one’s well being can be difficult, so Geoff created what is now ClearCare’s Family Room, which enables caregivers and agency staff to check schedules and receive real-time updates about what’s happening in the home. Since then, agencies have provided feedback on other areas of their businesses that could be streamlined. ClearCare has now built over 20 modules to help home care agencies optimize operations with services including a telephony service, billing and payroll, and more. ClearCare now serves over 4,000 home care agencies, representing 500,000 caregivers and 400,000 seniors.

Using AWS, ClearCare is able to spin up reliable infrastructure for proofs of concept and iterate on those systems to quickly get value to market. The company runs many AWS services including Amazon Elasticsearch Service, Amazon RDS, and Amazon CloudFront. Amazon EMR and Amazon Athena have enabled ClearCare to build a Hadoop-based ETL and data warehousing system that processes terabytes of data each day. By utilizing these managed services, ClearCare has been able to go from concept to customer delivery in less than three months.

To learn more about ClearCare, check out their website.

DNAnexus (Mountain View, CA)

DNAnexus is accelerating the application of genomic data in precision medicine by providing a cloud-based platform for sharing and managing genomic and biomedical data and analysis tools. The company was founded in 2009 by Stanford graduate student Andreas Sundquist and two Stanford professors Arend Sidow and Serafim Batzoglou, to address the need for scaling secondary analysis of next-generation sequencing (NGS) data in the cloud. The founders quickly learned that users needed a flexible solution to build complex analysis workflows and tools that enable them to share and manage large volumes of data. DNAnexus is optimized to address the challenges of security, scalability, and collaboration for organizations that are pursuing genomic-based approaches to health, both in clinics and research labs. DNAnexus has a global customer base – spanning North America, Europe, Asia-Pacific, South America, and Africa – that runs a million jobs each month and is doubling their storage year-over-year. The company currently stores more than 10 petabytes of biomedical and genomic data. That is equivalent to approximately 100,000 genomes, or in simpler terms, over 50 billion Facebook photos!

DNAnexus is working with its customers to help expand their translational informatics research, which includes expanding into clinical trial genomic services. This will help companies developing different medicines to better stratify clinical trial populations and develop companion tests that enable the right patient to get the right medicine. In collaboration with Janssen Human Microbiome Institute, DNAnexus is also launching Mosaic – a community platform for microbiome research.

AWS provides DNAnexus and its customers the flexibility to grow and scale research programs. Building the technology infrastructure required to manage these projects in-house is expensive and time-consuming. DNAnexus removes that barrier for labs of any size by using AWS scalable cloud resources. The company deploys its customers’ genomic pipelines on Amazon EC2, using Amazon S3 for high-performance, high-durability storage, and Amazon Glacier for low-cost data archiving. DNAnexus is also an AWS Life Sciences Competency Partner.

Learn more about DNAnexus here.

-Tina

AWS Hot Startups – August 2017

Post Syndicated from Tina Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-august-2017/

There’s no doubt about it – Artificial Intelligence is changing the world and how it operates. Across industries, organizations from startups to Fortune 500s are embracing AI to develop new products, services, and opportunities that are more efficient and accessible for their consumers. From driverless cars to better preventative healthcare to smart home devices, AI is driving innovation at a fast rate and will continue to play a more important role in our everyday lives.

This month we’d like to highlight startups using AI solutions to help companies grow. We are pleased to feature:

  • SignalBox – a simple and accessible deep learning platform to help businesses get started with AI.
  • Valossa – an AI video recognition platform for the media and entertainment industry.
  • Kaliber – innovative applications for businesses using facial recognition, deep learning, and big data.

SignalBox (UK)

In 2016, SignalBox founder Alain Richardt was hearing the same comments being made by developers, data scientists, and business leaders. They wanted to get into deep learning but didn’t know where to start. Alain saw an opportunity to commodify and apply deep learning by providing a platform that does the heavy lifting with an easy-to-use web interface, blueprints for common tasks, and a single click to productize the models. With SignalBox, companies can start building deep learning models with no coding at all – they just select a data set, choose a network architecture, and go. SignalBox also offers step-by-step tutorials, tips and tricks from industry experts, and consulting services for customers that want an end-to-end AI solution.

SignalBox offers a variety of solutions that are being used across many industries for energy modeling, fraud detection, customer segmentation, insurance risk modeling, inventory prediction, real estate prediction, and more. Existing data science teams are using SignalBox to accelerate their innovation cycle. One innovative UK startup, Energi Mine, recently worked with SignalBox to develop deep networks that predict anomalous energy consumption patterns and do time series predictions on energy usage for businesses with hundreds of sites.

SignalBox uses a variety of AWS services including Amazon EC2, Amazon VPC, Amazon Elastic Block Store, and Amazon S3. The ability to rapidly provision EC2 GPU instances has been a critical factor in their success – both in terms of keeping their operational expenses low, as well as speed to market. The Amazon API Gateway has allowed for operational automation, giving SignalBox the ability to control its infrastructure.

To learn more about SignalBox, visit here.

Valossa (Finland)

As students at the University of Oulu in Finland, the Valossa founders spent years doing research in the computer science and AI labs. During that time, the team witnessed how the world was moving beyond text, with video playing a greater role in day-to-day communication. This spawned an idea to use technology to automatically understand what an audience is viewing and share that information with a global network of content producers. Since 2015, Valossa has been building next generation AI applications to benefit the media and entertainment industry and is moving beyond the capabilities of traditional visual recognition systems.

Valossa’s AI is capable of analyzing any video stream. The AI studies a vast array of data within videos and converts that information into descriptive tags, categories, and overviews automatically. Basically, it sees, hears, and understands videos like a human does. The Valossa AI can detect people, visual and auditory concepts, key speech elements, and labels explicit content to make moderating and filtering content simpler. Valossa’s solutions are designed to provide value for the content production workflow, from media asset management to end-user applications for content discovery. AI-annotated content allows online viewers to jump directly to their favorite scenes or search specific topics and actors within a video.

Valossa leverages AWS to deliver the industry’s first complete AI video recognition platform. Using Amazon EC2 GPU instances, Valossa can easily scale their computation capacity based on customer activity. High-volume video processing with GPU instances provides the necessary speed for time-sensitive workflows. The geo-located Availability Zones in EC2 allow Valossa to bring resources close to their customers to minimize network delays. Valossa also uses Amazon S3 for video ingestion and to provide end-user video analytics, which makes managing and accessing media data easy and highly scalable.

To see how Valossa works, check out www.WhatIsMyMovie.com or enable the Alexa Skill, Valossa Movie Finder. To try the Valossa AI, sign up for free at www.valossa.com.

Kaliber (San Francisco, CA)

Serial entrepreneurs Ray Rahman and Risto Haukioja founded Kaliber in 2016. The pair had previously worked in startups building smart cities and online privacy tools, and teamed up to bring AI to the workplace and change the hospitality industry. Our world is designed to appeal to our senses – stores and warehouses have clearly marked aisles, products are colorfully packaged, and we use these designs to differentiate one thing from another. We tell each other apart by our faces, and previously that was something only humans could measure or act upon. Kaliber is using facial recognition, deep learning, and big data to create solutions for business use. Markets and companies that aren’t typically associated with cutting-edge technology will be able to use their existing camera infrastructure in a whole new way, making them more efficient and better able to serve their customers.

Computer video processing is rapidly expanding, and Kaliber believes that video recognition will extend to far more than security cameras and robots. Using the clients’ network of in-house cameras, Kaliber’s platform extracts key data points and maps them to actionable insights using their machine learning (ML) algorithm. Dashboards connect users to the client’s BI tools via the Kaliber enterprise APIs, and managers can view these analytics to improve their real-world processes, taking immediate corrective action with real-time alerts. Kaliber’s Real Metrics are aimed at combining the power of image recognition with ML to ultimately provide a more meaningful experience for all.

Kaliber uses many AWS services, including Amazon Rekognition, Amazon Kinesis, AWS Lambda, Amazon EC2 GPU instances, and Amazon S3. These services have been instrumental in helping Kaliber meet the needs of enterprise customers in record time.

Learn more about Kaliber here.

Thanks for reading and we’ll see you next month!

-Tina

 

AWS HIPAA Eligibility Update (July 2017) – Eight Additional Services

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-hipaa-eligibility-update-july-2017-eight-additional-services/

It is time for an update on our on-going effort to make AWS a great host for healthcare and life sciences applications. As you can see from our Health Customer Stories page, Philips, VergeHealth, and Cambia (to choose a few) trust AWS with Protected Health Information (PHI) and Personally Identifying Information (PII) as part of their efforts to comply with HIPAA and HITECH.

In May we announced that we added Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon Simple Queue Service (SQS) to our list of HIPAA eligible services and discussed how customers and partners are putting them to use.

Eight More Eligible Services
Today I am happy to share the news that we are adding another eight services to the list:

Amazon CloudFront can now be utilized to enhance the delivery and transfer of Protected Health Information data to applications on the Internet. By providing a completely secure and encryptable pathway, CloudFront can now be used as a part of applications that need to cache PHI. This includes applications for viewing lab results or imaging data, and those that transfer PHI from Healthcare Information Exchanges (HIEs).

AWS WAF can now be used to protect applications running on AWS which operate on PHI such as patient care portals, patient scheduling systems, and HIEs. Requests and responses containing encrypted PHI and PII can now pass through AWS WAF.

AWS Shield can now be used to protect web applications such as patient care portals and scheduling systems that operate on encrypted PHI from DDoS attacks.

Amazon S3 Transfer Acceleration can now be used to accelerate the bulk transfer of large amounts of research, genetics, informatics, insurance, or payer/payment data containing PHI/PII information. Transfers can take place between a pair of AWS Regions or from an on-premises system and an AWS Region.

Amazon WorkSpaces can now be used by researchers, informaticists, hospital administrators and other users to analyze, visualize or process PHI/PII data using on-demand Windows virtual desktops.

AWS Directory Service can now be used to connect the authentication and authorization systems of organizations that use or process PHI/PII to their resources in the AWS Cloud. For example, healthcare providers operating hybrid cloud environments can now use AWS Directory Services to allow their users to easily transition between cloud and on-premises resources.

Amazon Simple Notification Service (SNS) can now be used to send notifications containing encrypted PHI/PII as part of patient care, payment processing, and mobile applications.

Amazon Cognito can now be used to authenticate users into mobile patient portal and payment processing applications that use PHI/PII identifiers for accounts.

Additional HIPAA Resources
Here are some additional resources that will help you to build applications that comply with HIPAA and HITECH:

Keep in Touch
In order to make use of any AWS service in any manner that involves PHI, you must first enter into an AWS Business Associate Addendum (BAA). You can contact us to start the process.

Jeff;

Analyze OpenFDA Data in R with Amazon S3 and Amazon Athena

Post Syndicated from Ryan Hood original https://aws.amazon.com/blogs/big-data/analyze-openfda-data-in-r-with-amazon-s3-and-amazon-athena/

One of the great benefits of Amazon S3 is the ability to host, share, or consume public data sets. This provides transparency into data to which an external data scientist or developer might not normally have access. By exposing the data to the public, you can glean many insights that would have been difficult with a data silo.

The openFDA project creates easy access to the high value, high priority, and public access data of the Food and Drug Administration (FDA). The data has been formatted and documented in consumer-friendly standards. Critical data related to drugs, devices, and food has been harmonized and can easily be called by application developers and researchers via API calls. OpenFDA has published two whitepapers that drill into the technical underpinnings of the API infrastructure as well as how to properly analyze the data in R. In addition, FDA makes openFDA data available on S3 in raw format.

In this post, I show how to use S3, Amazon EMR, and Amazon Athena to analyze the drug adverse events dataset. A drug adverse event is an undesirable experience associated with the use of a drug, including serious drug side effects, product use errors, product quality problems, and therapeutic failures.

Data considerations

Keep in mind that this data does have limitations. In the United States, these adverse events are submitted to the FDA voluntarily by consumers, so there may not be reports for all events that occurred. There is no certainty that the reported event was actually due to the product. The FDA does not require that a causal relationship between a product and event be proven, and reports do not always contain the detail necessary to evaluate an event. Because of this, there is no way to identify the true number of events. The important takeaway is that the information contained in this data has not been verified to produce cause and effect relationships. Despite this disclaimer, many interesting insights and value can be derived from the data to accelerate drug safety research.

Data analysis using SQL

For application developers who want to perform targeted searching and lookups, the API endpoints provided by the openFDA project are “ready to go” for software integration using a standard API powered by Elasticsearch, NodeJS, and Docker. However, for data analysis purposes, it is often easier to work with the data using SQL and statistical packages that expect a SQL table structure. For large-scale analysis, APIs often have query limits, such as 5000 records per query. This can cause extra work for data scientists who want to analyze the full dataset instead of small subsets of data.

To address the concern of requiring all the data in a single dataset, the openFDA project released the full 100 GB of harmonized data files that back the openFDA project onto S3. Athena is an interactive query service that makes it easy to analyze data in S3 using standard SQL. It’s a quick and easy way to answer your questions about adverse events and aspirin that does not require you to spin up databases or servers.

While you could point tools directly at the openFDA S3 files, you can find greatly improved performance and use of the data by following some of the preparation steps later in this post.

Architecture

This post explains how to use the following architecture to take the raw data provided by openFDA, leverage several AWS services, and derive meaning from the underlying data.

Steps:

  1. Load the openFDA /drug/event dataset into Spark and convert it to gzip to allow for streaming.
  2. Transform the data in Spark and save the results as a Parquet file in S3.
  3. Query the S3 Parquet file with Athena.
  4. Perform visualization and analysis of the data in R and Python on Amazon EC2.

Optimizing public data sets: A primer on data preparation

Those who want to jump right into preparing the files for Athena may want to skip ahead to the next section.

Transforming, or pre-processing, files is a common task when using many public data sets. Before you jump into the specific steps for transforming the openFDA data files into a format optimized for Athena, I thought it would be worthwhile to provide a quick exploration of the problem.

Making a dataset in S3 efficiently accessible with minimal transformation for the end user has two key elements:

  1. Partitioning the data into objects that contain a complete part of the data (such as data created within a specific month).
  2. Using file formats that make it easy for applications to locate subsets of data (for example, gzip, Parquet, ORC, etc.).

With these two key elements in mind, you can now apply transformations to the openFDA adverse event data to prepare it for Athena. You might find the data techniques employed in this post to be applicable to many of the questions you might want to ask of the public data sets stored in Amazon S3.

Before you get started, I encourage those who are interested in doing deeper healthcare analysis on AWS to make sure that you first read the AWS HIPAA Compliance whitepaper. This covers the information necessary for processing and storing protected health information (PHI).

Also, the adverse event analysis shown for aspirin is strictly for demonstration purposes and should not be used for any real decision or taken as anything other than a demonstration of AWS capabilities. However, there have been robust case studies published that have explored a causal relationship between aspirin and adverse reactions using OpenFDA data. If you are seeking research on aspirin or its risks, visit organizations such as the Centers for Disease Control and Prevention (CDC) or the Institute of Medicine (IOM).

Preparing data for Athena

For this walkthrough, you will start with the FDA adverse events dataset, which is stored as JSON files within zip archives on S3. You then convert it to Parquet for analysis. Why do you need to convert it? The original data download is stored in objects that are partitioned by quarter.

Here is a small sample of what you find in the adverse events (/drug/event) section of the openFDA website.

If you were looking for events that happened in a specific quarter, this is not a bad solution. For most other scenarios, such as looking across the full history of aspirin events, it requires you to access a lot of data that you won’t need. The zip file format is not ideal for using data in place because zip readers must have random access to the file, which means the data can’t be streamed. Additionally, the zip files contain large JSON objects.

To read the data in these JSON files, a streaming JSON decoder must be used or a computer with a significant amount of RAM must decode the JSON. Opening up these files for public consumption is a great start. However, you still need to prepare the data with a few lines of Spark code so that the JSON can be streamed.

Step 1:  Convert the file types

Using Apache Spark on EMR, you can extract all of the zip files and pull out the events from the JSON files. To do this, use the Scala code below to decompress the zip files and create text files. In addition, compress the JSON files with gzip to improve Spark’s performance and reduce your overall storage footprint. The Scala code can be run in either the Spark Shell or in an Apache Zeppelin notebook on your EMR cluster.

If you are unfamiliar with either Apache Zeppelin or the Spark Shell, the following posts serve as great references:

 

import scala.io.Source
import java.util.zip.ZipInputStream
import org.apache.spark.input.PortableDataStream
import org.apache.hadoop.io.compress.GzipCodec

// Input Directory
val inputFile = "s3://download.open.fda.gov/drug/event/2015q4/*.json.zip";

// Output Directory
val outputDir = "s3://{YOUR OUTPUT BUCKET HERE}/output/2015q4/";

// Read the zip files from the input directory as binary streams
val zipFiles = sc.binaryFiles(inputFile);

// Process zip file to extract the json as text file and save it
// in the output directory 
val rdd = zipFiles.flatMap((file: (String, PortableDataStream)) => {
    val zipStream = new ZipInputStream(file._2.open)   // open the zip entry stream
    val entry = zipStream.getNextEntry                 // advance to the JSON entry inside the zip
    val iter = Source.fromInputStream(zipStream).getLines
    iter
}).map(_.replaceAll("\\s+","")).saveAsTextFile(outputDir, classOf[GzipCodec])

Step 2:  Transform JSON into Parquet

With just a few more lines of Scala code, you can use Spark’s abstractions to convert the JSON into a Spark DataFrame and then export the data back to S3 in Parquet format.

Spark requires the JSON to be in JSON Lines format to be parsed correctly into a DataFrame.

// Output Parquet directory
val outputDir = "s3://{YOUR OUTPUT BUCKET NAME}/output/drugevents"
// Input json file
val inputJson = "s3://{YOUR OUTPUT BUCKET NAME}/output/2015q4/*"
// Load dataframe from json file multiline 
val df = spark.read.json(sc.wholeTextFiles(inputJson).values)
// Extract results from dataframe
val results = df.select("results")
// Save it to Parquet
results.write.parquet(outputDir)

Step 3:  Create an Athena table

With the data cleanly prepared and stored in S3 using the Parquet format, you can now place an Athena table on top of it to get a better understanding of the underlying data.

Because the openFDA data structure incorporates several layers of nesting, manually deriving the underlying schema in a Hive-compatible format can be a complex process. To shorten this process, you can load the top row of the DataFrame from the previous step into a Hive table within Zeppelin and then extract the “create table” statement from SparkSQL.

results.createOrReplaceTempView("data")

val top1 = spark.sql("select * from data tablesample(1 rows)")

top1.write.format("parquet").mode("overwrite").saveAsTable("drugevents")

val show_cmd = spark.sql("show create table drugevents").show(1, false)

This returns a “create table” statement that you can almost paste directly into the Athena console. Make some small modifications (adding the word “external” and replacing “using” with “stored as”), and then execute the code in the Athena query editor. The table is created.

For the openFDA data, the DDL returns all string fields, as the date format used in your dataset does not conform to the yyyy-mm-dd hh:mm:ss[.f…] format required by Hive. For your analysis, the string format works appropriately, but it would be possible to extend this code with a Presto function that converts the strings into timestamps (see the sketch after the table definition below).

CREATE EXTERNAL TABLE  drugevents (
   companynumb  string, 
   safetyreportid  string, 
   safetyreportversion  string, 
   receiptdate  string, 
   patientagegroup  string, 
   patientdeathdate  string, 
   patientsex  string, 
   patientweight  string, 
   serious  string, 
   seriousnesscongenitalanomali  string, 
   seriousnessdeath  string, 
   seriousnessdisabling  string, 
   seriousnesshospitalization  string, 
   seriousnesslifethreatening  string, 
   seriousnessother  string, 
   actiondrug  string, 
   activesubstancename  string, 
   drugadditional  string, 
   drugadministrationroute  string, 
   drugcharacterization  string, 
   drugindication  string, 
   drugauthorizationnumb  string, 
   medicinalproduct  string, 
   drugdosageform  string, 
   drugdosagetext  string, 
   reactionoutcome  string, 
   reactionmeddrapt  string, 
   reactionmeddraversionpt  string)
STORED AS parquet
LOCATION
  's3://{YOUR TARGET BUCKET}/output/drugevents'
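
As noted above, the date strings could also be converted at query time rather than in the table definition. Here is a minimal sketch (an illustration added here, not part of the original walkthrough) that assumes receiptdate values are stored as yyyymmdd strings; Presto’s try() returns NULL instead of failing on malformed values:

-- Hypothetical example: parse the string dates at query time
SELECT safetyreportid,
       receiptdate,
       try(date_parse(receiptdate, '%Y%m%d')) AS receipt_ts
FROM fda.drugevents
LIMIT 10;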

With the Athena table in place, you can start to explore the data by running ad hoc queries within Athena or doing more advanced statistical analysis in R.

Using SQL and R to analyze adverse events

Using the openFDA data with Athena makes it very easy to translate your questions into SQL code and perform quick analysis on the data. After you have prepared the data for Athena, you can begin to explore the relationship between aspirin and adverse drug events, as an example. One of the most common metrics to measure adverse drug events is the Proportional Reporting Ratio (PRR). It is defined as:

PRR = (m/n) / ((M-m)/(N-n))

where:
m = # reports with drug and event
n = # reports with drug
M = # reports with event in database
N = # reports in database
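
For example, with purely hypothetical counts of m = 100 reports of the event with the drug, n = 10,000 reports with the drug, M = 2,000 reports of the event in the database, and N = 1,000,000 total reports, PRR = (100/10,000) / ((2,000-100)/(1,000,000-10,000)) ≈ 5.2. In other words, the event is reported roughly five times more often with the drug than with all other drugs; a PRR well above 1 suggests disproportionate reporting.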

Gastrointestinal haemorrhage has the highest PRR of any reaction to aspirin when viewed in aggregate. One question you may want to ask is how the PRR has trended on a yearly basis for gastrointestinal haemorrhage since 2005.

Using the following query in Athena, you can see the PRR trend of “GASTROINTESTINAL HAEMORRHAGE” reactions with “ASPIRIN” since 2005:

with drug_and_event as 
(select rpad(receiptdate, 4, 'NA') as receipt_year
    , reactionmeddrapt
    , count(distinct (concat(safetyreportid,receiptdate,reactionmeddrapt))) as reports_with_drug_and_event 
from fda.drugevents
where rpad(receiptdate,4,'NA') 
     between '2005' and '2015' 
     and medicinalproduct = 'ASPIRIN'
     and reactionmeddrapt= 'GASTROINTESTINAL HAEMORRHAGE'
group by reactionmeddrapt, rpad(receiptdate, 4, 'NA') 
), reports_with_drug as 
(
select rpad(receiptdate, 4, 'NA') as receipt_year
    , count(distinct (concat(safetyreportid,receiptdate,reactionmeddrapt))) as reports_with_drug 
 from fda.drugevents 
 where rpad(receiptdate,4,'NA') 
     between '2005' and '2015' 
     and medicinalproduct = 'ASPIRIN'
group by rpad(receiptdate, 4, 'NA') 
), reports_with_event as 
(
   select rpad(receiptdate, 4, 'NA') as receipt_year
    , count(distinct (concat(safetyreportid,receiptdate,reactionmeddrapt))) as reports_with_event 
   from fda.drugevents
   where rpad(receiptdate,4,'NA') 
     between '2005' and '2015' 
     and reactionmeddrapt= 'GASTROINTESTINAL HAEMORRHAGE'
   group by rpad(receiptdate, 4, 'NA')
), total_reports as 
(
   select rpad(receiptdate, 4, 'NA') as receipt_year
    , count(distinct (concat(safetyreportid,receiptdate,reactionmeddrapt))) as total_reports 
   from fda.drugevents
   where rpad(receiptdate,4,'NA') 
     between '2005' and '2015' 
   group by rpad(receiptdate, 4, 'NA')
)
select  drug_and_event.receipt_year, 
(1.0 * drug_and_event.reports_with_drug_and_event/reports_with_drug.reports_with_drug)/ (1.0 * (reports_with_event.reports_with_event- drug_and_event.reports_with_drug_and_event)/(total_reports.total_reports-reports_with_drug.reports_with_drug)) as prr
, drug_and_event.reports_with_drug_and_event
, reports_with_drug.reports_with_drug
, reports_with_event.reports_with_event
, total_reports.total_reports
from drug_and_event
    inner join reports_with_drug on  drug_and_event.receipt_year = reports_with_drug.receipt_year   
    inner join reports_with_event on  drug_and_event.receipt_year = reports_with_event.receipt_year
    inner join total_reports on  drug_and_event.receipt_year = total_reports.receipt_year
order by  drug_and_event.receipt_year


One nice feature of Athena is that you can quickly connect to it via R or any other tool that can use a JDBC driver to visualize the data and understand it more clearly.

With this quick R script that can be run in R Studio either locally or on an EC2 instance, you can create a visualization of the PRR and Reporting Odds Ratio (RoR) for “GASTROINTESTINAL HAEMORRHAGE” reactions from “ASPIRIN” since 2005 to better understand these trends.

# Connect to Athena via JDBC
# (drv is assumed to be an Athena JDBC driver object created beforehand, e.g. with the RJDBC package)
conn <- dbConnect(drv, '<Your JDBC URL>', s3_staging_dir="<Your S3 Location>",
                  user=Sys.getenv("USER_NAME"), password=Sys.getenv("USER_PASSWORD"))

# Declare Adverse Event
adverseEvent <- "'GASTROINTESTINAL HAEMORRHAGE'"

# Build SQL Blocks
sqlFirst <- "SELECT rpad(receiptdate, 4, 'NA') as receipt_year, count(DISTINCT safetyreportid) as event_count FROM fda.drugevents WHERE rpad(receiptdate,4,'NA') between '2005' and '2015'"
sqlEnd <- "GROUP BY rpad(receiptdate, 4, 'NA') ORDER BY receipt_year"

# Extract Aspirin with adverse event counts
sql <- paste(sqlFirst,"AND medicinalproduct ='ASPIRIN' AND reactionmeddrapt=",adverseEvent, sqlEnd,sep=" ")
aspirinAdverseCount = dbGetQuery(conn,sql)

# Extract Aspirin counts
sql <- paste(sqlFirst,"AND medicinalproduct ='ASPIRIN'", sqlEnd,sep=" ")
aspirinCount = dbGetQuery(conn,sql)

# Extract adverse event counts
sql <- paste(sqlFirst,"AND reactionmeddrapt=",adverseEvent, sqlEnd,sep=" ")
adverseCount = dbGetQuery(conn,sql)

# All Drug Adverse event Counts
sql <- paste(sqlFirst, sqlEnd,sep=" ")
allDrugCount = dbGetQuery(conn,sql)

# Select correct rows
selAll =  allDrugCount$receipt_year == aspirinAdverseCount$receipt_year
selAspirin = aspirinCount$receipt_year == aspirinAdverseCount$receipt_year
selAdverse = adverseCount$receipt_year == aspirinAdverseCount$receipt_year

# Calculate Numbers
m <- c(aspirinAdverseCount$event_count)
n <- c(aspirinCount[selAspirin,2])
M <- c(adverseCount[selAdverse,2])
N <- c(allDrugCount[selAll,2])

# Calculate proportional reporting ratio
PRR = (m/n)/((M-m)/(N-n))

# Calculate reporting Odds Ratio
d = n-m
D = N-M
ROR = (m/d)/(M/D)

# Plot the PRR and ROR by year
g_range <- range(0, PRR, ROR)
g_range[2] <- g_range[2] + 3
yearLen <- length(aspirinAdverseCount$receipt_year)
ax <- aspirinAdverseCount$receipt_year   # x-axis labels (years)
plot(PRR, type="o", col="blue", ylim=g_range, axes=FALSE, ann=FALSE)
axis(1, 1:yearLen, lab=ax)
axis(2, las=1, at=0:g_range[2])
box()
lines(ROR, type="o", pch=22, lty=2, col="red")

As you can see, the PRR and RoR have both remained fairly steady over this time range. With the R Script above, all you need to do is change the adverseEvent variable from GASTROINTESTINAL HAEMORRHAGE to another type of reaction to analyze and compare those trends.

Summary

In this walkthrough:

  • You used a Scala script on EMR to convert the openFDA zip files to gzip.
  • You then transformed the JSON blobs into flattened Parquet files using Spark on EMR.
  • You created an Athena DDL so that you could query these Parquet files residing in S3.
  • Finally, you pointed the R package at the Athena table to analyze the data without pulling it into a database or creating your own servers.

If you have questions or suggestions, please comment below.


Next Steps

Take your skills to the next level. Learn how to optimize Amazon S3 for an architecture commonly used to enable genomic data analysis. Also, be sure to read more about running R on Amazon Athena.

About the Authors

Ryan Hood is a Data Engineer for AWS. He works on big data projects leveraging the newest AWS offerings. In his spare time, he enjoys watching the Cubs win the World Series and attempting to Sous-vide anything he can find in his refrigerator.

Vikram Anand is a Data Engineer for AWS. He works on big data projects leveraging the newest AWS offerings. In his spare time, he enjoys playing soccer and watching the NFL & European Soccer leagues.

Dave Rocamora is a Solutions Architect at Amazon Web Services on the Open Data team. Dave is based in Seattle and when he is not opening data, he enjoys biking and drinking coffee outside.

WannaCry and Vulnerabilities

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/wannacry_and_vu.html

There is plenty of blame to go around for the WannaCry ransomware that spread throughout the Internet earlier this month, disrupting work at hospitals, factories, businesses, and universities. First, there are the writers of the malicious software, which blocks victims’ access to their computers until they pay a fee. Then there are the users who didn’t install the Windows security patch that would have prevented an attack. A small portion of the blame falls on Microsoft, which wrote the insecure code in the first place. One could certainly condemn the Shadow Brokers, a group of hackers with links to Russia who stole and published the National Security Agency attack tools that included the exploit code used in the ransomware. But before all of this, there was the NSA, which found the vulnerability years ago and decided to exploit it rather than disclose it.

All software contains bugs or errors in the code. Some of these bugs have security implications, granting an attacker unauthorized access to or control of a computer. These vulnerabilities are rampant in the software we all use. A piece of software as large and complex as Microsoft Windows will contain hundreds of them, maybe more. These vulnerabilities have obvious criminal uses that can be neutralized if patched. Modern software is patched all the time — either on a fixed schedule, such as once a month with Microsoft, or whenever required, as with the Chrome browser.

When the US government discovers a vulnerability in a piece of software, however, it decides between two competing equities. It can keep it secret and use it offensively, to gather foreign intelligence, help execute search warrants, or deliver malware. Or it can alert the software vendor and see that the vulnerability is patched, protecting the country — and, for that matter, the world — from similar attacks by foreign governments and cybercriminals. It’s an either-or choice. As former US Assistant Attorney General Jack Goldsmith has said, “Every offensive weapon is a (potential) chink in our defense — and vice versa.”

This is all well-trod ground, and in 2010 the US government put in place an interagency Vulnerabilities Equities Process (VEP) to help balance the trade-off. The details are largely secret, but a 2014 blog post by then President Barack Obama’s cybersecurity coordinator, Michael Daniel, laid out the criteria that the government uses to decide when to keep a software flaw undisclosed. The post’s contents were unsurprising, listing questions such as “How much is the vulnerable system used in the core Internet infrastructure, in other critical infrastructure systems, in the US economy, and/or in national security systems?” and “Does the vulnerability, if left unpatched, impose significant risk?” They were balanced by questions like “How badly do we need the intelligence we think we can get from exploiting the vulnerability?” Elsewhere, Daniel has noted that the US government discloses to vendors the “overwhelming majority” of the vulnerabilities that it discovers — 91 percent, according to NSA Director Michael S. Rogers.

The particular vulnerability in WannaCry is code-named EternalBlue, and it was discovered by the US government — most likely the NSA — sometime before 2014. The Washington Post reported both how useful the bug was for attack and how much the NSA worried about it being used by others. It was a reasonable concern: many of our national security and critical infrastructure systems contain the vulnerable software, which imposed significant risk if left unpatched. And yet it was left unpatched.

There’s a lot we don’t know about the VEP. The Washington Post says that the NSA used EternalBlue “for more than five years,” which implies that it was discovered after the 2010 process was put in place. It’s not clear if all vulnerabilities are given such consideration, or if bugs are periodically reviewed to determine if they should be disclosed. That said, any VEP that allows something as dangerous as EternalBlue (or the Cisco vulnerabilities that the Shadow Brokers leaked last August) to remain unpatched for years isn’t serving national security very well. As a former NSA employee said, the quality of intelligence that could be gathered was “unreal.” But so was the potential damage. The NSA must avoid hoarding vulnerabilities.

Perhaps the NSA thought that no one else would discover EternalBlue. That’s another one of Daniel’s criteria: “How likely is it that someone else will discover the vulnerability?” This is often referred to as NOBUS, short for “nobody but us.” Can the NSA discover vulnerabilities that no one else will? Or are vulnerabilities discovered by one intelligence agency likely to be discovered by another, or by cybercriminals?

In the past few months, the tech community has acquired some data about this question. In one study, two colleagues from Harvard and I examined over 4,300 disclosed vulnerabilities in common software and concluded that 15 to 20 percent of them are rediscovered within a year. Separately, researchers at the Rand Corporation looked at a different and much smaller data set and concluded that fewer than six percent of vulnerabilities are rediscovered within a year. The questions the two papers ask are slightly different and the results are not directly comparable (we’ll both be discussing these results in more detail at the Black Hat Conference in July), but clearly, more research is needed.

People inside the NSA are quick to discount these studies, saying that the data don’t reflect their reality. They claim that there are entire classes of vulnerabilities the NSA uses that are not known in the research world, making rediscovery less likely. This may be true, but the evidence we have from the Shadow Brokers is that the vulnerabilities that the NSA keeps secret aren’t consistently different from those that researchers discover. And given the alarming ease with which both the NSA and CIA are having their attack tools stolen, rediscovery isn’t limited to independent security research.

But even if it is difficult to make definitive statements about vulnerability rediscovery, it is clear that vulnerabilities are plentiful. Any vulnerabilities that are discovered and used for offense should only remain secret for as short a time as possible. I have proposed six months, with the right to appeal for another six months in exceptional circumstances. The United States should satisfy its offensive requirements through a steady stream of newly discovered vulnerabilities that, when fixed, also improve the country’s defense.

The VEP needs to be reformed and strengthened as well. A report from last year by Ari Schwartz and Rob Knake, who both previously worked on cybersecurity policy at the White House National Security Council, makes some good suggestions on how to further formalize the process, increase its transparency and oversight, and ensure periodic review of the vulnerabilities that are kept secret and used for offense. This is the least we can do. A bill recently introduced in both the Senate and the House calls for this and more.

In the case of EternalBlue, the VEP did have some positive effects. When the NSA realized that the Shadow Brokers had stolen the tool, it alerted Microsoft, which released a patch in March. This prevented a true disaster when the Shadow Brokers exposed the vulnerability on the Internet. It was only unpatched systems that were susceptible to WannaCry a month later, including versions of Windows so old that Microsoft normally didn’t support them. Although the NSA must take its share of the responsibility, no matter how good the VEP is, or how many vulnerabilities the NSA reports and the vendors fix, security won’t improve unless users download and install patches, and organizations take responsibility for keeping their software and systems up to date. That is one of the important lessons to be learned from WannaCry.

This essay originally appeared in Foreign Affairs.

Amazon EC2 Container Service – Launch Recap, Customer Stories, and Code

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-container-service-launch-recap-customer-stories-and-code/

Today seems like a good time to recap some of the features that we have added to Amazon EC2 Container Service over the last year or so, and to share some customer success stories and code with you! The service makes it easy for you to run any number of Docker containers across a managed cluster of EC2 instances, with full console, API, CloudFormation, CLI, and PowerShell support. You can store your Linux and Windows Docker images in the EC2 Container Registry for easy access.

Launch Recap
Let’s start by taking a look at some of the newest ECS features and some helpful how-to blog posts that will show you how to use them:

Application Load Balancing – We added support for the application load balancer last year. This high-performance load balancing option runs at the application level and allows you to define content-based routing rules. It provides support for dynamic ports and can be shared across multiple services, making it easier for you to run microservices in containers. To learn more, read about Service Load Balancing.

IAM Roles for Tasks – You can secure your infrastructure by assigning IAM roles to ECS tasks. This allows you to grant permissions on a fine-grained, per-task basis, customizing the permissions to the needs of each task. Read IAM Roles for Tasks to learn more.

Service Auto Scaling – You can define scaling policies that scale your services (tasks) up and down in response to changes in demand. You set the desired minimum and maximum number of tasks, create one or more scaling policies, and Service Auto Scaling will take care of the rest. The documentation for Service Auto Scaling will help you to make use of this feature.

Blox – Scheduling, in a container-based environment, is the process of assigning tasks to instances. ECS gives you three options: automated (via the built-in Service Scheduler), manual (via the RunTask function), and custom (via a scheduler that you provide). Blox is an open source scheduler that supports a one-task-per-host model, with room to accommodate other models in the future. It monitors the state of the cluster and is well-suited to running monitoring agents, log collectors, and other daemon-style tasks.

Windows – We launched ECS with support for Linux containers and followed up with support for running Windows Server 2016 Base with Containers.

Container Instance Draining – From time to time you may need to remove an instance from a running cluster in order to scale the cluster down or to perform a system update. Earlier this year we added a set of lifecycle hooks that allow you to better manage the state of the instances. Read the blog post How to Automate Container Instance Draining in Amazon ECS to see how to use the lifecycle hooks and a Lambda function to automate the process of draining existing work from an instance while preventing new work from being scheduled for it.

CI/CD Pipeline with Code* – Containers simplify software deployment and are an ideal target for a CI/CD (Continuous Integration / Continuous Deployment) pipeline. The post Continuous Deployment to Amazon ECS using AWS CodePipeline, AWS CodeBuild, Amazon ECR, and AWS CloudFormation shows you how to build and operate a CI/CD pipeline using multiple AWS services.

CloudWatch Logs Integration – This launch gave you the ability to configure the containers that run your tasks to send log information to CloudWatch Logs for centralized storage and analysis. You simply install the Amazon ECS Container Agent and enable the awslogs log driver.

CloudWatch Events – ECS generates CloudWatch Events when the state of a task or a container instance changes. These events allow you to monitor the state of the cluster using a Lambda function. To learn how to capture the events and store them in an Elasticsearch cluster, read Monitor Cluster State with Amazon ECS Event Stream.

Task Placement Policies – This launch provided you with fine-grained control over the placement of tasks on container instances within clusters. It allows you to construct policies that include cluster constraints, custom constraints (location, instance type, AMI, and attribute), placement strategies (spread or bin pack) and to use them without writing any code. Read Introducing Amazon ECS Task Placement Policies to see how to do this!

EC2 Container Service in Action
Many of our customers, from large enterprises to hot startups and across industries such as financial services, hospitality, and consumer electronics, are using Amazon ECS to run their microservices applications in production. Companies such as Capital One, Expedia, Okta, Riot Games, and Viacom rely on Amazon ECS.

Mapbox is a platform for designing and publishing custom maps. The company uses ECS to power their entire batch processing architecture to collect and process over 100 million miles of sensor data per day that they use for powering their maps. They also optimize their batch processing architecture on ECS using Spot Instances. The Mapbox platform powers over 5,000 apps and reaches more than 200 million users each month. Its backend runs on ECS allowing it to serve more than 1.3 billion requests per day. To learn more about their recent migration to ECS, read their recent blog post, We Switched to Amazon ECS, and You Won’t Believe What Happened Next.

Travel company Expedia designed their backends with a microservices architecture. With the popularization of Docker, they decided they would like to adopt Docker for its faster deployments and environment portability. They chose to use ECS to orchestrate all their containers because it had great integration with the AWS platform, everything from ALB to IAM roles to VPC integration. This made ECS very easy to use with their existing AWS infrastructure. ECS really reduced the heavy lifting of deploying and running containerized applications. Expedia runs 75% of all apps on AWS in ECS allowing it to process 4 billion requests per hour. Read Kuldeep Chowhan‘s blog post, How Expedia Runs Hundreds of Applications in Production Using Amazon ECS to learn more.

Realtor.com provides home buyers and sellers with a comprehensive database of properties that are currently for sale. Their move to AWS and ECS has helped them to support business growth that now numbers 50 million unique monthly users who drive up to 250,000 requests per second at peak times. ECS has helped them to deploy their code more quickly while increasing utilization of their cloud infrastructure. Read the Realtor.com Case Study to learn more about how they use ECS, Kinesis, and other AWS services.

Instacart talks about how they use ECS to power their same-day grocery delivery service.

Capital One talks about how they use ECS to automate their operations and their infrastructure management.

Code
Clever developers are using ECS as a base for their own work. For example:

Rack is an open source PaaS (Platform as a Service). It focuses on infrastructure automation, runs in an isolated VPC, and uses a single-tenant build service for security.

Empire is also an open source PaaS. It provides a Heroku-like workflow and is targeted at small and medium sized startups, with an emphasis on microservices.

Cloud Container Cluster Visualizer (c3vis) helps to visualize resource utilization within ECS clusters.

Stay Tuned
We have plenty of new features in the works for ECS, so stay tuned!

Jeff;

 

WannaCry Ransomware

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/05/wannacry_ransom.html

Criminals go where the money is, and cybercriminals are no exception.

And right now, the money is in ransomware.

It’s a simple scam. Encrypt the victim’s hard drive, then extract a fee to decrypt it. The scammers can’t charge too much, because they want the victim to pay rather than give up on the data. But they can charge individuals a few hundred dollars, and they can charge institutions like hospitals a few thousand. Do it at scale, and it’s a profitable business.

And scale is how ransomware works. Computers are infected automatically, with viruses that spread over the internet. Payment is no more difficult than buying something online, and it’s payable in untraceable bitcoin, with some ransomware makers offering tech support to those unsure of how to buy or transfer bitcoin. Customer service is important; people need to know they’ll get their files back once they pay.

And they want you to pay. If they’re lucky, they’ve encrypted your irreplaceable family photos, or the documents of a project you’ve been working on for weeks. Or maybe your company’s accounts receivable files or your hospital’s patient records. The more you need what they’ve stolen, the better.

The particular ransomware making headlines is called WannaCry, and it’s infected some pretty serious organizations.

What can you do about it? Your first line of defense is to diligently install every security update as soon as it becomes available, and to migrate to systems that vendors still support. Microsoft issued a security patch that protects against WannaCry months before the ransomware started infecting systems; it only works against computers that haven’t been patched. And many of the systems it infects are older computers, no longer normally supported by Microsoft, though it did belatedly release a patch for those older systems. I know it’s hard, but until companies are forced to maintain old systems, you’re much safer upgrading.

This is easier advice for individuals than for organizations. You and I can pretty easily migrate to a new operating system, but organizations sometimes have custom software that breaks when they change OS versions or install updates. Many of the organizations hit by WannaCry had outdated systems for exactly these reasons. But as expensive and time-consuming as updating might be, the risks of not doing so are increasing.

Your second line of defense is good antivirus software. Sometimes ransomware tricks you into encrypting your own hard drive by clicking on a file attachment that you thought was benign. Antivirus software can often catch your mistake and prevent the malicious software from running. This isn’t perfect, of course, but it’s an important part of any defense.

Your third line of defense is to diligently back up your files. There are systems that do this automatically for your hard drive. You can invest in one of those. Or you can store your important data in the cloud. If your irreplaceable family photos are in a backup drive in your house, then the ransomware has that much less hold on you. If your e-mail and documents are in the cloud, then you can just reinstall the operating system and bypass the ransomware entirely. I know storing data in the cloud has its own privacy risks, but they may be less than the risks of losing everything to ransomware.

That takes care of your computers and smartphones, but what about everything else? We’re deep into the age of the “Internet of things.”

There are now computers in your household appliances. There are computers in your cars and in the airplanes you travel on. Computers run our traffic lights and our power grids. These are all vulnerable to ransomware. The Mirai botnet exploited a vulnerability in internet-enabled devices like DVRs and webcams to launch a denial-of-service attack against a critical internet name server; next time it could just as easily disable the devices and demand payment to turn them back on.

Re-enabling a webcam will be cheap; re-enabling your car will cost more. And you don’t want to know how vulnerable implanted medical devices are to these sorts of attacks.

Commercial solutions are coming, probably a convenient repackaging of the three lines of defense described above. But it’ll be yet another security surcharge you’ll be expected to pay because the computers and internet-of-things devices you buy are so insecure. Because there are currently no liabilities for lousy software and no regulations mandating secure software, the market rewards software that’s fast and cheap at the expense of good. Until that changes, ransomware will continue to be a profitable line of criminal business.

This essay previously appeared in the New York Daily News.

[$] Vulnerability hoarding and Wcry

Post Syndicated from jake original https://lwn.net/Articles/722924/rss

A virulent ransomware worm attacked a wide swath of Windows machines worldwide in mid-May. The malware, known as Wcry, Wanna, or WannaCry, infected a number of systems at high-profile organizations as well as striking at critical pieces of the infrastructure—like hospitals, banks, and train stations. While the threat seems to have largely abated—for now—the origin of some of its code, which is apparently the US National Security Agency (NSA), should give one pause.

Roundup of AWS HIPAA Eligible Service Announcements

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/roundup-of-aws-hipaa-eligible-service-announcements/

At AWS we have had a number of HIPAA eligible service announcements. Patrick Combes, the Healthcare and Life Sciences Global Technical Leader at AWS, and Aaron Friedman, a Healthcare and Life Sciences Partner Solutions Architect at AWS, have written this post to tell you all about it.

-Ana


We are pleased to announce that the following AWS services have been added to the BAA in recent weeks: Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon SQS. All four of these services facilitate moving data into and through AWS, and we are excited to see how customers will be using these services to advance their solutions in healthcare. While we know the use cases for each of these services are vast, we wanted to highlight some ways that customers might use these services with Protected Health Information (PHI).

As with all HIPAA-eligible services covered under the AWS Business Associate Addendum (BAA), PHI must be encrypted while at-rest or in-transit. We encourage you to reference our HIPAA whitepaper, which details how you might configure each of AWS’ HIPAA-eligible services to store, process, and transmit PHI. And of course, for any portion of your application that does not touch PHI, you can use any of our 90+ services to deliver the best possible experience to your users. You can find some ideas on architecting for HIPAA on our website.

Amazon API Gateway
Amazon API Gateway is a web service that makes it easy for developers to create, publish, monitor, and secure APIs at any scale. With PHI now able to securely transit API Gateway, applications such as patient/provider directories, patient dashboards, medical device reports/telemetry, HL7 message processing and more can securely accept and deliver information to any number and type of applications running within AWS or client presentation layers.

One particular area we are excited to see how our customers leverage Amazon API Gateway is with the exchange of healthcare information. The Fast Healthcare Interoperability Resources (FHIR) specification will likely become the next-generation standard for how health information is shared between entities. With strong support for RESTful architectures, FHIR can be easily codified within an API on Amazon API Gateway. For more information on FHIR, our AWS Healthcare Competency partner, Datica, has an excellent primer.

AWS Direct Connect
Some of our healthcare and life sciences customers, such as Johnson & Johnson, leverage hybrid architectures and need to connect their on-premises infrastructure to the AWS Cloud. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.

In addition to a hybrid-architecture strategy, AWS Direct Connect can assist with the secure migration of data to AWS, which is the first step to using the wide array of our HIPAA-eligible services to store and process PHI, such as Amazon S3 and Amazon EMR. Additionally, you can connect to third-party/externally-hosted applications or partner-provided solutions as well as securely and reliably connect end users to those same healthcare applications, such as a cloud-based Electronic Medical Record system.

AWS Database Migration Service (DMS)
To date, customers have migrated over 20,000 databases to AWS through the AWS Database Migration Service. Customers often use DMS as part of their cloud migration strategy, and now it can be used to securely and easily migrate your core databases containing PHI to the AWS Cloud. As your source database remains fully operational during the migration with DMS, you minimize downtime for these business-critical applications as you migrate your databases to AWS. This service can now be utilized to securely transfer such items as patient directories, payment/transaction record databases, revenue management databases and more into AWS.

Amazon Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) is a message queueing service for reliably communicating among distributed software components and microservices at any scale. One way that we envision customers using SQS with PHI is to buffer requests between application components that pass HL7 or FHIR messages to other parts of their application. You can leverage features like SQS FIFO to ensure your messages containing PHI are delivered in the order they are received and remain available until a consumer processes and deletes them. This is important for applications with patient record updates or processing payment information in a hospital.

Let’s get building!
We are beyond excited to see how our customers will use our newly HIPAA-eligible services as part of their healthcare applications. What are you most excited for? Leave a comment below.

AWS Hot Startups – April 2017

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-hot-startups-april-2017/

Spring is here, the flowers are blooming and Tina Barr is back with more great startups for you to check out!

-Ana


Welcome back to another month of hot AWS-powered startups! Today we have three exciting startups:

  • Beekeeper – simplifying employee communication in the workplace.
  • Betterment – making investing easier for everyone.
  • ClearSlide – a leading sales engagement platform.

Be sure to check out our March hot startups in case you missed them.

Beekeeper (Zurich, Switzerland)
Flavio Pfaffhauser and Christian Grossmann, both graduates of ETH Zurich, were passionate about building a technology that would connect and bring people together. What started as a student’s social community soon turned into Beekeeper – a communication platform for the workplace that allows employees to interact wherever they are. As Flavio and Christian learned how to build a social platform that engaged people properly, businesses began requesting a platform that could be adapted to their specific processes and needs. The platform started with the concept of helping people feel as if they are sitting right next to each other, whether they’re at a desk or in the field. Founded in 2012, Beekeeper is focused on improving information sharing, communication and peer collaboration, and the company strongly believes that listening to employees is crucial for organizations.

The “Mobile First, Desktop Friendly” platform has a simple and intuitive interface that easily integrates multiple operating systems into one ecosystem. The interface can be styled and customized to match a company’s brand and identity. Employees can connect with their colleagues anytime and anywhere with private and group chats, video and file sharing, and feedback surveys. With Beekeeper’s analytical dashboard leadership teams can identify trending topics of discussion and track employee engagement and app usage in real-time. Beekeeper is currently connecting users in 137 countries across industries including hospitality, construction, transportation, and more.

Beekeeper likes using AWS because it allows their engineers to focus on the things that really matter: solving customer issues. The company builds its infrastructure using services like Amazon EC2, Amazon S3, and Amazon RDS, all of which allow the technical teams to offload administrative tasks. Amazon Elastic Transcoder and Amazon QuickSight are used to build analytical dashboards, and Amazon Redshift is used for data warehousing.

Check out the Beekeeper blog to keep up with their latest news!

Betterment (New York, NY)
Betterment is on a mission to make investing easier and more accessible for everyone, no matter their financial goal. In 2008, Jon Stein founded Betterment with the intent to reinvent the industry and save future investors from making the same common mistakes he had been making. At that time, most people only had a couple of options when it came to investing their money – either do it yourself or hire another person to do it for you. Unfortunately, financial advisors are sometimes paid to recommend certain investments even if it’s not what is best for their clients. Betterment only chooses investments that are in their customers’ best interest and align with their financial goals. Today, they are the largest independent online investment advisor, managing more than $8 billion in assets for over 240,000 customers.

Betterment uses technology to make investing easier and more efficient, while also helping to increase after-tax returns. They offer a wide range of financial planning services that are personalized to their customers’ life goals. To start an investment plan, customers can input their age, retirement status, and annual income, and Betterment will recommend how much money to invest and which type of account is the right choice. They will invest and manage it in a way that many traditional investment services can’t, at a lower cost.

The engineers at Betterment are constantly working to build industry-changing technology as quickly as possible to help customers maximize their money. AWS gives Betterment the flexibility to easily provision infrastructure and offload functions to various services that once required entire teams to manage. When they first started in the cloud, Betterment was using standard implementations of Amazon EC2, Amazon RDS, and Amazon S3. Since they’ve gone all in with AWS, they have been leveraging services like Amazon Redshift, AWS Lambda, AWS Database Migration Service, Amazon Kinesis, Amazon DynamoDB, and more. Today, they are using over 20 AWS services to develop, test, and deploy features and enhancements on a daily basis.

Learn more about Betterment here.

ClearSlide (San Francisco, CA)
ClearSlide is one of today’s leading sales engagement platforms, offering a complete and integrated tool that makes every customer interaction successful. Since their founding in 2009, ClearSlide has looked for ways to improve customer experiences and have developed numerous enablement tools for sales leaders and teams, marketing, customer support teams, and more. The platform puts content, communication channels, and insights at their customer’s fingertips to help drive better decisions and manage opportunities. ClearSlide serves thousands of companies including Comcast, the Sacramento Kings, The Economist, and so far their customers have generated over 750 million minutes of engagement!

ClearSlide offers a solution for all parts of the sales process. For sales leaders, ClearSlide provides engagement dashboards to improve deal visibility, coaching, and sales forecast accuracy. For marketing and sales enablement teams, they guide sellers to the right content, at the right time, in the right context, and provide insight to maximize content ROI. For sales reps, ClearSlide integrates communications, content, and analytics in a single platform experience. Communications can be made across email, in-person or online meetings, web, or social. Today, ClearSlide customers report a 10-20% increase in closed deals, 25% decrease in onboarding time for new reps, and a 50-80% reduction in selling costs.

ClearSlide uses a range of AWS services, but Amazon EC2 and Amazon RDS have made the biggest impact on their business. EC2 enables them to easily scale compute capacity, which is critical for a fast-growing startup. It also provides consistency during deployment – from development and integration to staging and production. RDS reduces overhead and allows ClearSlide to scale their database infrastructure. Since AWS takes care of time-consuming database management tasks, ClearSlide sees a reduction in operations costs and can focus on being more strategic with their customers.

Watch this video to learn how LiveIntent reduced sales cycles by 22% using ClearSlide. Get all the latest updates by following them on Twitter!

Thanks for checking out another month of awesome AWS-powered startups!

-Tina

 

Querying OpenStreetMap with Amazon Athena

Post Syndicated from Seth Fitzsimmons original https://aws.amazon.com/blogs/big-data/querying-openstreetmap-with-amazon-athena/

This is a guest post by Seth Fitzsimmons, member of the 2017 OpenStreetMap US board of directors. Seth works with clients including the Humanitarian OpenStreetMap Team, Mapzen, the American Red Cross, and World Bank to craft innovative geospatial solutions.

OpenStreetMap (OSM) is a free, editable map of the world, created and maintained by volunteers and available for use under an open license. Companies and non-profits like Mapbox, Foursquare, Mapzen, the World Bank, the American Red Cross and others use OSM to provide maps, directions, and geographic context to users around the world.

In the 12 years of OSM’s existence, editors have created and modified several billion features (physical things on the ground like roads or buildings). The main PostgreSQL database that powers the OSM editing interface is now over 2TB and includes historical data going back to 2007. As new users join the open mapping community, more and more valuable data is being added to OpenStreetMap, requiring increasingly powerful tools, interfaces, and approaches to explore its vastness.

This post explains how anyone can use Amazon Athena to quickly query publicly available OSM data stored in Amazon S3 (updated weekly) as an AWS Public Dataset. Imagine that you work for an NGO interested in improving knowledge of and access to health centers in Africa. You might want to know what’s already been mapped, to facilitate the production of maps of surrounding villages, and to determine where infrastructure investments are likely to be most effective.

Note: If you run all the queries in this post, you will be charged approximately $1 based on the number of bytes scanned. All queries used in this post can be found in this GitHub gist.

What is OpenStreetMap?

As an open content project, regular OSM data archives are made available to the public via planet.openstreetmap.org in a few different formats (XML, PBF). This includes both snapshots of the current state of data in OSM as well as historical archives.

Working with “the planet” (as the data archives are referred to) can be unwieldy. Because it contains data spanning the entire world, the size of a single archive is on the order of 50 GB. The format is bespoke and extremely specific to OSM. The data is incredibly rich, interesting, and useful, but the size, format, and tooling can often make it very difficult to even start the process of asking complex questions.

Heavy users of OSM data typically download the raw data and import it into their own systems, tailored for their individual use cases, such as map rendering, driving directions, or general analysis. Now that OSM data is available in the Apache ORC format on Amazon S3, it’s possible to query the data using Athena without even downloading it.

How does Athena help?

You can use Athena along with data made publicly available via OSM on AWS. You don’t have to learn how to install, configure, and populate your own server instances and go through multiple steps to download and transform the data into a queryable form. Thanks to AWS and partners, a regularly updated copy of the planet file (available within hours of OSM’s weekly publishing schedule) is hosted on S3 and made available in a format that lends itself to efficient querying using Athena.

Asking questions with Athena involves registering the OSM planet file as a table and making SQL queries. That’s it. Nothing to download, nothing to configure, nothing to ingest. Athena distributes your queries and returns answers within seconds, even while querying over 9 years and billions of OSM elements.

You’re in control. S3 provides high availability for the data and Athena charges you per TB of data scanned. Plus, we’ve gone through the trouble of keeping scanning charges as small as possible by transcoding OSM’s bespoke format as ORC. All the hard work of transforming the data into something highly queryable and making it publicly available is done; you just need to bring some questions.

Registering Tables

The OSM Public Datasets consist of three tables:

  • planet
    Contains the current versions of all elements present in OSM.
  • planet_history
    Contains a historical record of all versions of all elements (even those that have been deleted).
  • changesets
    Contains information about changesets in which elements were modified (and which have a foreign key relationship to both the planet and planet_history tables).

To register the OSM Public Datasets within your AWS account so you can query them, open the Athena console (make sure you are using the us-east-1 region) to paste and execute the following table definitions:

planet

CREATE EXTERNAL TABLE planet (
  id BIGINT,
  type STRING,
  tags MAP<STRING,STRING>,
  lat DECIMAL(9,7),
  lon DECIMAL(10,7),
  nds ARRAY<STRUCT<ref: BIGINT>>,
  members ARRAY<STRUCT<type: STRING, ref: BIGINT, role: STRING>>,
  changeset BIGINT,
  timestamp TIMESTAMP,
  uid BIGINT,
  user STRING,
  version BIGINT
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/planet/';

planet_history

CREATE EXTERNAL TABLE planet_history (
    id BIGINT,
    type STRING,
    tags MAP<STRING,STRING>,
    lat DECIMAL(9,7),
    lon DECIMAL(10,7),
    nds ARRAY<STRUCT<ref: BIGINT>>,
    members ARRAY<STRUCT<type: STRING, ref: BIGINT, role: STRING>>,
    changeset BIGINT,
    timestamp TIMESTAMP,
    uid BIGINT,
    user STRING,
    version BIGINT,
    visible BOOLEAN
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/planet-history/';

changesets

CREATE EXTERNAL TABLE changesets (
    id BIGINT,
    tags MAP<STRING,STRING>,
    created_at TIMESTAMP,
    open BOOLEAN,
    closed_at TIMESTAMP,
    comments_count BIGINT,
    min_lat DECIMAL(9,7),
    max_lat DECIMAL(9,7),
    min_lon DECIMAL(10,7),
    max_lon DECIMAL(10,7),
    num_changes BIGINT,
    uid BIGINT,
    user STRING
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/changesets/';

 

Under the Hood: Extract, Transform, Load

So, what happens behind the scenes to make this easier for you? In a nutshell, the data is transcoded from the OSM PBF format into Apache ORC.

There’s an AWS Lambda function (running every 15 minutes, triggered by CloudWatch Events) that watches planet.openstreetmap.org for the presence of weekly updates (using rsync). If that function detects that a new version has become available, it submits a set of AWS Batch jobs to mirror, transcode, and place it as the “latest” version. Code for this is available at osm-pds-pipelines GitHub repo.

To facilitate the transcoding into a format appropriate for Athena, we have produced an open source tool, OSM2ORC. The tool also includes an Osmosis plugin that can be used with complex filtering pipelines and outputs an ORC file that can be uploaded to S3 for use with Athena, or used locally with other tools from the Hadoop ecosystem.

What types of questions can OpenStreetMap answer?

There are many uses for OpenStreetMap data; here are three major ones and how they may be addressed using Athena.

Case Study 1: Finding Local Health Centers in West Africa

When the American Red Cross mapped more than 7,000 communities in West Africa in areas affected by the Ebola epidemic as part of the Missing Maps effort, they found themselves in a position where collecting a wide variety of data was both important and incredibly beneficial for others. Accurate maps play a critical role in understanding human communities, especially for populations at risk. The lack of detailed maps for West Africa posed a problem during the 2014 Ebola crisis, so collecting and producing data around the world has the potential to improve disaster responses in the future.

As part of the data collection, volunteers collected locations and information about local health centers, something that will facilitate treatment in future crises (and, more importantly, on a day-to-day basis). Combined with information about access to markets and clean drinking water and historical experiences with natural disasters, this data was used to create a vulnerability index to select communities for detailed mapping.

For this example, you find all health centers in West Africa (many of which were mapped as part of Missing Maps efforts). This is something that healthsites.io does for the public (worldwide and editable, based on OSM data), but you’re working with the raw data.

Here’s a query to fetch information about all health centers, tagged as nodes (points), in Guinea, Sierra Leone, and Liberia:

SELECT * from planet
WHERE type = 'node'
  AND tags['amenity'] IN ('hospital', 'clinic', 'doctors')
  AND lon BETWEEN -15.0863 AND -7.3651
  AND lat BETWEEN 4.3531 AND 12.6762;

Buildings, as “ways” (polygons, in this case) assembled from constituent nodes (points), can also be tagged as medical facilities. In order to find those, you need to reassemble geometries. Here you’re taking the average of all nodes that make up a building (which will be the approximate center point, which is close enough for this purpose). Here is a query that finds both buildings and points that are tagged as medical facilities:

-- select out nodes and relevant columns
WITH nodes AS (
  SELECT
    type,
    id,
    tags,
    lat,
    lon
  FROM planet
  WHERE type = 'node'
),
-- select out ways and relevant columns
ways AS (
  SELECT
    type,
    id,
    tags,
    nds
  FROM planet
  WHERE type = 'way'
    AND tags['amenity'] IN ('hospital', 'clinic', 'doctors')
),
-- filter nodes to only contain those present within a bounding box
nodes_in_bbox AS (
  SELECT *
  FROM nodes
  WHERE lon BETWEEN -15.0863 AND -7.3651
    AND lat BETWEEN 4.3531 AND 12.6762
)
-- find ways intersecting the bounding box
SELECT
  ways.type,
  ways.id,
  ways.tags,
  AVG(nodes.lat) lat,
  AVG(nodes.lon) lon
FROM ways
CROSS JOIN UNNEST(nds) AS t (nd)
JOIN nodes_in_bbox nodes ON nodes.id = nd.ref
GROUP BY (ways.type, ways.id, ways.tags)
UNION ALL
SELECT
  type,
  id,
  tags,
  lat,
  lon
FROM nodes_in_bbox
WHERE tags['amenity'] IN ('hospital', 'clinic', 'doctors');

You could go a step further and query for additional tags included with these (for example, opening_hours) and use that as a metric for measuring “completeness” of the dataset and to focus on additional data to collect (and locations to fill out).
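
As a rough sketch of such a completeness check (an illustration added here, not part of the original post), the following query counts how many of the point-based health centers in the same bounding box carry an opening_hours tag; element_at is used so that missing keys return NULL rather than an error:

SELECT count(*) AS health_centers,
       count_if(element_at(tags, 'opening_hours') IS NOT NULL) AS with_opening_hours
FROM planet
WHERE type = 'node'
  AND tags['amenity'] IN ('hospital', 'clinic', 'doctors')
  AND lon BETWEEN -15.0863 AND -7.3651
  AND lat BETWEEN 4.3531 AND 12.6762;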

Case Study 2: Generating statistics about mapathons

OSM has a history of holding mapping parties. Mapping parties are events where interested people get together and wander around outside, gathering and improving information about sites (and sights) that they pass. Another form of mapping party is the mapathon, which brings together armchair and desk mappers to focus on improving data in another part of the world.

Mapathons are a popular way of enlisting volunteers for Missing Maps, a collaboration between many NGOs, education institutions, and civil society groups that aims to map the most vulnerable places in the developing world to support international and local NGOs and individuals. One common way that volunteers participate is to trace buildings and roads from aerial imagery, providing baseline data that can later be verified by Missing Maps staff and volunteers working in the areas being mapped.

(Image and data from the American Red Cross)

Data collected during these events lends itself to a couple different types of questions. People like competition, so Missing Maps has developed a set of leaderboards that allow people to see where they stand relative to other mappers and how different groups compare. To facilitate this, hashtags (such as #missingmaps) are included in OSM changeset comments. To do similar ad hoc analysis, you need to query the list of changesets, filter by the presence of certain hashtags in the comments, and group things by username.

Now, find changes made during Missing Maps mapathons at George Mason University (using the #gmu hashtag):

SELECT *
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#gmu');

This includes all tags associated with a changeset, which typically include a mapper-provided comment about the changes made (often with additional hashtags corresponding to OSM Tasking Manager projects) as well as information about the editor used, imagery referenced, etc.
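
For example, here is a small sketch (added for illustration, not from the original post) that tallies which editors were used for the #gmu changesets, assuming the conventional created_by changeset tag is present:

SELECT element_at(tags, 'created_by') AS editor,
       count(*) AS changeset_count
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#gmu')
GROUP BY element_at(tags, 'created_by')
ORDER BY count(*) DESC;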

If you’re interested in the number of individual users who have mapped as part of the Missing Maps project, you can write a query such as:

SELECT COUNT(DISTINCT uid)
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#missingmaps');

25,610 people (as of this writing)!

Back at GMU, you’d like to know who the most prolific mappers are:

SELECT user, count(*) AS edits
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#gmu')
GROUP BY user
ORDER BY count(*) DESC;

Nice job, BrokenString!

It’s also interesting to see what types of features were added or changed. You can do that by using a JOIN between the changesets and planet tables:

SELECT planet.*, changesets.tags
FROM planet
JOIN changesets ON planet.changeset = changesets.id
WHERE regexp_like(changesets.tags['comment'], '(?i)#gmu');


Using this as a starting point, you could break down the types of features, highlight popular parts of the world, or do something entirely different.
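
As one illustration of the first idea (a sketch added here, not from the original walkthrough), you could break the #gmu edits down by OSM element type:

SELECT planet.type,
       count(*) AS features
FROM planet
JOIN changesets ON planet.changeset = changesets.id
WHERE regexp_like(changesets.tags['comment'], '(?i)#gmu')
GROUP BY planet.type
ORDER BY count(*) DESC;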

Case Study 3: Building Condition

Once building outlines have been produced by mappers around (and across) the world, local Missing Maps volunteers (often from local Red Cross / Red Crescent societies) go around with Android phones running OpenDataKit and OpenMapKit to verify that the buildings in question actually exist and to add additional information about them, such as the number of stories, use (residential, commercial, etc.), material, and condition.

This data can be used in many ways: it can provide local geographic context (by being included in map source data) as well as facilitate investment by development agencies such as the World Bank.

Here is a collection of buildings mapped in Dhaka, Bangladesh:

(Map and data © OpenStreetMap contributors)

For NGO staff to determine resource allocation, it can be helpful to enumerate and show buildings in varying conditions. Building conditions within an area can be a means of understanding where to focus future investments.

Querying for buildings is a bit more complicated than working with points or changesets. Of the three core OSM element types (node, way, and relation), only nodes (points) carry geographic coordinates. Ways (lines or polygons) are ordered lists of references to nodes and take their geometry from those nodes' coordinates. This means that ways must be reconstituted, by joining them back to the nodes they reference, before they can be queried effectively by bounding box.

This results in a fairly complex query. You’ll notice that this is similar to the query used to find buildings tagged as medical facilities above. Here you’re counting buildings in Dhaka according to building condition:

-- select out nodes and relevant columns
WITH nodes AS (
  SELECT
    id,
    tags,
    lat,
    lon
  FROM planet
  WHERE type = 'node'
),
-- select out ways and relevant columns
ways AS (
  SELECT
    id,
    tags,
    nds
  FROM planet
  WHERE type = 'way'
),
-- filter nodes to only contain those present within a bounding box
nodes_in_bbox AS (
  SELECT *
  FROM nodes
  WHERE lon BETWEEN 90.3907 AND 90.4235
    AND lat BETWEEN 23.6948 AND 23.7248
),
-- fetch and expand referenced ways
referenced_ways AS (
  SELECT
    ways.*,
    t.*
  FROM ways
  CROSS JOIN UNNEST(nds) WITH ORDINALITY AS t (nd, idx)
  JOIN nodes_in_bbox nodes ON nodes.id = nd.ref
),
-- fetch *all* referenced nodes (even those outside the queried bounding box)
exploded_ways AS (
  SELECT
    ways.id,
    ways.tags,
    idx,
    nd.ref,
    nodes.id node_id,
    ARRAY[nodes.lat, nodes.lon] coordinates
  FROM referenced_ways ways
  JOIN nodes ON nodes.id = nd.ref
  ORDER BY ways.id, idx
)
-- count ways (buildings) matching the bounding box by condition;
-- exploded_ways has one row per way vertex, so count each way id once
SELECT
  count(DISTINCT id) AS buildings,
  tags['building:condition']
FROM exploded_ways
GROUP BY tags['building:condition']
ORDER BY count(DISTINCT id) DESC;


Most buildings are unsurveyed (125,000 is a lot!), but of those that have been, most are average (as you’d expect). If you were to further group these buildings geographically, you’d have a starting point to determine which areas of Dhaka might benefit the most.
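A minimal sketch of that geographic grouping (reusing the CTEs above; the ~0.005° cell size and the simple average-of-vertices centroid are assumptions made for illustration) could replace the final SELECT with:

-- approximate a centroid per building, then bucket into ~0.005° (~500 m) cells
SELECT
  floor(lat / 0.005) * 0.005 AS cell_lat,
  floor(lon / 0.005) * 0.005 AS cell_lon,
  tags['building:condition'] AS condition,
  count(*) AS buildings
FROM (
  SELECT
    id,
    arbitrary(tags) AS tags,
    AVG(coordinates[1]) AS lat, -- coordinates is ARRAY[lat, lon]
    AVG(coordinates[2]) AS lon
  FROM exploded_ways
  GROUP BY id
) AS building_centroids
GROUP BY 1, 2, 3
ORDER BY count(*) DESC;

Cells with many buildings in worse conditions would be natural candidates for a closer look.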

Conclusion

OSM data, while incredibly rich and valuable, can be difficult to work with, due to both its size and its data model. In addition to the time spent downloading large files to work with locally, time is spent installing and configuring tools and converting the data into more queryable formats. We think Amazon Athena, combined with the ORC version of the planet file (updated on a weekly basis), is an extremely powerful and cost-effective combination. It lets anyone start querying billions of records with simple SQL in no time, so you can focus on the analysis rather than the infrastructure.

To download the data and experiment with it using other tools, the latest OSM ORC-formatted files are available via OSM on AWS at s3://osm-pds/planet/planet-latest.orc, s3://osm-pds/planet-history/history-latest.orc, and s3://osm-pds/changesets/changesets-latest.orc.

We look forward to hearing what you find out!

AWS Hot Startups – March 2017

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-hot-startups-march-2017/

As the madness of March winds down, take a break from all the basketball and check out the cool startups Tina Barr brings you for this month!

-Ana


The arrival of spring brings five new startups this month:

  • Amino Apps – providing social networks for hundreds of thousands of communities.
  • Appboy – empowering brands to strengthen customer relationships.
  • Arterys – revolutionizing the medical imaging industry.
  • Protenus – protecting patient data for healthcare organizations.
  • Syapse – improving targeted cancer care with shared data from across the country.

In case you missed them, check out February’s hot startups here.

Amino Apps (New York, NY)
Amino Logo
Amino Apps was founded on the belief that interest-based communities were underdeveloped and outdated, particularly when it came to mobile. CEO Ben Anderson and CTO Yin Wang created the app to give users access to hundreds of thousands of communities, each of them a complete social network dedicated to a single topic. Some of the largest communities have over 1 million members and are built around topics like popular TV shows, video games, sports, and an endless number of hobbies and other interests. Amino hosts communities from around the world and is currently available in six languages with many more on the way.

Navigating the Amino app is easy. Simply download the app (iOS or Android), sign up with a valid email address, choose a profile picture, and start exploring. Users can search for communities and join any that fit their interests. Each community has chatrooms, multimedia content, quizzes, and a seamless commenting system. If a community doesn’t exist yet, users can create it in minutes using the Amino Creator and Manager app (ACM). The largest user-generated communities are turned into their own apps, which gives communities their own piece of real estate on members’ phones, as well as in app stores.

Amino’s vast global network of hundreds of thousands of communities is run on AWS services. Every day users generate, share, and engage with an enormous amount of content across hundreds of mobile applications. By leveraging AWS services including Amazon EC2, Amazon RDS, Amazon S3, Amazon SQS, and Amazon CloudFront, Amino can continue to provide new features to their users while scaling their service capacity to keep up with user growth.

Interested in joining Amino? Check out their jobs page here.

Appboy (New York, NY)
In 2011, Bill Magnuson, Jon Hyman, and Mark Ghermezian saw a unique opportunity to strengthen and humanize relationships between brands and their customers through technology. The trio created Appboy to empower brands to build long-term relationships with their customers and today they are the leading lifecycle engagement platform for marketing, growth, and engagement teams. The team recognized that as rapid mobile growth became undeniable, many brands were becoming frustrated with the lack of compelling and seamless cross-channel experiences offered by existing marketing clouds. Many of today’s top mobile apps and enterprise companies trust Appboy to take their marketing to the next level. Appboy manages user profiles for nearly 700 million monthly active users, and is used to power more than 10 billion personalized messages monthly across a multitude of channels and devices.

Appboy creates a holistic user profile that offers a single view of each customer. That user profile in turn powers contextual cross-channel messaging, lifecycle engagement automation, and robust campaign insights and optimization opportunities. Appboy offers solutions that allow brands to create push notifications, targeted emails, in-app and in-browser messages, news feed cards, and webhooks to enhance the user experience and increase customer engagement. The company prides itself on its interoperability, connecting to a variety of complementary marketing tools and technologies so brands can build the perfect stack to enable their strategies and experiments in real time.

AWS makes it easy for Appboy to dynamically size all of their service components and automatically scale up and down as needed. They use an array of services including Elastic Load Balancing, AWS Lambda, Amazon CloudWatch, Auto Scaling groups, and Amazon S3 to help scale capacity and better deal with unpredictable customer loads.

To keep up with the latest marketing trends and tactics, visit the Appboy digital magazine, Relate. Appboy was also recently featured in the #StartupsOnAir video series where they gave insight into their AWS usage.

Arterys (San Francisco, CA)
Getting test results back from a physician can often be a time consuming and tedious process. Clinicians typically employ a variety of techniques to manually measure medical images and then make their assessments. Arterys founders Fabien Beckers, John Axerio-Cilies, Albert Hsiao, and Shreyas Vasanawala realized that much more computation and advanced analytics were needed to harness all of the valuable information in medical images, especially those generated by MRI and CT scanners. Clinicians were often skipping measurements and making assessments based mostly on qualitative data. Their solution was to start a cloud/AI software company focused on accelerating data-driven medicine with advanced software products for post-processing of medical images.

Arterys’ products provide timely, accurate, and consistent quantification of images, improve speed to results, and improve the quality of the information offered to the treating physician. This allows for much better tracking of a patient’s condition, and thus better decisions about their care. Advanced analytics, such as deep learning and distributed cloud computing, are used to process images. The first Arterys product can contour cardiac anatomy as accurately as experts, but takes only 15-20 seconds instead of the 45-60 minutes required to do it manually. Their computing cloud platform is also fully HIPAA compliant.

Arterys relies on a variety of AWS services to process their medical images. Using deep learning and other advanced analytic tools, Arterys is able to render images without latency over a web browser using AWS G2 instances. They use Amazon EC2 extensively for all of their compute needs, including inference and rendering, and Amazon S3 is used to archive images that aren’t needed immediately, as well as manage costs. Arterys also employs Amazon Route 53, AWS CloudTrail, and Amazon EC2 Container Service.

Check out this quick video about the technology that Arterys is creating. They were also recently featured in the #StartupsOnAir video series and offered a quick demo of their product.

Protenus (Baltimore, MD)
Protenus Logo
Protenus founders Nick Culbertson and Robert Lord were medical students at Johns Hopkins Medical School when they saw first-hand how Electronic Health Record (EHR) systems could be used to improve patient care and share clinical data more efficiently. With increased efficiency came a huge issue – an onslaught of serious security and privacy concerns. Over the past two years, 140 million medical records have been breached, meaning that approximately 1 in 3 Americans have had their health data compromised. Health records contain a repository of sensitive information and a breach of that data can cause major havoc in a patient’s life – namely identity theft, prescription fraud, Medicare/Medicaid fraud, and improper performance of medical procedures. Using their experience and knowledge from former careers in the intelligence community and involvement in a leading hedge fund, Nick and Robert developed the prototype and algorithms that launched Protenus.

Today, Protenus offers a number of solutions that detect breaches and misuse of patient data for healthcare organizations nationwide. Using advanced analytics and AI, Protenus’ health data insights platform understands appropriate vs. inappropriate use of patient data in the EHR. It also protects privacy, aids compliance with HIPAA regulations, and ensures trust for patients and providers alike.

Protenus built and operates its SaaS offering atop Amazon EC2, where Dedicated Hosts and encrypted Amazon EBS volumes are used to ensure compliance with HIPAA regulations for the storage of Protected Health Information. They use Elastic Load Balancing and Amazon Route 53 for DNS, enabling unique, secure, client-specific access points to their Protenus instance.

To learn more about threats to patient data, read Hospitals’ Biggest Threat to Patient Data is Hiding in Plain Sight on the Protenus blog. Also be sure to check out their recent video in the #StartupsOnAir series for more insight into their product.

Syapse (Palo Alto, CA)
Syapse provides a comprehensive software solution that enables clinicians to treat patients with precision medicine for targeted cancer therapies — treatments that are designed and chosen using genetic or molecular profiling. Existing hospital IT doesn’t support the robust infrastructure and clinical workflows required to treat patients with precision medicine at scale, but Syapse centralizes and organizes patient data and delivers it to clinicians at the point of care. Syapse offers a variety of solutions for oncologists that allow them to access the full scope of patient data longitudinally, view recommended treatments or clinical trials for similar patients, and track outcomes over time. These solutions are helping health systems across the country to improve patient outcomes by offering the most innovative care to cancer patients.

Leading health systems such as Stanford Health Care, Providence St. Joseph Health, and Intermountain Healthcare are using Syapse to improve patient outcomes, streamline clinical workflows, and scale their precision medicine programs. A group of experts known as the Molecular Tumor Board (MTB) reviews complex cases and evaluates patient data, documents notes, and disseminates treatment recommendations to the treating physician. Syapse also provides reports that give health system staff insight into their institution’s oncology care, which can be used toward quality improvement, business goals, and understanding variables in the oncology service line.

Syapse uses Amazon Virtual Private Cloud, Amazon EC2 Dedicated Instances, and Amazon Elastic Block Store to build a high-performance, scalable, and HIPAA-compliant data platform that enables health systems to make precision medicine part of routine cancer care for patients throughout the country.

Be sure to check out the Syapse blog to learn more and also their recent video on the #StartupsOnAir video series where they discuss their product, HIPAA compliance, and more about how they are using AWS.

Thank you for checking out another month of awesome hot startups!

-Tina Barr

 

AWS Marketplace Adds Healthcare & Life Sciences Category

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-marketplace-adds-healthcare-life-sciences-category/

Wilson To and Luis Daniel Soto are our guest bloggers today, telling you about a new industry vertical category that is being added to the AWS Marketplace. Check it out!

-Ana


AWS Marketplace is a managed and curated software catalog that helps customers innovate faster and reduce costs by making it easy to discover, evaluate, procure, immediately deploy, and manage third-party software solutions. To continue supporting our customers, we’re now adding a new industry vertical category: Healthcare & Life Sciences.


This new category brings together best-of-breed software tools and solutions from our growing vendor ecosystem that have been adapted to, or built from the ground up, to serve the healthcare and life sciences industry.

Healthcare
Within the AWS Marketplace HCLS category, you can find solutions for clinical information systems, population health and analytics, and health administration and compliance services. Some offerings include:

  1. Allgress GetCompliant HIPAA Edition – Reduce the cost of compliance management and adherence by providing compliance professionals improved efficiency by automating the management of their compliance processes around HIPAA.
  2. ZH Healthcare BlueEHS – Deploy a customizable, ONC-certified EHR that empowers doctors to define their clinical workflows and treatment plans to enhance patient outcomes.
  3. Dicom Systems DCMSYS CloudVNA – DCMSYS Vendor Neutral Archive offers a cost-effective means of consolidating disparate imaging systems into a single repository, while providing enterprise-wide access and archiving of all medical images and other medical records.

Life Sciences

  1. National Instruments LabVIEW – Graphical system design software that provides scientists and engineers with the tools needed to create and deploy measurement and control systems through simple yet powerful networks.
  2. NCBI Blast – Analysis tools and datasets that allow users to perform flexible sequence similarity searches.
  3. Acellera AceCloud – Innovative tools and technologies for the study of biophysical phenomena. Acellera leverages the power of AWS Cloud to enable molecular dynamics simulations.

Healthcare and life sciences companies deal with huge amounts of data, and many of their data sets are some of the most complex in the world. From physicians and nurses to researchers and analysts, these users are typically hampered by their current systems. Their legacy software does not let them efficiently store or effectively use the immense amounts of data they work with. And protracted and complex software purchasing cycles keep them from innovating at speed to stay ahead of market and industry trends. Data analytics and business intelligence solutions in AWS Marketplace offer specialized support for these industries, including:

  • Tableau Server – Enable teams to visualize across costs, needs, and outcomes at once to make the most of resources. The solution helps hospitals identify the impact of evidence-based medicine, wellness programs, and patient engagement.
  • TIBCO Spotfire and JasperSoft – TIBCO provides technical teams powerful data visualization, data analytics, and predictive analytics for Amazon Redshift, Amazon RDS, and popular database sources via AWS Marketplace.
  • Qlik Sense Enterprise – Qlik enables healthcare organizations to explore clinical, financial, and operational data through visual analytics to discover insights that lead to improvements in care, reduced costs, and higher value delivered to patients.

With more than 5,000 listings across more than 35 categories, AWS Marketplace simplifies software licensing and procurement by enabling customers to accept user agreements, choose pricing options, and automate the deployment of software and associated AWS resources with just a few clicks. AWS Marketplace also simplifies billing for customers by delivering a single invoice detailing business software and AWS resource usage on a monthly basis.

With AWS Marketplace, we can help drive operational efficiencies and reduce costs in these ways:

  • Easily bring in new solutions to solve increasingly complex issues and gain quick insight into the huge amounts of data users handle.
  • Healthcare data will be more actionable. We offer pay-as-you-go solutions that make it considerably easier and more cost-effective to ingest, store, analyze, and disseminate data.
  • Evaluate healthcare and life sciences software and deploy it in minutes with 1-Click ease. Users can now speed up their historically slow software procurement and implementation cycles.
  • Pay only for what’s consumed — and manage software costs on your AWS bill.
  • In addition to the already secure AWS Cloud, AWS Marketplace offers industry-leading solutions to help you secure operating systems, platforms, applications and data that can integrate with existing controls in your AWS Cloud and hybrid environment.

Click here to see the current list of vendors in our new Healthcare & Life Sciences category.

Come on In
If you are a healthcare ISV and would like to list and sell your products on AWS, visit our Sell in AWS Marketplace page.

– Wilson To and Luis Daniel Soto