Tracking People by their MAC Addresses

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/09/tracking-people-by-their-mac-addresses.html

Yet another article on the privacy risks of static MAC addresses and always-on Bluetooth connections. This one is about wireless headphones.

The good news is that product vendors are fixing this:

Several of the headphones which could be tracked over time are for sale in electronics stores, but according to two of the manufacturers NRK have spoken to, these models are being phased out.

“The products in your line-up, Elite Active 65t, Elite 65e and Evolve 75e, will be going out of production before long and newer versions have already been launched with randomized MAC addresses. We have a lot of focus on privacy by design and we continuously work with the available security measures on the market,” head of PR at Jabra, Claus Fonnesbech says.

“To run Bluetooth Classic we, and all other vendors, are required to have static addresses and you will find that in older products,” Fonnesbech says.

Jens Bjørnkjær Gamborg, head of communications at Bang & Olufsen, says that “this is products that were launched several years ago.”

“All products launched after 2019 randomize their MAC-addresses on a frequent basis as it has become the market standard to do so,” Gamborg says.

EDITED TO ADD (9/13): It’s not enough to randomly change MAC addresses. Any other plaintext identifiers need to be changed at the same time.

Zero-Click iPhone Exploits

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/09/zero-click-iphone-exploits.html

Citizen Lab is reporting on two zero-click iMessage exploits, in spyware sold by the cyberweapons arms manufacturer NSO Group to the Bahraini government.

These are particularly scary exploits, since they don't require the victim to do anything, like clicking on a link or opening a file. The victim receives a text message, and then they are hacked.

More on this here.

Building well-architected serverless applications: Optimizing application performance – part 3

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-optimizing-application-performance-part-3/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

PERF 1. Optimizing your serverless application’s performance

This post continues part 2 of this performance question. Previously, I looked at designing your function to take advantage of concurrency via asynchronous and stream-based invocations, and covered measuring, evaluating, and selecting optimal capacity units.

Best practice: Integrate with managed services directly over functions when possible

Consider using native integrations between managed services, as opposed to AWS Lambda functions, when no custom logic or data transformation is required. This can improve performance, requires fewer resources to manage, and increases security. There are also a number of AWS application integration services that enable communication between decoupled components in microservices architectures.

Use native cloud services integration

When using Amazon API Gateway APIs, you can use the AWS integration type to connect to other AWS services natively. With this integration type, API Gateway uses Apache Velocity Template Language (VTL) and HTTPS to directly integrate with other AWS services.

Timeouts and errors must be managed by the API consumer. For more information on using VTL, see “Amazon API Gateway Apache Velocity Template Reference”. For an example application that uses API Gateway to read and write directly to/from Amazon DynamoDB, see “Building a serverless URL shortener app without AWS Lambda”.

API Gateway direct service integration

There is also a tutorial available, Build an API Gateway REST API with AWS integration.
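
To make this more concrete, here is a minimal boto3 sketch of wiring an API Gateway method directly to DynamoDB with a VTL request mapping template. This example is illustrative and not from the original post; the API ID, resource ID, table name, and IAM role ARN are placeholders.

```python
# Hypothetical sketch: an API Gateway AWS integration that writes directly to
# DynamoDB (no Lambda), using a VTL request mapping template.
import boto3

apigw = boto3.client("apigateway")

# VTL template mapping the incoming JSON body to a DynamoDB PutItem request
request_template = """
{
    "TableName": "ShortUrls",
    "Item": {
        "id": {"S": "$input.path('$.id')"},
        "target": {"S": "$input.path('$.target')"}
    }
}
"""

apigw.put_integration(
    restApiId="a1b2c3d4e5",                    # placeholder API ID
    resourceId="abc123",                       # placeholder resource ID
    httpMethod="POST",
    type="AWS",                                # direct AWS service integration
    integrationHttpMethod="POST",
    uri="arn:aws:apigateway:us-east-1:dynamodb:action/PutItem",
    credentials="arn:aws:iam::123456789012:role/ApiGatewayDynamoDBRole",  # placeholder
    requestTemplates={"application/json": request_template},
)

# A matching integration response (with its own VTL template if needed) would
# be added with put_integration_response before deploying the stage.
```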

When using AWS AppSync, you can use VTL, direct integration with Amazon Aurora, Amazon Elasticsearch Service, and any publicly available HTTP endpoint. AWS AppSync can use multiple integration types and can maximize throughput at the data field level. For example, you can run full-text searches on the orderDescription field against Elasticsearch while fetching the remaining data from DynamoDB. For more information, see the AWS AppSync resolver tutorials.

In the serverless airline example used in this series, the catalog service uses AWS AppSync to provide a GraphQL API for searching flights. AWS AppSync uses DynamoDB as a database, and all compute logic is contained in the Apache Velocity Template (VTL).

Serverless airline catalog service using VTL

AWS Step Functions integrates with multiple AWS services using service Integrations. For example, this allows you to fetch and put data into DynamoDB, or run an AWS Batch job. You can also publish messages to Amazon Simple Notification Service (SNS) topics, and send messages to Amazon Simple Queue Service (SQS) queues. For more details on the available integrations, see “Using AWS Step Functions with other services”.
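
As an illustration (not from the original application), the following sketch shows a state machine that uses the DynamoDB putItem service integration to save a record without any Lambda function. The state machine name, table, attributes, and role ARN are placeholders.

```python
# Hypothetical sketch: a minimal Step Functions state machine that writes an
# order record directly to DynamoDB via a service integration.
import json
import boto3

definition = {
    "StartAt": "SaveOrder",
    "States": {
        "SaveOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "Orders",
                "Item": {
                    "orderId": {"S.$": "$.orderId"},   # value taken from the input
                    "status": {"S": "NEW"}
                }
            },
            "End": True
        }
    }
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="save-order-example",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsDynamoDBRole",  # placeholder
)
```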

Using Amazon EventBridge, you can connect your applications with data from a variety of sources. EventBridge connects natively to various AWS services and can act as an event bus across multiple AWS accounts to ease integration. You can also use the API destinations feature to route events to services outside of AWS. EventBridge handles the authentication, retries, and throughput. For more details on available EventBridge targets, see the documentation.

Amazon EventBridge
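
For illustration only (these names, the event pattern, and the queue ARN are placeholders, not from the original post), a rule that routes matching events to an SQS target might look like this:

```python
# Hypothetical sketch: an EventBridge rule routing OrderCreated events to SQS,
# plus a producer publishing an event. EventBridge handles retries and delivery.
import json
import boto3

events = boto3.client("events")

events.put_rule(
    Name="order-created-rule",
    EventPattern=json.dumps({
        "source": ["myapp.orders"],
        "detail-type": ["OrderCreated"]
    }),
    State="ENABLED",
)

events.put_targets(
    Rule="order-created-rule",
    Targets=[{
        "Id": "order-queue",
        "Arn": "arn:aws:sqs:us-east-1:123456789012:order-events",  # placeholder
    }],
)

# Producers publish events; consumers receive them from the SQS queue target.
events.put_events(Entries=[{
    "Source": "myapp.orders",
    "DetailType": "OrderCreated",
    "Detail": json.dumps({"orderId": "1234"}),
}])
```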

Good practice: Optimize access patterns and apply caching where applicable

Consider caching when clients may not require up-to-date data. Optimize access patterns to fetch only the data that end users need. This improves the overall responsiveness of your workload and makes more efficient use of compute and data resources across components.

Implement caching for suitable access patterns

For REST APIs, you can use API Gateway caching to reduce the number of calls made to your endpoint and also improve the latency of requests to your API. When you enable caching for a stage or method, API Gateway caches responses for a specified time-to-live (TTL) period. API Gateway then responds to the request by looking up the endpoint response from the cache, instead of making a request to your endpoint.

API Gateway caching

For more information, see “Enabling API caching to enhance responsiveness”.
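
As a minimal sketch (illustrative only; the API ID, cache size, and TTL are placeholders), enabling the stage cache for an existing REST API might look like the following:

```python
# Hypothetical sketch: enable the stage cache and set a 300-second TTL for all
# resources and methods on the "prod" stage of an existing REST API.
import boto3

apigw = boto3.client("apigateway")

apigw.update_stage(
    restApiId="a1b2c3d4e5",          # placeholder API ID
    stageName="prod",
    patchOperations=[
        {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
        {"op": "replace", "path": "/cacheClusterSize", "value": "0.5"},       # GB
        {"op": "replace", "path": "/*/*/caching/enabled", "value": "true"},
        {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": "300"},
    ],
)
```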

For geographically distributed clients, Amazon CloudFront or a third-party CDN can cache results at the edge, further reducing network round-trip latency.

For GraphQL APIs, AWS AppSync provides built-in server-side caching at the API level. This reduces the need to access data sources directly by making data available in a high-speed in-memory cache. This improves performance and decreases latency. For queries with common arguments or a restricted set of arguments, you can also enable caching at the resolver level to improve overall responsiveness. For more information, see “Improving GraphQL API performance and consistency with AWS AppSync Caching”.

When using databases, cache results and only connect to and fetch data when needed. This reduces the load on the downstream database and improves performance. Include a caching expiration mechanism to prevent serving stale records. For more information on caching implementation patterns and considerations, see “Caching Best Practices”.

For DynamoDB, you can enable caching with Amazon DynamoDB Accelerator (DAX). DAX enables you to benefit from fast in-memory read performance in microseconds, rather than milliseconds. DAX is suitable for use cases that may not require strongly consistent reads. Some examples include real-time bidding, social gaming, and trading applications. For more information, read “Use cases for DAX“.
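
For illustration, here is a minimal sketch assuming the amazon-dax-client Python package; the cluster endpoint, table, and key are placeholders and not from the original post.

```python
# Hypothetical sketch: reads go through the DAX cluster endpoint instead of
# DynamoDB directly, so repeated reads of the same item are served from the
# in-memory item cache when warm.
from amazondax import AmazonDaxClient

dax = AmazonDaxClient.resource(
    endpoint_url="daxs://my-cluster.abc123.dax-clusters.us-east-1.amazonaws.com"  # placeholder
)
table = dax.Table("GameScores")  # placeholder table name

# Same Table API as boto3's DynamoDB resource; eventually consistent reads
# can be returned from the DAX cache in microseconds.
response = table.get_item(Key={"playerId": "p-1234"})
print(response.get("Item"))
```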

For general caching purposes, Amazon ElastiCache provides a distributed in-memory data store or cache environment. ElastiCache supports a variety of caching patterns through key-value stores using the Redis and Memcached engines. Define what is safe to cache, even when using popular caching patterns like lazy caching or write-through. Set a TTL and eviction policy that fits your baseline performance and access patterns. This ensures that you don't serve stale records or cache data that requires a strongly consistent read. For more information on ElastiCache caching and time-to-live strategies, see the documentation.
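
To make the lazy (cache-aside) pattern concrete, here is a small sketch using the redis-py client against an ElastiCache for Redis endpoint. The endpoint, key scheme, TTL, and fetch_product_from_db helper are illustrative placeholders, not part of the original post.

```python
# Hypothetical sketch of lazy (cache-aside) caching with a TTL so stale
# records are not served indefinitely.
import json
import redis

cache = redis.Redis(host="my-cache.abc123.0001.use1.cache.amazonaws.com", port=6379)  # placeholder
TTL_SECONDS = 300


def fetch_product_from_db(product_id):
    # Placeholder for the real (slower) database lookup
    return {"id": product_id, "name": "example"}


def get_product(product_id):
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                    # cache hit
    product = fetch_product_from_db(product_id)      # cache miss: go to the database
    cache.set(key, json.dumps(product), ex=TTL_SECONDS)
    return product
```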

For additional serverless caching suggestions, see the AWS Serverless Hero blog post “All you need to know about caching for serverless applications”.

Reduce overfetching and underfetching

Overfetching is when a client downloads too much data from a database or endpoint, resulting in data in the response that it doesn't use. Underfetching is when the response does not contain enough data, so the client must make additional requests. Both can affect performance.

To fetch a collection of items from a DynamoDB table, you can perform a query or a scan. A scan operation always scans the entire table or secondary index. It then filters out values to provide the result you want, essentially adding the extra step of removing data from the result set. A query operation finds items directly based on primary key values.

For faster response times, design your tables and indexes so that your applications can use query instead of scan. Use Global Secondary Indexes (GSIs) in addition to composite sort keys to help you query hierarchical relationships in your data. For more information, see "Best Practices for Querying and Scanning Data".
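
The following sketch contrasts the two operations with boto3; the table, index, and attribute names are hypothetical and only illustrate the difference, they are not from the original post.

```python
# Hypothetical sketch: the same lookup done as a scan (reads the whole table,
# then filters) versus a query against a GSI keyed on customerId.
import boto3
from boto3.dynamodb.conditions import Key, Attr

table = boto3.resource("dynamodb").Table("Orders")  # placeholder table

# Scan: reads every item, then discards the ones that don't match the filter
scanned = table.scan(FilterExpression=Attr("customerId").eq("c-1001"))

# Query: goes straight to the matching items via the index's partition key
queried = table.query(
    IndexName="customerId-orderDate-index",          # placeholder GSI
    KeyConditionExpression=Key("customerId").eq("c-1001")
    & Key("orderDate").begins_with("2021-08"),
)
print(len(scanned["Items"]), len(queried["Items"]))
```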

Consider GraphQL and AWS AppSync for interactive web applications, mobile, real-time, or for use cases where data drives the user interface. AWS AppSync provides data fetching flexibility, which allows your client to query only for the data it needs, in the format it needs it. Ensure you do not make too many nested queries where a long response may result in timeouts. GraphQL helps you adapt access patterns as your workload evolves. This makes it more flexible as it allows you to move to purpose-built databases if necessary.
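
As an illustration of fetching only the required fields, here is a sketch of a GraphQL request to an AppSync endpoint made with the requests library. The endpoint URL, API key, and schema (listFlights and its fields) are placeholders, not the actual serverless airline schema.

```python
# Hypothetical sketch: request only the fields the UI renders, avoiding
# overfetching. IAM or Cognito auth could be used instead of an API key.
import requests

APPSYNC_URL = "https://example123.appsync-api.us-east-1.amazonaws.com/graphql"  # placeholder
API_KEY = "da2-exampleapikey"                                                   # placeholder

query = """
query ListFlights($from: String!, $to: String!) {
  listFlights(departureAirport: $from, arrivalAirport: $to) {
    id
    departureTime
    price
  }
}
"""

response = requests.post(
    APPSYNC_URL,
    json={"query": query, "variables": {"from": "LGW", "to": "MAD"}},
    headers={"x-api-key": API_KEY},
)
print(response.json())
```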

Compress payload and data storage

Some AWS services allow you to compress the payload or compress data storage. This can improve performance by sending and receiving less data, and can save on data storage, which can also reduce costs.

If your content supports deflate, gzip, or identity content encoding, API Gateway allows your clients to call your API with compressed payloads. By default, API Gateway supports decompression of the method request payload. However, you must configure your API to enable compression of the method response payload. Compression in API Gateway and decompression in the client might increase overall latency and require more computing time. Run test cases against your API to determine an optimal value. For more information, see "Enabling payload compression for an API".
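
For illustration (the API ID and threshold are placeholders), enabling response compression on an existing REST API might look like this:

```python
# Hypothetical sketch: setting minimumCompressionSize enables compression for
# the API; responses larger than 1 KB are compressed when the client sends an
# appropriate Accept-Encoding header.
import boto3

apigw = boto3.client("apigateway")
apigw.update_rest_api(
    restApiId="a1b2c3d4e5",   # placeholder API ID
    patchOperations=[
        {"op": "replace", "path": "/minimumCompressionSize", "value": "1024"},
    ],
)
```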

Amazon Kinesis Data Firehose supports compressing streaming data using gzip, snappy, or zip. This minimizes the amount of storage used at the destination. The Amazon Kinesis Data Firehose FAQs has more information on compression. Kinesis Data Firehose also supports converting your streaming data from JSON to Apache Parquet or Apache ORC before storing the data in Amazon S3. Parquet and ORC are columnar data formats that save space and enable faster queries compared to row-oriented formats like JSON.
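
As a short sketch (names, ARNs, and buffering values are placeholders, not from the original post), a delivery stream that GZIP-compresses records before landing them in S3 could be created like this:

```python
# Hypothetical sketch: a Firehose delivery stream that compresses records with
# GZIP before writing to S3. Conversion to Parquet/ORC would be configured via
# DataFormatConversionConfiguration.
import boto3

firehose = boto3.client("firehose")
firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-to-s3",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",  # placeholder
        "BucketARN": "arn:aws:s3:::example-analytics-bucket",              # placeholder
        "CompressionFormat": "GZIP",
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
    },
)
```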

Conclusion

Evaluate and optimize your serverless application’s performance based on access patterns, scaling mechanisms, and native integrations. You can improve your overall experience and make more efficient use of the platform in terms of both value and resources.

In part 1, I cover measuring and optimizing function startup time. I explain cold and warm starts and how to reuse the Lambda execution environment to improve performance. I also explain how importing only necessary libraries and dependencies increases application performance.

In part 2, I look at designing your function to take advantage of concurrency via asynchronous and stream-based invocations. I cover measuring, evaluating, and selecting optimal capacity units.

In this post, I look at integrating with managed services directly over functions when possible. I cover optimizing access patterns and applying caching where applicable.

In the next post in the series, I cover the cost optimization pillar from the Well-Architected Serverless Lens.

For more serverless learning resources, visit Serverless Land.

More Military Cryptanalytics, Part III

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/more-military-cryptanalytics-part-iii.html

Late last year, the NSA declassified and released a redacted version of Lambros D. Callimahos’s Military Cryptanalytics, Part III. We just got most of the index. It’s hard to believe that there are any real secrets left in this 44-year-old volume.

Announcing the latest AWS Heroes – August 2021

Post Syndicated from Ross Barich original https://aws.amazon.com/blogs/aws/announcing-the-latest-aws-heroes-august-2021/

AWS Heroes go above and beyond to share knowledge with the community and help others build better and faster on AWS. Last month we launched the AWS Heroes Content Library, a centralized place where Builders can find inspiration and learn from AWS Hero-authored educational content including blogs, videos, slide presentations, podcasts, open source projects, and more. As technical communities evolve, new Heroes continue to emerge, and each quarter we recognize an outstanding group of individuals from around the world whose impact on community knowledge-sharing is significant and greatly appreciated.

Today we are pleased to introduce the newest AWS Heroes, including the first Heroes based in Cameroon and Malaysia:

Denis Astahov – Vancouver, Canada

Community Hero Denis Astahov is a Solutions Architect at OpsGuru, where he automates and develops various cloud solutions with Infrastructure as Code using Terraform. Denis owns the YouTube channel ADV-IT, where he teaches people about a variety of IT and especially DevOps topics, including AWS, Terraform, Kubernetes, Ansible, Jenkins, Git, Linux, Python, and many others. His channel has more than 70,000 subscribers and over 7,000,000 views, making it one of the most popular free sources for AWS and DevOps knowledge in the Russian speaking community. Denis has more than 10 cloud certifications, including 7 AWS Certifications.

Ivonne Roberts – Tampa, USA

Serverless Hero Ivonne Roberts is a Principal Software Engineer with over fifteen years of software development experience, including ten years working with AWS and more than five years building serverless applications. In recent years, Ivonne has begun sharing that industry knowledge with the greater software engineering community. On her blog ivonneroberts.com and her YouTube channel Serverless DevWidgets, Ivonne focuses on demystifying and removing the hurdles of adopting serverless architecture and on simplifying the software development lifecycle.

Kaushik Mohanraj – Kuala Lumpur, Malaysia

Community Hero Kaushik Mohanraj is a Director at Blazeclan Technologies, Malaysia. An avid cloud practitioner, Kaushik has experience in the evaluation of well-architected solutions and is an ambassador for cloud technologies and digital transformation. Kaushik holds 10 active AWS Certifications, which help him to provide relevant and optimal solutions. Kaushik is keen to build a community he thrives in and hence joined AWS User Group Malaysia as a co-organizer in 2019. He is also the co-director of Women in Big Data – Malaysia Chapter, with an aim to build and provide a platform for women in technology.

Luc van Donkersgoed – Utrecht, The Netherlands

DevTools Hero Luc van Donkersgoed is a geek at heart, solutions architect, software developer, and entrepreneur. He is fascinated by bleeding edge technology. When he is not designing and building powerful applications on AWS, you can probably find Luc sharing knowledge in blogs, articles, videos, conferences, training sessions, and Twitter. He has authored a 16-session AWS Solutions Architect Professional course, presented on various topics including how the AWS CDK will enable a new generation of serverless developers, appeared on the AWS Developers Podcast, and he maintains the AWS Blogs Twitter Bot.

Rick Hwang – Taipei City, Taiwan

Community Hero Rick Hwang is a cloud and infrastructure architect at 91APP in Taiwan. His passion to educate developers has been demonstrated both internally as an annual AWS training project leader, and externally as a community owner of SRE Taiwan. Rick started SRE Taiwan on his own and has recruited over 3,600 members over the past 4 years via peer-to-peer interactions, constantly sharing content, and hosting annual study group meetups. Rick enjoys helping people increase their understanding of AWS and the cloud in general.

Rosius Ndimofor – Douala, Cameroon

Serverless Hero Rosius Ndimofor is a software developer at Serverless Guru. He has been building desktop, web, and mobile apps for various customers for 8 years. In 2020, Rosius was introduced to AWS by his friend, was immediately hooked, and started learning as much as he could about building AWS serverless applications. You can find Rosius speaking at local monthly AWS meetup events, or his forte: building serverless web or mobile applications and documenting the entire process on his blog.

Setia Budi – Bandung, Indonesia

Community Hero Setia Budi is an academic from Indonesia. He runs a YouTube channel named Indonesia Belajar, which provides learning materials related to computer science and cloud computing (delivered in Indonesian language). His passion for the AWS community is also expressed by delivering talks in AWS DevAx Connect, and he is actively building a range of learning materials related to AWS services, and streaming weekly live sessions featuring experts from AWS to talk about cloud computing.

Vinicius Caridá – São Paulo, Brazil

Machine Learning Hero Vinicius Caridá (Vini) is a Computer Engineer who believes tech, data, & AI can impact people for a fairer and more evolved world. He loves to share his knowledge on AI, NLP, and MLOps on social media, on his YouTube channel, and at various meetups such as AWS User Group São Paulo where he is a community leader. Vini is also a community leader at TensorFlow São Paulo, an open source machine learning framework. He regularly participates in conferences and writes articles for different audiences (academic, scientific, technical), and different maturity levels (beginner, intermediate, and advanced).


If you’d like to learn more about the new Heroes, or connect with a Hero near you, please visit the AWS Heroes website or browse the AWS Heroes Content Library.

Ross;

Interesting Privilege Escalation Vulnerability

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/interesting-privilege-escalation-vulnerability.html

If you plug a Razer peripheral (mouse or keyboard, I think) into a Windows 10 or 11 machine, you can use a vulnerability in the Razer Synapse software — which automatically downloads — to gain SYSTEM privileges.

It should be noted that this is a local privilege escalation (LPE) vulnerability, which means that you need to have a Razer device and physical access to a computer. With that said, the bug is so easy to exploit, as you just need to spend $20 on Amazon for a Razer mouse and plug it into Windows 10 to become an admin.

15 years of silicon innovation with Amazon EC2

Post Syndicated from Neelay Thaker original https://aws.amazon.com/blogs/compute/15-years-of-silicon-innovation-with-amazon-ec2/

The Graviton Hackathon is now available to developers globally to build and migrate apps to run on AWS Graviton2

This week we are celebrating 15 years of Amazon EC2 live on Twitch August 23rd – 24th with keynotes and sessions from AWS leaders and experts. You can watch all the sessions on-demand later this week to learn more about innovation at Amazon EC2.

As Jeff Barr noted in his blog, EC2 started as a single instance back in 2006 with the idea of providing developers on-demand access to compute infrastructure and having them pay only for what they use. Today, EC2 has over 400 instance types with the broadest and deepest portfolio of instances in the cloud. As we strive to be the best cloud for every workload, customers consistently ask us for higher performance and lower costs for their workloads. One way we deliver on this is by building silicon specifically optimized for the cloud.

Our journey with custom silicon started in 2012, when we began looking into ways to improve the performance of EC2 instances by offloading virtualization functionality that is traditionally run on underlying servers to a dedicated offload card. Today, the AWS Nitro System is the foundation upon which our modern EC2 instances are built, delivering better performance and enhanced security. The Nitro System has also enabled us to innovate faster and deliver new instances and unique capabilities including Mac Instances, AWS Outposts, AWS Wavelength, VMware Cloud on AWS, and AWS Nitro Enclaves.

We also collaborate closely with partners to build custom silicon optimized for the AWS cloud with their latest generation processors, to continue to deliver better performance and better price performance for our joint customers. Additionally, we've designed the AWS Inferentia and AWS Trainium chips to drive down the cost and boost performance for deep learning workloads.

One of the biggest innovations that help us deliver higher performance at lower cost for customer workloads are the AWS Graviton2 processors, which are the second generation of Arm-based processors custom-built by AWS. Instances powered by the latest generation AWS Graviton2 processors deliver up to 40% better performance at 20% lower per-instance cost over comparable x86-based instances in EC2. Additionally, Graviton2 is our most power efficient processor. In fact, Graviton2 delivers 2 to 3.5 times better performance per Watt of energy use versus any other processor in AWS.

Customers from startups to large enterprises, including Intuit, Snap, Lyft, SmugMug, and NextRoll, have realized significant price performance benefits for their production workloads on AWS Graviton2-based instances. Recently, Epic Games added support for Graviton2 in Unreal Engine to help its developers build high-performance games. What's even more interesting is that AWS Graviton2-based instances supported 12 core retail services during Amazon Prime Day this year.

Most customers get started with AWS Graviton2-based instances by identifying and moving one or two workloads that are easy to migrate, and after realizing the price performance benefits, they move more workloads to Graviton2. In her blog, Liz Fong-Jones, Principal Developer Advocate at Honeycomb.io, details her journey of adopting Graviton2 and realizing significant price performance improvements. Using experience from working with thousands of customers like Liz who have adopted Graviton2, we built a program called the Graviton Challenge that provides a step-by-step plan to help you move your first workload to Graviton2-based instances.

Today, to further incentivize developers to get started with Graviton2, we are launching the Graviton Hackathon, where you can build a new app or migrate an app to run on Graviton2-based instances. Whether you are an existing EC2 user looking to optimize price performance for your workload, an Arm developer looking to leverage Arm-based instances in the cloud, or an open source developer adding support for Arm, you can participate in the Graviton Hackathon for a chance to win prizes for your project, including up to $10,000 in prize money. We look forward to the new applications that will take advantage of the price performance benefits of Graviton. To learn more about Graviton2, watch the EC2 15th Birthday event sessions on-demand later this week, register to attend the Graviton workshop at the upcoming AWS Summit Online, or register for the Graviton Challenge.

Cloud computing has made cutting edge, cost effective infrastructure available to everyday developers. Startups can use a credit card to spin up instances in minutes and scale up and scale down easily based on demand. Enterprises can leverage compute infrastructure and services to drive improved operational efficiency and customer experience. The last 15 years of EC2 innovation have been at the forefront of this shift, and we are looking forward to the next 15 years.

Surveillance of the Internet Backbone

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/surveillance-of-the-internet-backbone.html

Vice has an article about how data brokers sell access to the Internet backbone. This is netflow data. It’s useful for cybersecurity forensics, but can also be used for things like tracing VPN activity.

At a high level, netflow data creates a picture of traffic flow and volume across a network. It can show which server communicated with another, information that may ordinarily only be available to the server owner or the ISP carrying the traffic. Crucially, this data can be used for, among other things, tracking traffic through virtual private networks, which are used to mask where someone is connecting to a server from, and by extension, their approximate physical location.

In the hands of some governments, that could be dangerous.

How MEDHOST’s cardiac risk prediction successfully leveraged AWS analytic services

Post Syndicated from Pandian Velayutham original https://aws.amazon.com/blogs/big-data/how-medhosts-cardiac-risk-prediction-successfully-leveraged-aws-analytic-services/

MEDHOST has been providing products and services to healthcare facilities of all types and sizes for over 35 years. Today, more than 1,000 healthcare facilities are partnering with MEDHOST and enhancing their patient care and operational excellence with its integrated clinical and financial EHR solutions. MEDHOST also offers a comprehensive Emergency Department Information System with business and reporting tools. Since 2013, MEDHOST's cloud solutions have been utilizing Amazon Web Services (AWS) infrastructure, data sources, and computing power to solve complex healthcare business cases.

MEDHOST can utilize the data available in the cloud to provide value-added solutions for hospitals solving complex problems, like predicting sepsis, cardiac risk, and length of stay (LOS), as well as reducing readmission rates. This requires a solid data lake foundation and an elastic data pipeline that can keep up with multi-terabyte data from thousands of hospitals. MEDHOST invested a significant amount of time evaluating numerous vendors to determine the best solution for its data needs. Ultimately, MEDHOST designed and implemented machine learning/artificial intelligence capabilities by leveraging AWS Data Lab and an end-to-end data lake platform that enables a variety of use cases such as data warehousing for analytics and reporting.

Getting started

MEDHOST’s initial objectives in evaluating vendors were to:

  • Build a low-cost data lake solution to provide cardiac risk prediction for patients based on health records
  • Provide an analytical solution for hospital staff to improve operational efficiency
  • Implement a proof of concept to extend to other machine learning/artificial intelligence solutions

The AWS team proposed AWS Data Lab to architect, develop, and test a solution to meet these objectives. The collaborative relationship between AWS and MEDHOST, AWS's continuous innovation, excellent support, and technical solution architects helped MEDHOST select AWS over other vendors and products. AWS Data Lab's well-structured engagement helped MEDHOST define clear, measurable success criteria that drove the implementation of the cardiac risk prediction and analytical solution platform. The MEDHOST team consisted of architects, builders, and subject matter experts (SMEs). By connecting MEDHOST experts directly to AWS technical experts, the MEDHOST team gained a quick understanding of industry best practices and available services, allowing them to achieve most of the success criteria at the end of a four-day design session. MEDHOST is now in the process of moving this work from its lower to upper environments to make the solution available to its customers.

Solution

For this solution, MEDHOST and AWS built a layered pipeline consisting of ingestion, processing, storage, analytics, machine learning, and reinforcement components. The following diagram illustrates the Proof of Concept (POC) that was implemented during the four-day AWS Data Lab engagement.

Ingestion layer

The ingestion layer is responsible for moving data from hospital production databases to the landing zone of the pipeline.

The hospital data was stored in an Amazon RDS for PostgreSQL instance and moved to the landing zone of the data lake using AWS Database Migration Service (DMS). DMS made migrating databases to the cloud simple and secure. Using its ongoing replication feature, MEDHOST and AWS implemented change data capture (CDC) quickly and efficiently, so the MEDHOST team could spend more time focusing on the most interesting parts of the pipeline.
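
For illustration only (the task name, endpoint and instance ARNs, and table mapping are placeholders, not MEDHOST's actual configuration), a DMS task doing a full load followed by ongoing CDC could be created like this:

```python
# Hypothetical sketch: a DMS replication task with full load plus CDC from the
# PostgreSQL source endpoint to the S3 landing zone endpoint.
import json
import boto3

dms = boto3.client("dms")

table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-all",
        "object-locator": {"schema-name": "%", "table-name": "%"},
        "rule-action": "include",
    }]
}

dms.create_replication_task(
    ReplicationTaskIdentifier="hospital-to-landing-zone",
    SourceEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:SRC",   # placeholder
    TargetEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:TGT",   # placeholder
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:INST",  # placeholder
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps(table_mappings),
)
```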

Processing layer

The processing layer was responsible for performing extract, transform, load (ETL) on the data to curate it for subsequent uses.

MEDHOST used AWS Glue within its data pipeline for crawling its data layers and performing ETL tasks. The hospital data copied from RDS to Amazon S3 was cleaned, curated, enriched, denormalized, and stored in Parquet format to act as the heart of the MEDHOST data lake and a single source of truth to serve any further data needs. During the four-day Data Lab, MEDHOST and AWS targeted two needs: powering MEDHOST's data warehouse used for analytics and feeding training data to the machine learning prediction model. Among the multiple challenges, data curation was a critical task that required an SME. AWS Glue's serverless nature, along with the SME's support during the Data Lab, made developing the required transformations cost efficient and uncomplicated. Scaling and cluster management were handled by the service, which allowed the developers to focus on cleaning data coming from homogeneous hospital sources and translating the business logic to code.
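
As an illustrative sketch of this kind of Glue job (the database, table, column, and bucket names are placeholders, not MEDHOST's actual schema), a PySpark script might read the crawled landing-zone table, apply a mapping, and write Parquet to the curated zone:

```python
# Hypothetical sketch of a Glue PySpark ETL job: landing-zone CSV in,
# curated-zone Parquet out.
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

landing = glue_context.create_dynamic_frame.from_catalog(
    database="hospital_landing", table_name="patient_events"   # placeholders
)

curated = ApplyMapping.apply(
    frame=landing,
    mappings=[
        ("patient_id", "string", "patient_id", "string"),
        ("event_ts", "string", "event_ts", "timestamp"),
        ("heart_rate", "string", "heart_rate", "int"),
    ],
)

glue_context.write_dynamic_frame.from_options(
    frame=curated,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-zone/patient_events/"},  # placeholder
    format="parquet",
)
```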

Storage layer

The storage layer provided low-cost, secure, and efficient storage infrastructure.

MEDHOST used Amazon S3 as a core component of its data lake. AWS DMS migration tasks saved data to S3 in CSV format. Crawling the data with AWS Glue made this landing zone data queryable and available for further processing. The initial AWS Glue ETL job stored the Parquet-formatted data in the data lake's curated zone bucket. MEDHOST also used S3 to store the CSV-formatted data set used to train, test, and validate its machine learning prediction model.

Analytics layer

The analytics layer gave MEDHOST pipeline reporting and dashboarding capabilities.

The data was in Parquet format and partitioned in the curation zone bucket populated by the processing layer. This made querying with Amazon Athena or Amazon Redshift Spectrum fast and cost efficient.

From the Amazon Redshift cluster, MEDHOST created external tables that were used as staging tables for the MEDHOST data warehouse and implemented UPSERT logic to merge new data into its production tables. To showcase the reporting potential unlocked by the MEDHOST analytics layer, a connection was made from the Redshift cluster to Amazon QuickSight. Within minutes, MEDHOST was able to create interactive analytics dashboards with filtering and drill-down capabilities, such as a chart that showed the number of confirmed disease cases per US state.
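
To illustrate the staging-table merge pattern (cluster, schema, and table names are placeholders, and this is not MEDHOST's actual SQL), the standard delete-and-insert UPSERT could be run through the Redshift Data API:

```python
# Hypothetical sketch: merge rows from a Spectrum staging (external) table into
# the production table using the common delete-and-insert pattern.
import boto3

rsd = boto3.client("redshift-data")

rsd.batch_execute_statement(
    ClusterIdentifier="example-analytics-cluster",   # placeholder
    Database="analytics",                            # placeholder
    DbUser="etl_user",                               # placeholder
    Sqls=[
        # Remove rows that are about to be replaced
        "DELETE FROM prod.patient_metrics USING spectrum_stage.patient_metrics s "
        "WHERE patient_metrics.patient_id = s.patient_id",
        # Insert the fresh rows from the staging table
        "INSERT INTO prod.patient_metrics SELECT * FROM spectrum_stage.patient_metrics",
    ],
)
```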

Machine learning layer

The machine learning layer used MEDHOST’s existing data sets to train its cardiac risk prediction model and make it accessible via an endpoint.

Before getting into the Data Lab, the MEDHOST team was not intimately familiar with machine learning. AWS Data Lab architects helped MEDHOST quickly understand the concepts of machine learning and select a model appropriate for its use case. MEDHOST selected XGBoost as its model, since cardiac risk prediction is a regression problem. MEDHOST's well-architected data lake enabled it to quickly generate training, testing, and validation data sets using AWS Glue.
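
For illustration, here is a sketch of training and deploying the built-in XGBoost algorithm with the SageMaker Python SDK. The role ARN, bucket paths, hyperparameters, and instance types are placeholders, not MEDHOST's actual configuration.

```python
# Hypothetical sketch: train SageMaker's built-in XGBoost as a regressor on
# CSV datasets and deploy it behind a real-time endpoint.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

image_uri = sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.2-1")

xgb = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-curated-zone/models/",   # placeholder
    sagemaker_session=session,
)
xgb.set_hyperparameters(objective="reg:squarederror", num_round=100, max_depth=5)

xgb.fit({
    "train": TrainingInput("s3://example-curated-zone/train/", content_type="text/csv"),
    "validation": TrainingInput("s3://example-curated-zone/validation/", content_type="text/csv"),
})

predictor = xgb.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```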

Amazon SageMaker abstracted the underlying complexity of setting up infrastructure for machine learning. With a few clicks, MEDHOST started a Jupyter notebook and coded the components leading to fitting and deploying its machine learning prediction model. Finally, MEDHOST created the endpoint for the model and ran REST calls to validate the endpoint and trained model. As a result, MEDHOST achieved the goal of predicting cardiac risk. Additionally, with Amazon QuickSight's SageMaker integration, AWS made it easy to use SageMaker models directly in visualizations. QuickSight can call the model's endpoint, send the input data to it, and put the inference results into the existing QuickSight data sets. This capability makes it easy to display the results of the models directly in dashboards. Read more about QuickSight's SageMaker integration here.
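
The kind of call used to validate a deployed endpoint might look like the following sketch; the endpoint name and feature vector are placeholders rather than MEDHOST's actual values.

```python
# Hypothetical sketch: invoking the deployed cardiac risk endpoint with the
# SageMaker runtime API, the same call an inference microservice would make.
import boto3

runtime = boto3.client("sagemaker-runtime")

# One CSV row of model features, in the same order used for training
payload = "63,1,145,233,150,2.3"

response = runtime.invoke_endpoint(
    EndpointName="cardiac-risk-xgboost",   # placeholder
    ContentType="text/csv",
    Body=payload,
)
risk_score = float(response["Body"].read().decode("utf-8"))
print(f"Predicted cardiac risk score: {risk_score:.3f}")
```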

Reinforcement layer

Finally, the reinforcement layer guaranteed that the results of the MEDHOST model were captured and processed to improve performance of the model.

The MEDHOST team went beyond the original goal and created an inference microservice to interact with the endpoint for prediction, abstracted the machine learning endpoint behind a well-defined domain REST endpoint, and added a standard security layer to the MEDHOST application.

When there is a real-time call from the facility, the inference microservice gets an inference from the SageMaker endpoint. Records containing the input and inference data are fed back into the data pipeline. MEDHOST used Amazon Kinesis Data Streams to push records in real time. However, since retraining the machine learning model does not need to happen in real time, Amazon Kinesis Data Firehose enabled MEDHOST to micro-batch records and efficiently save them to the landing zone bucket so that the data could be reprocessed.
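
As a sketch of this feedback path (stream names, the record shape, and values are placeholders, not MEDHOST's actual implementation):

```python
# Hypothetical sketch: after each prediction, push the input features plus the
# returned score back into the pipeline for later retraining.
import json
import boto3

kinesis = boto3.client("kinesis")
firehose = boto3.client("firehose")

record = {"patient_id": "p-1234", "features": [63, 1, 145, 233], "risk_score": 0.82}

# Real-time path: put the record on a Kinesis data stream
kinesis.put_record(
    StreamName="inference-feedback",                      # placeholder
    Data=json.dumps(record).encode("utf-8"),
    PartitionKey=record["patient_id"],
)

# Batch path: Firehose micro-batches records and lands them in the landing
# zone bucket so they can be reprocessed for retraining
firehose.put_record(
    DeliveryStreamName="inference-feedback-to-s3",        # placeholder
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)
```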

Conclusion

Collaborating with AWS Data Lab enabled MEDHOST to:

  • Store a single source of truth in a low-cost storage solution (data lake)
  • Build a complete data pipeline for a low-cost data analytics solution
  • Create almost production-ready code for cardiac risk prediction

The MEDHOST team learned many concepts related to data analytics and machine learning within four days. AWS Data Lab truly helped MEDHOST deliver results in an accelerated manner.


About the Authors

Pandian Velayutham is the Director of Engineering at MEDHOST. His team is responsible for delivering cloud solutions, integration and interoperability, and business analytics solutions. MEDHOST utilizes a modern technology stack to provide innovative solutions to its customers. Pandian Velayutham is a technology evangelist and public cloud technology speaker.

George Komninos is a Data Lab Solutions Architect at AWS. He helps customers convert their ideas into production-ready data products. Before AWS, he spent three years in the Alexa Information domain as a data engineer. Outside of work, George is a football fan and supports the greatest team in the world, Olympiacos Piraeus.

More on Apple’s iPhone Backdoor

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/more-on-apples-iphone-backdoor.html

In this post, I’ll collect links on Apple’s iPhone backdoor for scanning CSAM images. Previous links are here and here.

Apple says that hash collisions in its CSAM detection system were expected, and not a concern. I’m not convinced that this secondary system was originally part of the design, since it wasn’t discussed in the original specification.

Good op-ed from a group of Princeton researchers who developed a similar system:

Our system could be easily repurposed for surveillance and censorship. The design wasn’t restricted to a specific category of content; a service could simply swap in any content-matching database, and the person using that service would be none the wiser.

EDITED TO ADD (8/30): Good essays by Matthew Green and Alex Stamos, Ross Anderson, Edward Snowden, and Susan Landau. And also Kurt Opsahl.

T-Mobile Data Breach

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/t-mobile-data-breach.html

It’s a big one:

As first reported by Motherboard on Sunday, someone on the dark web claims to have obtained the data of 100 million from T-Mobile’s servers and is selling a portion of it on an underground forum for 6 bitcoin, about $280,000. The trove includes not only names, phone numbers, and physical addresses but also more sensitive data like social security numbers, driver’s license information, and IMEI numbers, unique identifiers tied to each mobile device. Motherboard confirmed that samples of the data “contained accurate information on T-Mobile customers.”

Apple’s NeuralHash Algorithm Has Been Reverse-Engineered

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/apples-neuralhash-algorithm-has-been-reverse-engineered.html

Apple’s NeuralHash algorithm — the one it’s using for client-side scanning on the iPhone — has been reverse-engineered.

Turns out it was already in iOS 14.3, and someone noticed:

Early tests show that it can tolerate image resizing and compression, but not cropping or rotations.

We also have the first collision: two images that hash to the same value.

The next step is to generate innocuous images that NeuralHash classifies as prohibited content.

This was a bad idea from the start, and Apple never seemed to consider the adversarial context of the system as a whole, rather than just the cryptography.