Introducing logs from the dashboard for Cloudflare Workers

If you’re writing code: what can go wrong, will go wrong.

Many developers know the feeling: “It worked in the local testing suite, it worked in our staging environment, but… it’s broken in production?” Testing can reduce mistakes and debugging can help find them, but logs give us the tools to understand and improve what we are creating.

if (this === undefined) {
  console.log("there’s no way… right?") // Narrator: there was.
}

While logging can help you understand when the seemingly impossible is actually possible, it’s something that no developer really wants to set up or maintain on their own. That’s why we’re excited to launch a new addition to the Cloudflare Workers platform: logs and exceptions from the dashboard.

Starting today, you can view and filter the console.log output and exceptions from a Worker… at no additional cost with no configuration needed!

View logs, just a click away

When you view a Worker in the dashboard, you’ll now see a “Logs” tab which you can click on to view a detailed stream of logs and exceptions. Here’s what it looks like in action:

Each log entry contains an event with a list of logs, exceptions, and request headers if it was triggered by an HTTP request. We also automatically redact sensitive URLs and headers such as Authorization, Cookie, or anything else that appears to have a sensitive name.

If you are in the Durable Objects open beta, you will also be able to view the logs and requests sent to each Durable Object. This is a great tool to help you understand and debug the interactions between your Worker and a Durable Object.

For now, we support filtering by event status and type. Though, you can expect more filters to be added to the dashboard very soon! Today, we support advanced filtering with the wrangler CLI, which will be discussed later in this blog.

console.log(), and you’re all set

It’s really simple to get started with logging for Workers. Simply invoke one of the standard console APIs, such as console.log(), and we handle the rest. That’s it! There’s no extra setup, no configuration needed, and no hidden logging fees.

function logRequest (request) {
  const { cf, headers } = request
  const { city, region, country, colo, clientTcpRtt  } = cf
  
  console.log("Detected location:", [city, region, country].filter(Boolean).join(", "))
  if (clientTcpRtt) {
     console.debug("Round-trip time from client to", colo, "is", clientTcpRtt, "ms")
  }

  // You can also pass an object, which will be interpreted as JSON.
  // This is great if you want to define your own structured log schema.
  console.log({ headers })
}

In fact, you don’t even need to use console.log to view an event from the dashboard. If your Worker doesn’t generate any logs or exceptions, you will still be able to see the request headers from the event.

Advanced filters, from your terminal

If you need more advanced filters you can use wrangler, our command-line tool for deploying Workers. We’ve updated the wrangler tail command to support sampling and a new set of advanced filters. You also no longer need to install or configure cloudflared to use the command. Not to mention it’s much faster, no more waiting around for logs to appear. Here are a few examples:

# Filter by your own IP address, and if there was an uncaught exception.
wrangler tail --format=pretty --ip-address=self --status=error

# Filter by HTTP method, then apply a 10% sampling rate.
wrangler tail --format=pretty --method=GET --sampling-rate=0.1

# Filter using a generic search query.
wrangler tail --format=pretty --search="TypeError"

We recommend using the “pretty” format, since wrangler will output your logs in a colored, human-readable format. (We’re also working on a similar display for the dashboard.)

However, if you want to access structured logs, you can use the “json” format. This is great if you want to pipe your logs to another tool, such as jq, or save them to a file. Here are a few more examples:

# Parses each log event, but only outputs the url.
wrangler tail --format=json | jq .event.request?.url

# You can also specify --once to disconnect the tail after receiving the first log.
# This is useful if you want to run tests in a CI/CD environment.
wrangler tail --format=json --once > event.json

Try it out!

Both logs from the dashboard and wrangler tail are available and free for existing Workers customers. If you would like more information or a step-by-step guide, check out any of the resources below.

Go to the dashboard and look at some logs!
Read the getting started guide for logging.
Look at the tail logs API reference.

Monitor, Evaluate, and Demonstrate Backup Compliance with AWS Backup Audit Manager

2021-08-24 Steve Roberts

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/monitor-evaluate-and-demonstrate-backup-compliance-with-aws-backup-audit-manager/

Today, I’m happy to announce the availability of AWS Backup Audit Manager, a new feature of AWS Backup that helps you monitor and evaluate the compliance status of your backups to meet business and regulatory requirements, and enables you to generate reports that help demonstrate compliance to auditors and regulators.

AWS Backup is a fully managed service that provides the ability to initiate policy-driven backups and restores of AWS applications, simplifying the process of protecting data at scale by removing the need for custom scripts and manual processes. However, customers still needed to use their own tooling for verifying that backup policies were being enforced and, as part of proving adherence to auditors, parsing backup transcripts to convert them into auditable reports.

With AWS Backup Audit Manager, you can now continuously and automatically track your backup activity, such as changes to a backup plan or backup vault, and generate automatic daily reports. AWS Backup Audit Manager provides built-in, customizable, compliance controls. Simply put, controls are procedures with backup policy parameters, for example the backup frequency or the retention period, that align with your business compliance and regulatory requirements.

You create a framework, scoped to an account and Region, and add the controls you need to it. Backup activities are tracked against the controls, automatically detecting violations of your defined data protection policies, enabling you to take quick corrective actions. To enable tracking of backup activities, AWS Backup Audit Manager requires you to enable monitoring through AWS Config for your backup plans (AWS::Backup::BackupPlan resource type), backup selection (AWS::Backup::BackupSelection), vaults (AWS::Backup::BackupVault), recovery points (AWS::Backup::RecoveryPoint), and AWS Config resource compliance (AWS::Config::ResourceCompliance). You can check the recording status of these resources in the AWS Backup console, using the Resource Tracking section of the Frameworks page.

Once you’ve added the controls you need to your framework, you deploy it. If you have different internal or regulatory standards to meet, you can create and deploy additional frameworks of controls. Once the framework is deployed, you can set up automatic daily reports of your backup activity. These are displayed in a dashboard, and you can also request on-demand reports at any time. You can also import your findings into AWS Audit Manager, a service I wrote about during AWS re:Invent 2020 in this news blog post.

This short video gives a brief overview of the new AWS Backup Audit Manager feature.

Available Controls and Backup Reports
AWS Backup Audit Manager provides five backup governance control templates and backup activity reporting on your backup jobs, copy jobs, and restore jobs. These reports improve visibility into backup activities for a single account and Region, helping you monitor your operational posture and identify failures that may need further action.

When creating a framework, you provide a name, an optional description, and you select whether to use the provided AWS Backup framework type, which includes five pre-defined controls, or you can opt to customize your framework.

Choosing Custom framework expands the panel to show the available controls, their parameters, and the option to include or exclude them from your framework. The five available controls are titled Backup resources protected by backup plan, Backup plan minimum frequency and minimum retention, Backup prevent recovery point manual deletion, Backup recovery point encrypted, and Backup recovery point minimum retention. To the right of each control’s title you’ll find an info link that describes what the control evaluates, how frequently, and what it means for a resource to be compliant with the control.

Let’s examine a couple of controls. The Backup resources protected by backup plan control enables you to select all supported resources, or those identified by a tag, by type, or a particular resource. This control helps identify gaps in your backup coverage.

The Backup plan minimum frequency and minimum retention control has parameters governing how frequently the backup plan should be taking backups, and for how long recovery points should be maintained. The default settings require backups to occur every hour, and recovery points should be retained for a month, but you can customize the settings to meet your business compliance requirements.

You complete your selections for the remaining controls, including them and setting appropriate parameter values for your needs, or excluding them from the framework, and then click Create framework to complete the process. The new framework will be created and deployed, which will take a few minutes. If needed, you can go back and edit the controls and parameters in a framework at any time.

Once deployed, the controls in the framework will start to evaluate compliance and you can inspect compliance status in the console by selecting the framework. The summary section reports the overall compliance status of the framework and the number of controls in the framework that are compliant or non-compliant, based on your deployed control definitions.

Below the summary, you’ll find a list containing compliance details for each of the controls in the framework, which can be filtered by status. Each control details whether it’s compliant or non-compliant, and how many resources monitored by the control are non-compliant. Clicking a control title will take you directly to the AWS Config dashboard, where you can view more details on the resources identified by the control.

Automated reports on backup activity can be used to demonstrate compliance to auditors and regulators. To set up reports, first click the Reports entry in the navigation toolbar, and then click Create report plan. You’ll be asked to select a report template.

With the template selected (I chose Backup jobs report), you fill in a name and optional description, choose where in your Amazon Simple Storage Service (Amazon S3) buckets you want the report to be delivered, and the report file formats, and then click Create report plan. Your report will update every 24 hours, and you can run an on-demand report at any time.

Once a report has been run, either automatically or on-demand, you can view the report data by first selecting the report in your Report plans list, followed by clicking View report. You’ll be taken directly to the chosen S3 location of the report files, where you’ll see one object (report) per chosen file type.

Downloading the file shows you the time period in which the resources were evaluated, the backup job details, failure or completion status, status messages, the resource type and backup plan, and more. Here I’ve opened the CSV format file in a spreadsheet.

Open Raven Launch Partnership
With this launch, we’re excited to have Open Raven join us as an AWS Backup partner. Open Raven is a cloud-native data security platform purpose-built for protecting modern data lakes and warehouses. From finding all data locations to proactively identifying exposure, their platform solves a broad spectrum of problems that organizations commonly face when living with large amounts of cloud-based data.

Open Raven Chief Technology Officer Mark Curphey had this to say about the new AWS Backup feature: “To successfully recover from a ransomware attack, organizations need to plan ahead by completing two foundational tasks, identifying critical data and systems and backing them up as per organizational requirements so that they can be protected and recovered. The combination of AWS Backup Audit Manager and Open Raven streamlines this effort, eliminating guesswork and hours of manual toil.”

Start Using AWS Backup Audit Manager Today
AWS Backup Audit Manager is available today in the US East (N. Virginia, Ohio), US West (N. California, Oregon), Canada (Central), EU (Frankfurt, Ireland, London, Paris, Stockholm), South America (Sao Paulo), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo), and Middle East (Bahrain) Regions.

For more information about Backup Audit Manager, refer to this section in the AWS Backup Developer Guide. To get started, visit the AWS Backup console.

— Steve

Cybercriminals Selling Access to Compromised Networks: 3 Surprising Research Findings

2021-08-24 Paul Prudhomme

Post Syndicated from Paul Prudhomme original https://blog.rapid7.com/2021/08/24/cybercriminals-selling-access-to-compromised-networks-3-surprising-research-findings/

Cybercriminals Selling Access to Compromised Networks: 3 Surprising Research Findings

Cybercriminals are innovative, always finding ways to adapt to new circumstances and opportunities. The proof of this can be seen in the rise of a certain variety of activity on the dark web: the sale of access to compromised networks.

This type of dark web activity has existed for decades, but it matured and began to truly thrive amid the COVID-19 global pandemic. The worldwide shift to a remote workforce gave cybercriminals more attack surface to exploit, which fueled sales on underground criminal websites, where buyers and sellers transfer network access to compromised enterprises and organizations to turn a profit.

Having witnessed this sharp rise in breach sales in the cybercriminal ecosystem, IntSights, a Rapid7 company, decided to analyze why and how criminals sell their network access, with an eye toward understanding how to prevent these network compromise events from happening in the first place.

We have compiled our network compromise research, as well as our prevention and mitigation best practices, in the brand-new white paper “Selling Breaches: The Transfer of Enterprise Network Access on Criminal Forums.”

During the process of researching and analyzing, we came across three surprising findings we thought worth highlighting. For a deeper dive, we recommend reading the full white paper, but let’s take a quick look at these discoveries here.

1. The massive gap between average and median breach sales prices

As part of our research, we took a close look at the pricing characteristics of breach sales in the criminal-to-criminal marketplace. Unsurprisingly, pricing varied considerably from one sale to another. A number of factors can influence pricing, including everything from the level of access provided to the value of the victim as a source of criminal revenue.

That said, we found an unexpectedly significant discrepancy between the average price and the median price across the 40 sales we analyzed. The average price came out to approximately $9,640 USD, while the median price was $3,000 USD.

In part, this gap can be attributed to a few unusually high prices among the most expensive offerings. The lowest price in our dataset was $240 USD for access to a healthcare organization in Colombia, but healthcare pricing tends to trend lower than other industries, with a median price of $700 in this sample. On the other end of the spectrum, the highest price was for a telecommunications service provider that came in at about $95,000 USD worth of Bitcoin.

Because of this discrepancy, IntSights researchers view the average price of $9,640 USD as a better indicator of the higher end of the price range, while the median price is more representative of typical pricing for these sales — $3,000 USD was also the single most common price. Nonetheless, it was fascinating to discover this difference and dig into the reasons behind it.

2. The numerical dominance of tech and telecoms victims

While the sales of network access are a cross-industry phenomenon, technology and telecommunications companies are the most common victims. Not only are they frequent targets, but their compromised access also commands some of the highest prices on the market.

In our sample, tech and telecoms represented 10 of the 46 victims, or 22% of those affected by industry. Out of the 10 most expensive offerings we analyzed, four were for tech and telecommunications organizations, and there were only two that had prices under $10,000 USD. A telecommunications service provider located in an unspecified Asian country also had the single most expensive offering in this sample at approximately $95,000 USD.

After investigating the reasoning behind this numerical dominance, IntSights researchers believe that the high value and high number of tech and telecommunications companies as breach victims stem from their usefulness in enabling further attacks on other targets. For example, a cybercriminal who gains access to a mobile service provider could conduct SIM swapping attacks on digital banking customers who use two-factor authentication via SMS.

These pricing standards were surprisingly expensive compared to other industries, but for good reason: the investment may cost more upfront but prove more lucrative in the long run.

3. The low proportion of retail and hospitality victims

As previously mentioned, we broke down the sales of network access based on the industries affected, and to our surprise, only 6.5% of victims were in retail and hospitality. This seemed odd, considering the popularity of the industry as a target for cybercrime. Think of all the headlines in the news about large retail companies falling victim to a breach that exposed millions of customer credentials.

We explored the reasoning behind this low proportion of victims in the space and came to a few conclusions. For example, we theorized that the main customers for these network access sales are ransomware operators, not payment card data collectors. Payment card data collection is likely a more optimal way to monetize access to a retail or hospitality business, whereas putting ransomware on a retail and hospitality network would actually “kill the goose that lays the golden eggs.”

We also found that the second-most expensive offering in this sample was for access to an organization supporting retail and hospitality businesses. The victim was a third party managing customer loyalty and rewards programs, and the seller highlighted how a buyer could monetize this indirect access to its retail and hospitality customer base. This victim may have been more valuable because, among other things, loyalty and rewards programs are softer targets with weaker security than credit cards and bank accounts; thus, they’re easier to defraud.

Learn more about compromised network access sales

Curious to learn more about the how and why of cybercriminals selling compromised network access? Read our white paper, Selling Breaches: The Transfer of Enterprise Network Access on Criminal Forums, for the full story behind this research and how it can inform your security efforts.

Behind the scenes at Atari

2021-08-24 Ashley Whittaker

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/behind-the-scenes-at-atari/

We love Wireframe magazine’s regular feature ‘The principles of game design’. They’re written by video game pioneer Howard Scott Warshaw, who authored several of Atari’s most famous and infamous titles. In the latest issue of Wireframe, he provides a snapshot of the hell-raising that went on behind the scenes at Atari…

Behind the scenes at Atari Wireframe magazine — A moment of relative calm in Atari’s offices, circa the early 1980s. There’s Howard nearest the camera on the right

Video game creation is unusual in that developers need to be focused intently on achieving design goals while simultaneously battling tunnel vision and re-evaluating those goals. It’s a demanding and frustrating predicament. Therefore, a solid video game creator needs two things: a way to let ideas simmer (since rumination is how games grow from mediocre to fabulous) and a way to blow off steam (since frustration abounds while trying to achieve fabulous). At Atari, there was one place where things both simmered and got steamy… the hot tub. The only thing we couldn’t do was keep a lid on the antics cooked up inside.

The hot tub was situated in the two-storey engineering building. This was ironic, because the hot tub generated way more than two stories in that building. The VCS/2600 and Home Computer development groups were upstairs. The first floor held coin-op development, a kitchen/cafeteria, and an extremely well-appointed gym. The gym featured two appendages: a locker area and the hot tub room. Many shenanigans were hatched and/or executed in the hot tub. One from the more epic end of the spectrum comes to mind: the executive birthday surprise.

It was during the birthday celebration of a VP who shall remain nameless, but it might have been the one who used to keep a canister of nitrous oxide and another of pure oxygen in his office. The nitrous oxide was for getting high and laughing some time away, while the oxygen was used for rapid sobering up in the event a spontaneous meeting was called (which happened regularly at Atari). As the party raged on, a small crew of revellers migrated to the small but accommodating hot tub room. Various intoxicants (well beyond the scope of nitrous) were being consumed in celebration of the special event (although by this standard, nearly every day was a special event at Atari).

As the party rolled on, inhibitions were shed along with numerous articles of clothing. At one point, the birthday boy was adjudged to be in dire need of a proper tubbing as he hadn’t lost sufficient layers to keep pace with the party at large. The birthday boy disagreed, and the ensuing negotiation took the form of a lively chase around the area. The VP ran out of the hot tub room and headed for the workout area with a wet posse in hot pursuit, all in varying stages of undress.

It’s important to note here that although refreshments and revelry were widely available at Atari, one item in short supply was conference rooms. Consequently, meetings could pop up in odd locales. Any place an aggregation could be achieved was a potential meeting spot. The sensitivity of the subject matter would determine the level of privacy required on a case-by-case basis. Since people weren’t always working out, the gym had enough places to sit that it could serve as a decent host for gatherings. And as for sensitivity, the hot tub room was well sound-proofed, so intruding ears weren’t a concern.

As the crew of rowdy revellers followed the VP into the workout area, they were confronted by just such a collection of executives who happened to be meeting at the time. I don’t think the birthday party was on the agenda. However, they may have been pleased that the absentee VP had ultimately decided to join their number. It was embarrassing for some, entertaining for others, and nearly career-ending for a couple. The moral of this story being that Atari executives should never go anywhere without their oxygen tanks in tow.

But morals aside, there was work to be done at Atari. In a place where work can lead to antics and antics can lead to work breakthroughs, it’s difficult at times to suss out the precise boundary between work and antics. It takes passion and commitment to pursue side quests productively and yet remain on task when necessary.

The main reason this was a challenge comes down to the fact there are so many distractions constantly going on. Creative people tend to be creative frequently and spontaneously. Also, their creativity is much more motivated by fascination and interest than it is by task lists or project plans. Fun can break out at any moment, and answering the call isn’t always the right choice, no matter how compelling the siren.

Rob Fulop, creator of Missile Command and Demon Attack for the Atari 2600 (among many other hits) isn’t only a great game maker, he’s also a keen observer of human nature. We used to chat about just where the edge is between work and play at Atari. Those who misjudge it can easily fall off the cliff.

Likewise, we explored the concept of what makes a good game designer. Rob said it’s just the right combination of silly and anal. He believed that the people who did well at Atari (and as game makers in general) were the people who could be silly enough to recognise fun, and anal enough to get all the minutia and details aligned correctly in order to deliver the fun. Of course, Rob (being the poet he is) created a wonderful phrasing to describe those with the right stuff. He put it like this: the people who did well at Atari were the people who could goof around as much as possible but still go to heaven.

Get your copy of Wireframe issue 53

You can read more features like this one in Wireframe issue 53, available directly from Raspberry Pi Press — we deliver worldwide.

And if you’d like a handy digital version of the magazine, you can also download issue 53 for free in PDF format.

The post Behind the scenes at Atari appeared first on Raspberry Pi.

Comic for 2021.08.24

2021-08-24 Explosm.net

Post Syndicated from Explosm.net original http://explosm.net/comics/5958/

New Cyanide and Happiness Comic

Convert and Watermark Documents Automatically with Amazon S3 Object Lambda

2021-08-24 Joseph Simon

Post Syndicated from Joseph Simon original https://aws.amazon.com/blogs/architecture/convert-and-watermark-documents-automatically-with-amazon-s3-object-lambda/

When you provide access to a sensitive document to someone outside of your organization, you likely need to ensure that the document is read-only. In this case, your document should be associated with a specific user in case it is shared.

For example, authors often embed user-specific watermarks into their ebooks. This way, if their ebook gets posted to a file-sharing site, they can prevent the purchaser from downloading copies of the ebook in the future.

In this blog post, we provide you a cost-efficient, scalable, and secure solution to efficiently generate user-specific versions of sensitive documents. This solution helps users track who their documents are shared with. This helps prevent fraud and ensure that private information isn’t leaked. Our solution uses a RESTful API, which uses Amazon S3 Object Lambda to convert documents to PDF and apply a watermark based on the requesting user. It also provides a method for authentication and tracks access to the original document.

Architectural overview

S3 Object Lambda processes and transforms data that is requested from Amazon Simple Storage Service (Amazon S3) before it’s sent back to a client. The AWS Lambda function is invoked inline via a standard S3 GET request. It can return different results from the same document based on parameters, such as who is requesting the document. Figure 1 provides a high-level view of the different components that make up the solution.

Figure 1. Document processing architectural diagram

Authenticating users with Amazon Cognito

This architecture defines a RESTful API, but users will likely be using a mobile or web application that calls the API. Thus, the application will first need to authenticate users. We do this via Amazon Cognito, which functions as its own identity provider (IdP). You could also use an external IdP, including those that support OpenID Connect and SAML.

Validating the JSON Web Token with API Gateway

Once the user is successfully authenticated with Amazon Cognito, the application will be sent a JSON Web Token (JWT). This JWT contains information about the user and will be used in subsequent requests to the API.

Now that the application has a token, it will make a request to the API, which is provided by Amazon API Gateway. API Gateway provides a secure, scalable entryway into your application. The API Gateway validates the JWT sent from the client with Amazon Cognito to make sure it is valid. If it is validated, the request is accepted and sent on to the Lambda API Handler. If it’s not, the client gets rejected and sent an error code.

Storing user data with DynamoDB

When the Lambda API Handler receives the request, it parses the JWT to extract the user making the request. It then logs that user, file, and access time into Amazon DynamoDB. Optionally, you may use DynamoDB to store an encoded string that will be used as the watermark, rather than something in plaintext, like user name or email.

Generating the PDF and user-specific watermark

At this point, the Lambda API Handler sends an S3 GET request. However, instead of going to Amazon S3 directly, it goes to a different endpoint that invokes the S3 Object Lambda function. This endpoint is called an S3 Object Lambda Access Point. The S3 GET request contains the original file name and the string that will be used for the watermark.

The S3 Object Lambda function transforms the original file that it downloads from its source S3 bucket. It uses the open-source office suite LibreOffice (and specifically this Lambda layer) to convert the source document to PDF. Once it is converted, a JavaScript library (PDF-Lib) embeds the watermark into the PDF before it’s sent back to the Lambda API Handler function.

The Lambda API Handler stores the converted file in a temporary S3 bucket, generates a presigned URL, and sends that URL back to the client as a 302 redirect. Then the client sends a request to that presigned URL to get the converted file.

To keep the temporary S3 bucket tidy, we use an S3 lifecycle configuration with an expiration policy.

Figure 2. Process workflow for document transformation

Alternate approach

Before S3 Object Lambda was available, Lambda@Edge was used. However, there are three main issues with using Lambda@Edge instead of S3 Object Lambda:

It is designed to run code closer to the end user to decrease latency, but in this case, latency is not a major concern.
It requires using an Amazon CloudFront distribution, and the single-download pattern described here will not take advantage of Lamda@Edge’s caching.
It has quotas on memory that don’t lend themselves to complex libraries like OfficeLibre.

Extending this solution

This blog post describes the basic building blocks for the solution, but it can be extended relatively easily. For example, you could add another function to the API that would convert, resize, and watermark images. To do this, create an S3 Object Lambda function to perform those tasks. Then, add an S3 Object Lambda Access Point to invoke it based on a different API call.

API Gateway has many built-in security features, but you may want to enhance the security of your RESTful API. To do this, add enhanced security rules via AWS WAF. Integrating your IdP into Amazon Cognito can give you a single place to manage your users.

Monitoring any solution is critical, and understanding how an application is behaving end to end can greatly benefit optimization and troubleshooting. Adding AWS X-Ray and Amazon CloudWatch Lambda Insights will show you how functions and their interactions are performing.

Should you decide to extend this architecture, follow the architectural principles defined in AWS Well-Architected, and pay particular attention to the Serverless Application Lens.

Figure 3. Example expanded document processing architecture

Conclusion

You can implement this solution in a number of ways. However, by using S3 Object Lambda, you can transform documents without needing intermediary storage. S3 Object Lambda will also decouple your file logic from the rest of the application.

The Serverless on AWS components mentioned in this post allow you to reduce administrative overhead, saving you time and money.

Finally, the extensible nature of this architecture allows you to add functionality easily as your organization’s needs grow and change.

The following links provide more information on how to use S3 Object Lambda in your architectures:

Build Next-Generation Microservices with .NET 5 and gRPC on AWS

2021-08-23 Matt Cline

Post Syndicated from Matt Cline original https://aws.amazon.com/blogs/devops/next-generation-microservices-dotnet-grpc/

Modern architectures use multiple microservices in conjunction to drive customer experiences. At re:Invent 2015, AWS senior project manager Rob Brigham described Amazon’s architecture of many single-purpose microservices – including ones that render the “Buy” button, calculate tax at checkout, and hundreds more.

Microservices commonly communicate with JSON over HTTP/1.1. These technologies are ubiquitous and human-readable, but they aren’t optimized for communication between dozens or hundreds of microservices.

Next-generation Web technologies, including gRPC and HTTP/2, significantly improve communication speed and efficiency between microservices. AWS offers the most compelling experience for builders implementing microservices. Moreover, the addition of HTTP/2 and gRPC support in Application Load Balancer (ALB) provides an end-to-end solution for next-generation microservices. ALBs can inspect and route gRPC calls, enabling features like health checks, access logs, and gRPC-specific metrics.

This post demonstrates .NET microservices communicating with gRPC via Application Load Balancers. The microservices run on AWS Graviton2 instances, utilizing a custom-built 64-bit Arm processor to deliver up to 40% better price/performance than x86.

Architecture Overview

Modern Tacos is a new restaurant offering delivery. Customers place orders via mobile app, then they receive real-time status updates as their order is prepared and delivered.

The tutorial includes two microservices: “Submit Order” and “Track Order”. The Submit Order service receives orders from the app, then it calls the Track Order service to initiate order tracking. The Track Order service provides streaming updates to the app as the order is prepared and delivered.

Each microservice is deployed in an Amazon EC2 Auto Scaling group. Each group is behind an ALB that routes gRPC traffic to instances in the group.

Shows the communication flow of gRPC traffic from users through an ALB to EC2 instances. — This architecture is simplified to focus on ALB and gRPC functionality. Microservices are often deployed in
containers for elastic scaling, improved reliability, and efficient resource utilization. ALB, gRPC, and .NET all work equally effectively in these architectures.

Comparing gRPC and JSON for microservices

Microservices typically communicate by sending JSON data over HTTP. As a text-based format, JSON is readable, flexible, and widely compatible. However, JSON also has significant weaknesses as a data interchange format. JSON’s flexibility makes enforcing a strict API specification difficult — clients can send arbitrary or invalid data, so developers must write rigorous data validation code. Additionally, performance can suffer at scale due to JSON’s relatively high bandwidth and parsing requirements. These factors also impact performance in constrained environments, such as smartphones and IoT devices. gRPC addresses all of these issues.

gRPC is an open-source framework designed to efficiently connect services. Instead of JSON, gRPC sends messages via a compact binary format called Protocol Buffers, or protobuf. Although protobuf messages are not human-readable, they utilize less network bandwidth and are faster to encode and decode. Operating at scale, these small differences multiply to a significant performance gain.

gRPC APIs define a strict contract that is automatically enforced for all messages. Based on this contract, gRPC implementations generate client and server code libraries in multiple programming languages. This allows developers to use higher-level constructs to call services, rather than programming against “raw” HTTP requests.

gRPC also benefits from being built on HTTP/2, a major revision of the HTTP protocol. In addition to the foundational performance and efficiency improvements from HTTP/2, gRPC utilizes the new protocol to support bi-directional streaming data. Implementing real-time streaming prior to gRPC typically required a completely separate protocol (such as WebSockets) that might not be supported by every client.

gRPC for .NET developers

Several recent updates have made gRPC more useful to .NET developers. .NET 5 includes significant performance improvements to gRPC, and AWS has broad support for .NET 5. In May 2021, the .NET team announced their focus on a gRPC implementation written entirely in C#, called “grpc-dotnet”, which follows C# conventions very closely.

Instead of working with JSON, dynamic objects, or strings, C# developers calling a gRPC service use a strongly-typed client, automatically generated from the protobuf specification. This obviates much of the boilerplate validation required by JSON APIs, and it enables developers to use rich data structures. Additionally, the generated code enables full IntelliSense support in Visual Studio.

For example, the “Submit Order” microservice executes this code in order to call the “Track Order” microservice:

using var channel = GrpcChannel.ForAddress("https://track-order.example.com");

var trackOrderClient = new TrackOrder.Protos.TrackOrder.TrackOrderClient(channel);

var reply = await trackOrderClient.StartTrackingOrderAsync(new TrackOrder.Protos.Order
{
    DeliverTo = "Address",
    LastUpdated = Timestamp.FromDateTime(DateTime.UtcNow),
    OrderId = order.OrderId,
    PlacedOn = order.PlacedOn,
    Status = TrackOrder.Protos.OrderStatus.Placed
});

This code calls the StartTrackingOrderAsync method on the Track Order client, which looks just like a local method call. The method intakes a data structure that supports rich data types like DateTime and enumerations, instead of the loosely-typed JSON. The methods and data structures are defined by the Track Order service’s protobuf specification, and the .NET gRPC tools automatically generate the client and data structure classes without requiring any developer effort.

Configuring ALB for gRPC

To make gRPC calls to targets behind an ALB, create a load balancer target group and select gRPC as the protocol version. You can do this through the AWS Management Console, AWS Command Line Interface (CLI), AWS CloudFormation, or AWS Cloud Development Kit (CDK).

Screenshot of the AWS Management Console, showing how to configure a load balancer's target group for gRPC communication.

This CDK code creates a gRPC target group:

var targetGroup = new ApplicationTargetGroup(this, "TargetGroup", new ApplicationTargetGroupProps
{
    Protocol = ApplicationProtocol.HTTPS,
    ProtocolVersion = ApplicationProtocolVersion.GRPC,
    Vpc = vpc,
    Targets = new IApplicationLoadBalancerTarget {...}
});

gRPC requests work with target groups utilizing HTTP/2, but the gRPC protocol enables additional features including health checks, request count metrics, access logs that differentiate gRPC requests, and gRPC-specific response headers. gRPC also works with native ALB features like stickiness, multiple load balancing algorithms, and TLS termination.

Deploy the Tutorial

The sample provisions AWS resources via the AWS Cloud Development Kit (CDK). The CDK code is provided in C# so that .NET developers can use a familiar language.

The solution deployment steps include:

Configuring a domain name in Route 53.
Deploying the microservices.
Running the mobile app on AWS Device Farm.

The source code is available on GitHub.

Prerequisites

For this tutorial, you should have these prerequisites:

Sign up for an AWS account.
Complete the AWS CDK Getting Started guide.
Install the AWS CLI and set up your AWS credentials for command-line use – or, you can use the AWS Tools for PowerShell and set up your AWS credentials for PowerShell.
Create a public hosted zone in Amazon Route 53 for a domain name that you control. This will be the “parent” domain name for the microservices.
Install Visual Studio 2019.
Clone the GitHub repository to your computer.
Open a terminal (such as Bash) or a PowerShell prompt.

Configure the environment variables needed by the CDK. In the sample commands below, replace AWS_ACCOUNT_ID with your numeric AWS account ID. Replace AWS_REGION with the name of the region where you will deploy the sample, such as us-east-1 or us-west-2.

If you’re using a *nix shell such as Bash, run these commands:

export CDK_DEFAULT_ACCOUNT=AWS_ACCOUNT_ID
export CDK_DEFAULT_REGION=AWS_REGION

If you’re using PowerShell, run these commands:

$Env:CDK_DEFAULT_ACCOUNT="AWS_ACCOUNT_ID"
$Env:CDK_DEFAULT_REGION="AWS_REGION"
Set-DefaultAWSRegion -Region AWS_REGION

Throughout this tutorial, replace RED TEXT with the appropriate value.

Save the directory path where you cloned the GitHub repository. In the sample commands below, replace EXAMPLE_DIRECTORY with this path.

In your terminal or PowerShell, run these commands:

cd EXAMPLE_DIRECTORY/src/ModernTacoShop/Common/cdk
cdk bootstrap --context domain-name=PARENT_DOMAIN_NAME
cdk deploy --context domain-name=PARENT_DOMAIN_NAME

The CDK output includes the name of the S3 bucket that will store deployment packages. Save the name of this bucket. In the sample commands below, replace SHARED_BUCKET_NAME with this name.

Deploy the Track Order microservice

Compile the Track Order microservice for the Arm microarchitecture utilized by AWS Graviton2 processors. The TrackOrder.csproj file includes a target that automatically packages the compiled microservice into a ZIP file. You will upload this ZIP file to S3 for use by CodeDeploy. Next, you will utilize the CDK to deploy the microservice’s AWS infrastructure, and then install the microservice on the EC2 instance via CodeDeploy.

The CDK stack deploys these resources:

An Amazon EC2 Auto Scaling group.
An Application Load Balancer (ALB) using gRPC, targeting the Auto Scaling group and configured with microservice health checks.
A subdomain for the microservice, targeting the ALB.
A DynamoDB table used by the microservice.
CodeDeploy infrastructure to deploy the microservice to the Auto Scaling group.

If you’re using the AWS CLI, run these commands:

cd EXAMPLE_DIRECTORY/src/ModernTacoShop/TrackOrder/src/
dotnet publish --runtime linux-arm64 --self-contained
aws s3 cp ./bin/TrackOrder.zip s3://SHARED_BUCKET_NAME
etag=$(aws s3api head-object --bucket SHARED_BUCKET_NAME \
    --key TrackOrder.zip --query ETag --output text)
cd ../cdk
cdk deploy

The CDK output includes the name of the CodeDeploy deployment group. Use this name to run the next command:

aws deploy create-deployment --application-name ModernTacoShop-TrackOrder \
    --deployment-group-name TRACK_ORDER_DEPLOYMENT_GROUP_NAME \
    --s3-location bucket=SHARED_BUCKET_NAME,bundleType=zip,key=TrackOrder.zip,etag=$etag \
    --file-exists-behavior OVERWRITE

If you’re using PowerShell, run these commands:

cd EXAMPLE_DIRECTORY/src/ModernTacoShop/TrackOrder/src/
dotnet publish --runtime linux-arm64 --self-contained
Write-S3Object -BucketName SHARED_BUCKET_NAME `
    -Key TrackOrder.zip `
    -File ./bin/TrackOrder.zip
Get-S3ObjectMetadata -BucketName SHARED_BUCKET_NAME `
    -Key TrackOrder.zip `
    -Select ETag `
    -OutVariable etag
cd ../cdk
cdk deploy

The CDK output includes the name of the CodeDeploy deployment group. Use this name to run the next command:

New-CDDeployment -ApplicationName ModernTacoShop-TrackOrder `
    -DeploymentGroupName TRACK_ORDER_DEPLOYMENT_GROUP_NAME `
    -S3Location_Bucket SHARED_BUCKET_NAME `
    -S3Location_BundleType zip `
    -S3Location_Key TrackOrder.zip `
    -S3Location_ETag $etag[0] `
    -RevisionType S3 `
    -FileExistsBehavior OVERWRITE

Deploy the Submit Order microservice

The steps to deploy the Submit Order microservice are identical to the Track Order microservice. See that section for details.

If you’re using the AWS CLI, run these commands:

cd EXAMPLE_DIRECTORY/src/ModernTacoShop/SubmitOrder/src/
dotnet publish --runtime linux-arm64 --self-contained
aws s3 cp ./bin/SubmitOrder.zip s3://SHARED_BUCKET_NAME
etag=$(aws s3api head-object --bucket SHARED_BUCKET_NAME \
    --key SubmitOrder.zip --query ETag --output text)
cd ../cdk
cdk deploy

The CDK output includes the name of the CodeDeploy deployment group. Use this name to run the next command:

aws deploy create-deployment --application-name ModernTacoShop-SubmitOrder \
    --deployment-group-name SUBMIT_ORDER_DEPLOYMENT_GROUP_NAME \
    --s3-location bucket=SHARED_BUCKET_NAME,bundleType=zip,key=SubmitOrder.zip,etag=$etag \
    --file-exists-behavior OVERWRITE

If you’re using PowerShell, run these commands:

cd EXAMPLE_DIRECTORY/src/ModernTacoShop/SubmitOrder/src/
dotnet publish --runtime linux-arm64 --self-contained
Write-S3Object -BucketName SHARED_BUCKET_NAME `
    -Key SubmitOrder.zip `
    -File ./bin/SubmitOrder.zip
Get-S3ObjectMetadata -BucketName SHARED_BUCKET_NAME `
    -Key SubmitOrder.zip `
    -Select ETag `
    -OutVariable etag
cd ../cdk
cdk deploy

The CDK output includes the name of the CodeDeploy deployment group. Use this name to run the next command:

New-CDDeployment -ApplicationName ModernTacoShop-SubmitOrder `
    -DeploymentGroupName SUBMIT_ORDER_DEPLOYMENT_GROUP_NAME `
    -S3Location_Bucket SHARED_BUCKET_NAME `
    -S3Location_BundleType zip `
    -S3Location_Key SubmitOrder.zip `
    -S3Location_ETag $etag[0] `
    -RevisionType S3 `
    -FileExistsBehavior OVERWRITE

Data flow diagram

Architecture diagram showing the complete data flow of the sample gRPC microservices application. — The app submits an order via gRPC.

The Submit Order ALB routes the gRPC call to an instance.

The Submit Order instance stores order data.

The Submit Order instance calls the Track Order service via gRPC.

The Track Order ALB routes the gRPC call to an instance.

The Track Order instance stores tracking data.

The app calls the Track Order service, which streams the order’s location during delivery.

Test the microservices

Once the CodeDeploy deployments have completed, test both microservices.

First, check the load balancers’ status. Go to Target Groups in the AWS Management Console, which will list one target group for each microservice. Click each target group, then click “Targets” in the lower details pane. Every EC2 instance in the target group should have a “healthy” status.

Next, verify each microservice via gRPCurl. This tool lets you invoke gRPC services from the command line. Install gRPCurl using the instructions, and then test each microservice:

grpcurl submit-order.PARENT_DOMAIN_NAME:443 modern_taco_shop.SubmitOrder/HealthCheck
grpcurl track-order.PARENT_DOMAIN_NAME:443 modern_taco_shop.TrackOrder/HealthCheck

If a service is healthy, it will return an empty JSON object.

Run the mobile app

You will run a pre-compiled version of the app on AWS Device Farm, which lets you test on a real device without managing any infrastructure. Alternatively, compile your own version via the AndroidApp.FrontEnd project within the solution located at EXAMPLE_DIRECTORY/src/ModernTacoShop/AndroidApp/AndroidApp.sln.

Go to Device Farm in the AWS Management Console. Under “Mobile device testing projects”, click “Create a new project”. Enter “ModernTacoShop” as the project name, and click “Create Project”. In the ModernTacoShop project, click the “Remote access” tab, then click “Start a new session”. Under “Choose a device”, select the Google Pixel 3a running OS version 10, and click “Confirm and start session”.

Screenshot of the AWS Device Farm showing a Google Pixel 3a.

Once the session begins, click “Upload” in the “Install applications” section. Unzip and upload the APK file located at EXAMPLE_DIRECTORY/src/ModernTacoShop/AndroidApp/com.example.modern_tacos.grpc_tacos.apk.zip, or upload an APK that you created.

Screenshot of the gRPC microservices demo Android app, showing the map that displays streaming location data.

Screenshot of the gRPC microservices demo Android app, on the order preparation screen.

Once the app has uploaded, drag up from the bottom of the device screen in order to reach the “All apps” screen. Click the ModernTacos app to launch it.

Once the app launches, enter the parent domain name in the “Domain Name” field. Click the “+” and “-“ buttons next to each type of taco in order to create your order, then click “Submit Order”. The order status will initially display as “Preparing”, and will switch to “InTransit” after about 30 seconds. The Track Order service will stream a random route to the app, updating with new position data every 5 seconds. After approximately 2 minutes, the order status will change to “Delivered” and the streaming updates will stop.

Once you’ve run a successful test, click “Stop session” in the console.

Cleaning up

To avoid incurring charges, use the cdk destroy command to delete the stacks in the reverse order that you deployed them.

You can also delete the resources via CloudFormation in the AWS Management Console.

In addition to deleting the stacks, you must delete the Route 53 hosted zone and the Device Farm project.

Conclusion

This post demonstrated multiple next-generation technologies for microservices, including end-to-end HTTP/2 and gRPC communication over Application Load Balancer, AWS Graviton2 processors, and .NET 5. These technologies enable builders to create microservices applications with new levels of performance and efficiency.

Matt Cline

Matt Cline is a Solutions Architect at Amazon Web Services, supporting customers in his home city of Pittsburgh PA. With a background as a full-stack developer and architect, Matt is passionate about helping customers deliver top-quality applications on AWS. Outside of work, Matt builds (and occasionally finishes) scale models and enjoys running a tabletop role-playing game for his friends.

Ulili Nhaga

Ulili Nhaga is a Cloud Application Architect at Amazon Web Services in San Diego, California. He helps customers modernize, architect, and build highly scalable cloud-native applications on AWS. Outside of work, Ulili loves playing soccer, cycling, Brazilian BBQ, and enjoying time on the beach.

How MEDHOST’s cardiac risk prediction successfully leveraged AWS analytic services

2021-08-23 Pandian Velayutham

Post Syndicated from Pandian Velayutham original https://aws.amazon.com/blogs/big-data/how-medhosts-cardiac-risk-prediction-successfully-leveraged-aws-analytic-services/

MEDHOST has been providing products and services to healthcare facilities of all types and sizes for over 35 years. Today, more than 1,000 healthcare facilities are partnering with MEDHOST and enhancing their patient care and operational excellence with its integrated clinical and financial EHR solutions. MEDHOST also offers a comprehensive Emergency Department Information System with business and reporting tools. Since 2013, MEDHOST’s cloud solutions have been utilizing Amazon Web Services (AWS) infrastructure, data source, and computing power to solve complex healthcare business cases.

MEDHOST can utilize the data available in the cloud to provide value-added solutions for hospitals solving complex problems, like predicting sepsis, cardiac risk, and length of stay (LOS) as well as reducing re-admission rates. This requires a solid foundation of data lake and elastic data pipeline to keep up with multi-terabyte data from thousands of hospitals. MEDHOST has invested a significant amount of time evaluating numerous vendors to determine the best solution for its data needs. Ultimately, MEDHOST designed and implemented machine learning/artificial intelligence capabilities by leveraging AWS Data Lab and an end-to-end data lake platform that enables a variety of use cases such as data warehousing for analytics and reporting.

Since you’re reading this post, you may also be interested in the following:

MEDHOST expands multi-tenant cloud EHR capabilities with AWS (AWS for Industries)

Getting started

MEDHOST’s initial objectives in evaluating vendors were to:

Build a low-cost data lake solution to provide cardiac risk prediction for patients based on health records
Provide an analytical solution for hospital staff to improve operational efficiency
Implement a proof of concept to extend to other machine learning/artificial intelligence solutions

The AWS team proposed AWS Data Lab to architect, develop, and test a solution to meet these objectives. The collaborative relationship between AWS and MEDHOST, AWS’s continuous innovation, excellent support, and technical solution architects helped MEDHOST select AWS over other vendors and products. AWS Data Lab’s well-structured engagement helped MEDHOST define clear, measurable success criteria that drove the implementation of the cardiac risk prediction and analytical solution platform. The MEDHOST team consisted of architects, builders, and subject matter experts (SMEs). By connecting MEDHOST experts directly to AWS technical experts, the MEDHOST team gained a quick understanding of industry best practices and available services allowing MEDHOST team to achieve most of the success criteria at the end of a four-day design session. MEDHOST is now in the process of moving this work from its lower to upper environment to make the solution available for its customers.

Solution

For this solution, MEDHOST and AWS built a layered pipeline consisting of ingestion, processing, storage, analytics, machine learning, and reinforcement components. The following diagram illustrates the Proof of Concept (POC) that was implemented during the four-day AWS Data Lab engagement.

Ingestion layer

The ingestion layer is responsible for moving data from hospital production databases to the landing zone of the pipeline.

The hospital data was stored in an Amazon RDS for PostgreSQL instance and moved to the landing zone of the data lake using AWS Database Migration Service (DMS). DMS made migrating databases to the cloud simple and secure. Using its ongoing replication feature, MEDHOST and AWS implemented change data capture (CDC) quickly and efficiently so MEDHOST team could spend more time focusing on the most interesting parts of the pipeline.

Processing layer

The processing layer was responsible for performing extract, tranform, load (ETL) on the data to curate them for subsequent uses.

MEDHOST used AWS Glue within its data pipeline for crawling its data layers and performing ETL tasks. The hospital data copied from RDS to Amazon S3 was cleaned, curated, enriched, denormalized, and stored in parquet format to act as the heart of the MEDHOST data lake and a single source of truth to serve any further data needs. During the four-day Data Lab, MEDHOST and AWS targeted two needs: powering MEDHOST’s data warehouse used for analytics and feeding training data to the machine learning prediction model. Even though there were multiple challenges, data curation is a critical task which requires an SME. AWS Glue’s serverless nature, along with the SME’s support during the Data Lab, made developing the required transformations cost efficient and uncomplicated. Scaling and cluster management was addressed by the service, which allowed the developers to focus on cleaning data coming from homogenous hospital sources and translating the business logic to code.

Storage layer

The storage layer provided low-cost, secure, and efficient storage infrastructure.

MEDHOST used Amazon S3 as a core component of its data lake. AWS DMS migration tasks saved data to S3 in .CSV format. Crawling data with AWS Glue made this landing zone data queryable and available for further processing. The initial AWS Glue ETL job stored the parquet formatted data to the data lake and its curated zone bucket. MEDHOST also used S3 to store the .CSV formatted data set that will be used to train, test, and validate its machine learning prediction model.

Analytics layer

The analytics layer gave MEDHOST pipeline reporting and dashboarding capabilities.

The data was in parquet format and partitioned in the curation zone bucket populated by the processing layer. This made querying with Amazon Athena or Amazon Redshift Spectrum fast and cost efficient.

From the Amazon Redshift cluster, MEDHOST created external tables that were used as staging tables for MEDHOST data warehouse and implemented an UPSERT logic to merge new data in its production tables. To showcase the reporting potential that was unlocked by the MEDHOST analytics layer, a connection was made to the Redshift cluster to Amazon QuickSight. Within minutes MEDHOST was able to create interactive analytics dashboards with filtering and drill-down capabilities such as a chart that showed the number of confirmed disease cases per US state.

Machine learning layer

The machine learning layer used MEDHOST’s existing data sets to train its cardiac risk prediction model and make it accessible via an endpoint.

Before getting into Data Lab, the MEDHOST team was not intimately familiar with machine learning. AWS Data Lab architects helped MEDHOST quickly understand concepts of machine learning and select a model appropriate for its use case. MEDHOST selected XGBoost as its model since cardiac prediction falls within regression technique. MEDHOST’s well architected data lake enabled it to quickly generate training, testing, and validation data sets using AWS Glue.

Amazon SageMaker abstracted underlying complexity of setting infrastructure for machine learning. With few clicks, MEDHOST started Jupyter notebook and coded the components leading to fitting and deploying its machine learning prediction model. Finally, MEDHOST created the endpoint for the model and ran REST calls to validate the endpoint and trained model. As a result, MEDHOST achieved the goal of predicting cardiac risk. Additionally, with Amazon QuickSight’s SageMaker integration, AWS made it easy to use SageMaker models directly in visualizations. QuickSight can call the model’s endpoint, send the input data to it, and put the inference results into the existing QuickSight data sets. This capability made it easy to display the results of the models directly in the dashboards. Read more about QuickSight’s SageMaker integration here.

Reinforcement layer

Finally, the reinforcement layer guaranteed that the results of the MEDHOST model were captured and processed to improve performance of the model.

The MEDHOST team went beyond the original goal and created an inference microservice to interact with the endpoint for prediction, enabled abstracting of the machine learning endpoint with the well-defined domain REST endpoint, and added a standard security layer to the MEDHOST application.

When there is a real-time call from the facility, the inference microservice gets inference from the SageMaker endpoint. Records containing input and inference data are fed to the data pipeline again. MEDHOST used Amazon Kinesis Data Streams to push records in real time. However, since retraining the machine learning model does not need to happen in real time, the Amazon Kinesis Data Firehose enabled MEDHOST to micro-batch records and efficiently save them to the landing zone bucket so that the data could be reprocessed.

Conclusion

Collaborating with AWS Data Lab enabled MEDHOST to:

Store single source of truth with low-cost storage solution (data lake)
Complete data pipeline for a low-cost data analytics solution
Create an almost production-ready code for cardiac risk prediction

The MEDHOST team learned many concepts related to data analytics and machine learning within four days. AWS Data Lab truly helped MEDHOST deliver results in an accelerated manner.

About the Authors

Pandian Velayutham is the Director of Engineering at MEDHOST. His team is responsible for delivering cloud solutions, integration and interoperability, and business analytics solutions. MEDHOST utilizes modern technology stack to provide innovative solutions to our customers. Pandian Velayutham is a technology evangelist and public cloud technology speaker.

George Komninos is a Data Lab Solutions Architect at AWS. He helps customers convert their ideas to a production-ready data product. Before AWS, he spent 3 years at Alexa Information domain as a data engineer. Outside of work, George is a football fan and supports the greatest team in the world, Olympiacos Piraeus.

Queue Integration with Third-party Services on AWS

2021-08-23 Rostislav Markov

Post Syndicated from Rostislav Markov original https://aws.amazon.com/blogs/architecture/queue-integration-with-third-party-services-on-aws/

Commercial off-the-shelf software and third-party services can present an integration challenge in event-driven workflows when they do not natively support AWS APIs. This is even more impactful when a workflow is subject to unpredicted usage spikes, and you want to increase decoupling and fault tolerance. Given the third-party nature of services, polling an Amazon Simple Queue Service (SQS) queue and having built-in AWS API handling logic may not be an immediate option.

In such cases, AWS Lambda helps out-task the Amazon SQS queue integration and AWS API handling to an additional layer. The success of this depends on how well exception handling is implemented across the different interacting services. In this blog post, we outline issues to consider when adopting this design pattern. We also share a reusable solution.

Design pattern for third-party integration with SQS

With this design pattern, one or more services (producers) asynchronously invoke other third-party downstream consumer services. They publish messages to an Amazon SQS queue, which acts as buffer for requests. Producers provide all commands and other parameters required for consumer service execution with the message.

As messages are written to the queue, the queue is configured to invoke a message broker (implemented as AWS Lambda) for each message. AWS Lambda can interact natively with target AWS services such as Amazon EC2, Amazon Elastic Container Service (ECS), or Amazon Elastic Kubernetes Service (EKS). It can also be configured to use an Amazon Virtual Private Cloud (VPC) interface endpoint to establish a connection to VPC resources without traversing the internet. The message broker assigns the tasks to consumer services by invoking the RunTask API of Amazon ECS and AWS Fargate (see Figure 1.)

Figure 1. On-premises and AWS queue integration for third-party services using AWS Lambda

The message broker asynchronously invokes the API in ‘fire-and-forget’ mode. Therefore, error handling must be built in to respond to API invocation errors. In an event-driven scenario, an error will be invoked if you asynchronously call the third-party service hundreds or thousands of times and reach Service Quotas. This is a potential issue with RunTask API actions, or a large volume of concurrent tasks running on AWS Fargate. Two mechanisms can help implement troubleshooting API request errors.

API retries with exponential backoff. The message broker retries for a number of times with configurable sleep intervals and exponential backoff in-between. This enforces progressively longer waits between retries for consecutive error responses. If the RunTask API fails to process the request and initiate the third-party service, the message remains in the queue for a subsequent retry. The AWS General Reference provides further guidance.
API error handling. Error handling and consequent logging should be implemented at every step. Since there are several services working together in tandem, crucial debugging information from errors may be lost. Additionally, error handling also provides opportunity to define automated corrective actions or notifications on event occurrence. The message broker can publish failure notifications including the root cause to an Amazon Simple Notification Service (SNS) topic.

SNS topic subscription can be configured via different protocols. You can email a distribution group for active monitoring and processing of errors. If persistence is required for messages that failed to process, error handling can be associated directly with SQS by configuring a dead letter queue.

Reference implementation for third-party integration with SQS

We implemented the design pattern in Figure 1, with Broad Institute’s Cell Painting application workflow. This is for morphological profiling from microscopy cell images running on Amazon EC2. It interacts with CellProfiler version 3.0 cell image analysis software as the downstream consumer hosted on ECS/Fargate. Every invocation of CellProfiler required approximately 1,500 tasks for a single processing step.

Resource constraints determined the rate of scale-out. In this case, it was for an Amazon ECS task creation. Address space for Amazon ECS subnets should be large enough to prevent running out of available IPs within your VPC. If Amazon ECS Service Quotas provide further constraints, a quota increase can be requested.

Exceptions must be handled both when validating and initiating requests. As part of the validation workflow, exceptions are captured as follows, also shown in Figure 2.

1. Invalid arguments exception. The message broker validates that the SQS message contains all the needed information to initiate the ECS task. This information includes subnets, security groups and container names required to start the ECS task, and else raises exception.

2. Retry limit exception. On each iteration, the message broker will evaluate whether the SQS retry limit has been reached, before invoking the RunTask API. It will then exit, by sending failure notification to SNS when the retry limit is reached.

Figure 2. Exception handling flow during request validation

As part of the initiation workflow, exceptions are handled as follows, shown in Figure 3:

1. ECS/Fargate API and concurrent execution limitations. The message broker catches API exceptions when calling the API RunTask operation. These exceptions can include:

- When the call to the launch tasks exceeds the maximum allowed API request limit for your AWS account
- When failing to retrieve security group information
- When you have reached the limit on the number of tasks you can run concurrently

With each of the preceding exceptions, the broker will increase the retry count.

2. Networking and IP space limitations. Network interface timeouts received after initiating the ECS task set off an Amazon CloudWatch Events rule, causing the message broker to re-initiate the ECS task.

Figure 3. Exception handling flow during request initiation

While we specifically address downstream consumer services running on ECS/Fargate, this solution can be adjusted for third-party services running on Amazon EC2 or EKS. With EC2, the message broker must be adjusted to interact with the RunInstances API, and include troubleshooting API request errors. Integration with downstream consumers on Amazon EKS requires that the AWS Lambda function is associated via the IAM role with a Kubernetes service account. A Python client for Kubernetes can be used to simplify interaction with the Kubernetes REST API and AWS Lambda would invoke the run API.

Conclusion

This pattern is useful when queue polling is not an immediate option. This is typical with event-driven workflows involving third-party services and vendor applications subject to unpredictable, intermittent load spikes. Exception handling is essential for these types of workflows. Offloading AWS API handling to a separate layer orchestrated by AWS Lambda can improve the resiliency of such third-party services on AWS. This pattern represents an incremental optimization until the third party provides native SQS integration. It can be achieved with the initial move to AWS, for example as part of the V1 AWS design strategy for third-party services.

Some limitations should be acknowledged. While the pattern enables graceful failure, it does not prevent the overloading of the ECS RunTask API. By invoking Amazon ECS RunTask API in ‘fire-and-forget’ mode, it does not monitor service execution once a task was successfully invoked. Therefore, it should be adopted when direct queue polling is not an option. In our example, Broad Institute’s CellProfiler application enabled direct queue polling with its subsequent product version of Distributed CellProfiler.

Further reading

The referenced deployment with consumer services on Amazon ECS can be accessed via AWSLabs.

Simplify data discovery for business users by adding data descriptions in the AWS Glue Data Catalog

2021-08-23 Karim Hammouda

Post Syndicated from Karim Hammouda original https://aws.amazon.com/blogs/big-data/simplify-data-discovery-for-business-users-by-adding-data-descriptions-in-the-aws-glue-data-catalog/

In this post, we discuss how to use AWS Glue Data Catalog to simplify the process for adding data descriptions and allows data analysts to access, search, and discover this cataloged metadata with BI tools.

In this solution, we use AWS Glue Data Catalog, to break the silos between cross-functional data producer teams, sometimes also known as domain data experts, and business-focused consumer teams that author business intelligence (BI) reports and dashboards.

Since you’re reading this post, you may also be interested in the following:

Build a DataOps platform to break the silos between data engineers and data analysts around running business logic for data processing

Data democratization and the need for self-service BI

To be able to extract insights and get value out of organizational-wide data assets, data consumers like data analysts need to understand the meaning of existing data assets. They rely on data platform engineers to perform such data discovery tasks on their behalf.

Although data platform engineers can programmatically extract and obtain some technical and operational metadata, such as database and table names and sizes, column schemas, and keys, this metadata is primarily used for organizing and manipulating data inside the data lake. They still rely on source data domain experts to gain more knowledge about the meaning of the data, its business context, and classification. It becomes more challenging when data domain experts tend to prioritize operational-critical requests and delay the analytical-related ones.

Such a cycled dependency, as illustrated in the following figure, can delay the organizational strategic vision for implementing a self-service data analytics platform to reduce the time of the data-to-insights process.

Solution overview

The Data Catalog fundamentally holds basic information about the actual data stored in various data sources, including but not limited to Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), and Amazon Redshift. Information like data location, format, and columns schema can be automatically discovered and stored as tables, where each table specifies a single data store.

Throughout this post, we see how we can use the Data Catalog to make it easy for domain experts to add data descriptions, and for data analysts to access this metadata with BI tools.

First, we use the comment field in Data Catalog schema tables to store data descriptions with more business meaning. Comment fields aren’t like the other schema table fields (such as column name, data type, and partition key), which are typically populated automatically by an AWS Glue crawler.

We also use Amazon AI/ML capabilities to initially identify the description of each data entity. One way to do that is by using the Amazon Comprehend text analysis API. When we provide a sample of values for each data entity type, Amazon Comprehend natural language processing (NLP) models can identify a standard range of data classification, and we can use this as a description for identified data entities.

Next, because we need to identify entities unique to our domain or organization, we can use custom named entity recognition (NER) in Amazon Comprehend to add more metadata that is related to our business domain. One way to train custom NER models is to use Amazon SageMaker Ground Truth; for more information, see Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend.

For this post, we use a dataset that has a table schema defined as per TPC-DS, and was generated using a data generator developed as part of AWS Analytics Reference Architecture code samples.

In this example, Amazon Comprehend API recognizes PII-related fields like Aid as a MAC address. While the none PII-related fields like Estatus, aren’t recognized. Therefore, the user enters a custom description manually, and we use the custom NER to automatically populate those fields, as shown in the following diagram.

After we add data meanings, we need to expose all the metadata captured in the Data Catalog to various data consumers. This can be done two different ways:

Via the AWS Glue Data Catalog API, which is more suitable for technical users comprehending ETL jobs programmatically using services like AWS Glue ETL or Amazon EMR
Via SQL-querying to information_schema for data analysts performing ad hoc SQL querying using services like Amazon Athena

We can also use the latter method to expose the Data Catalog to BI authors comprehending data analyses and dashboards using Amazon QuickSight, so we use the second method for this post.

We do this by defining an Athena dataset that queries the information_schema and allows BI authors to use the QuickSight capability of text search filter to search and discover data using its business meaning (see the following diagram).

Solution details

The core part of this solution is done using AWS Glue jobs. We use two AWS Glue jobs, which are responsible for calling Amazon Comprehend APIs and updating the AWS Glue Data Catalog with added data descriptions accordingly.

The first job (Glue_Comprehend_Job) performs the first stage of detection using the Amazon Comprehend Detect PII API, and the second job (Glue_Comprehend_Custom) uses Amazon Comprehend custom entity recognition for entities labeled by domain experts. The following diagram illustrates this workflow.

We describe the details of each stage in the upcoming sections.

You can integrate this workflow into your existing data processing pipeline, which might be orchestrated with AWS services like AWS Step Functions, Amazon Managed Workflows for Apache Airflow (Amazon MWAA), AWS Glue workflows, or any third-party orchestrator.

The workflow can complement AWS Glue crawler functionality and inherit the same logic for scheduling and running crawlers. On the other end, we can query the updated Data Catalog with data descriptions via Athena (see the following diagram).

To show an end-to-end implementation of this solution, we have adopted a choreographically built architecture with additional AWS Lambda helper functions, which communicate between AWS services, triggering the AWS Glue crawler and AWS Glue jobs.

Stage-one: Enrich the Data Catalog with a standard built-in Amazon Comprehend entity detector

To get started, Choose to launch a CloudFormation stack.

Define unique S3 bucket name and on the CloudFormation console, accept default values for the parameters.

This CloudFormation stack consists of the following:

An AWS Identity Access Management (IAM) role called Lambda-S3-Glue-comprehend.
An S3 bucket with a bucket name that can be defined based on preference.
A Lambda function called trigger_data_cataloging. This function is automatically triggered when any CSV file is uploaded to the folder row_data inside our S3 bucket. Then it creates an AWS Glue database if one doesn’t exist, and creates and runs an AWS Glue crawler called glue_crawler_comprehend.
An AWS Glue job called Glue_Comprehend_Job, which calls Amazon Comprehend APIs and updates the AWS Glue Data Catalog table accordingly.
A Lambda function called Glue_comprehend_workflow, which is triggered when the AWS Glue Crawler successfully finishes and calls the AWS Glue job Glue_Comprehend_Job.

To test the solution, create a prefix called row_data under the S3 bucket created from the CF stack, then upload the customer dataset sample to the prefix.

The first Lambda function is triggered to run the subsequent AWS Glue crawler and AWS Glue job to get data descriptions using Amazon Comprehend, and it updates the comment section of the dataset created in the AWS Glue Data Catalog.

Stage-two: Use Amazon Comprehend custom entity recognition

Amazon Comprehend was able to detect some of the entity types within the customer sample dataset. However, for the remaining undetected fields, we can get help from a domain data expert to label a sample dataset using Ground Truth. Then we use the labeled data output to train a custom NER model and rerun the AWS Glue job to update the comment column with a customized data description.

Train an Amazon Comprehend custom entity recognition model

One way to train Amazon Comprehend custom entity recognizers is to get augmented manifest information using Ground Truth to label the data. Ground Truth has a built-in NER task for creating labeling jobs so domain experts can identify entities in text. To learn more about how to create the job, see Named Entity Recognition.

As an example, we tagged three labels entities: customer information ID, current level of education, and customer credit rating. The domain experts get a web interface like one shown in the following screenshot to label the dataset.

We can use the output of the labeling job to train an Amazon Comprehend custom entity recognition model using the augmented manifest.

The augmented manifest option requires a minimum of 1,000 custom entity recognition samples. Another option can be to use a CSV file that contains the annotations of the entity lists for the training dataset. The required format depends on the type of CSV file that we provide. In this post, we use the CSV entity lists option with two sample files:

The entity list labels_entity_list .csv
The sample dataset training_dataset.csv

To create the training job, we can use the Amazon Comprehend console, the AWS Command Line Interface (AWS CLI), or the Amazon Comprehend API. For this post, we use the API to programmatically create a training Lambda function using the AWS SDK for Python, as shown on GitHub.

The training process can take approximately 15 minutes. When the training process is complete, choose the recognizer and make a note of the recognizer ARN, which we use in the next step.

Run custom entity recognition inference

When the training job is complete, create an Amazon Comprehend analysis job using the console or APIs as shown on GitHub.

The process takes approximately 10 minutes, and again we need to make a note of the output job file.

Create an AWS Glue job to update the Data Catalog

Now that we have the Amazon Comprehend inference output, we can use the following AWS CLI command to create an AWS Glue job that updates the Data Catalog Comment fields for this dataset with customized data description.

Download the AWS Glue job script from the GitHub repo, upload to the S3 bucket created from the CF Stack in stage-1, and run the following AWS CLI command:

aws glue create-job 
--name "Glue_Comprehend_Job_custom_entity" 
--role "Lambda-S3-Glue-comprehend" 
--command '{"Name" : "pythonshell", "ScriptLocation" : "s3://<Your S3 bucket>/glue_comprehend_workflow_custom.py","PythonVersion":"3"}'
--default-arguments '{"--extra-py-files": "s3://aws-bigdata-blog/artifacts/simplify-data-discovery-for-business-users/blog/python/library/boto3-1.17.70-py2.py3-none-any.whl" }'

After you create the AWS Glue job, edit the job script and update the bucket and key name variables with the output data location of the Amazon Comprehend analysis jobs and run the AWS Glue job. See the following code:

bucket ="<Bucket Name>"
key = "comprehend_output/<Random number>output/output.tar.gz"

When the job is complete, it updates the Data Catalog with customized data descriptions.

Expose Data Catalog data to data consumers for search and discovery

Data consumers that prefer using SQL can use Athena to run queries against the information_schema.columns table, which includes the comment field of the Data Catalog. See the following code:

SELECT table_catalog,
         table_schema,
         table_name,
         column_name,
         data_type,
         comment
FROM information_schema.columns
WHERE comment LIKE '%customer%'
AND table_name = 'row_data_row_data'

The following screenshot shows our query results.

The query searches all schema columns that might have any data meanings that contain customer; it returns crating, which contains customer in the comment field.

BI authors can use text search instead of SQL to search for data meanings of data stored in an S3 data lake. This can be done by setting up a visual layer on top of Athena inside QuickSight.

QuickSight is scalable, serverless, embeddable, and machine learning (ML) powered BI tool that is deeply integrated with other AWS services.

BI development in QuickSight is organized as a stack of datasets, analyses, and dashboards. We start by defining a dataset from a list of various integrated data sources. On top of this dataset, we can design multiple analyses to uncover hidden insights and trends in the dataset. Finally, we can publish these analyses as dashboards, which is the consumable form that can be shared and viewed across different business lines and stakeholders.

We want to help the BI authors while designing analyses to get a better knowledge of the datasets they’re working on. To do so, we first need to connect to the data source where the metadata is stored, in this case the Athena table information_schema.columns, so we create a dataset to act as a Data Catalog view inside QuickSight.

QuickSight offers different modes of querying data sources, which is decided as part of the dataset creation process. The first mode is called direct query, in which the fetching query runs directly against the external data source. The second mode is a caching layer called QuickSight Super-fast Parallel In-memory Calculation Engine (SPICE), which improves performance when data is shared and retrieved by various BI authors. In this mode, the data is stored locally and can be reused multiple times, instead of running queries against the data source every time the data needs to be retrieved. However, as with all caching solutions, you must take data volume limits into consideration while choosing datasets to be stored in SPICE.

In our case, we choose to keep the Data Catalog dataset in SPICE, because the volume of the dataset is relatively small and won’t consume a lot of SPICE resources. However, we need to decide if we want to refresh the data cached in SPICE. The answer depends on how frequently the data schema and Data Catalog change, but in any case we can use the built-in scheduling within QuickSight to refresh SPICE at the desired interval. For information about triggering a refresh in an event-based manner, see Event-driven refresh of SPICE datasets in Amazon QuickSight.

After we create the Data Catalog view as a dataset inside QuickSight stored in SPICE, we can use row-level security to restrict the access to this dataset. Each BI author has access with respect to their privileges for columns they can view metadata for.

Next, we see how we can allow BI authors to search through data descriptions populated in the comment field of the Data Catalog dataset. QuickSight offers features like filters, parameters, and controls to add more flexibility into QuickSight analyses and dashboards.

Finally, we use the QuickSight capability to add more than one dataset within an analysis view to allow BI authors to switch between the metadata for the dataset and the actual dataset. This allows the BI authors to self-serve, reducing dependency on data platform engineers to decide which columns they should use in their analyses.

To set up a simple Data Catalog search and discovery inside QuickSight, complete the following steps:

On the QuickSight console, choose Datasets in the navigation pane.
Choose New dataset.
For New data sources, choose Amazon Athena.
Name the dataset Data Catalog.
Choose Create data source.
For Choose your table, choose Use custom SQL.
For Enter custom SQL query, name the query Data Catalog Query.
Enter the following query:

SELECT * FROM information_schema.columns

Choose Confirm query.
Select Import to QuickSight SPICE for quicker analytics.
Choose Visualize.

Next, we design an analysis on the dataset we just created to access the Data Catalog through Athena.

When we choose Visualize, we’re redirected to the QuickSight workspace to start designing our analysis.

Under Visual types, choose Table.
Under Fields list, add table_name, column_name, and comment to the Values field well.

Next, we use the filter control feature to allow users to perform text search for data descriptions.

In the navigation pane, choose Filter.
Choose the plus sign (+) to access the Create a new filter list.
On the list of columns, choose comment to be the filter column.
From the options menu (…) on the filter, choose Add to sheet.

We should be able to see a new control being added into our analysis to allow users to search the comment field.

Now we can start a text search for data descriptions that contain customer, where QuickSight shows the list of fields matching the search criteria and provides table and column names accordingly.

Alternatively, we can use parameters to be associated with the filter control if needed, for example to connect one dashboard to another. For more information, see the GitHub repo.

Finally, BI authors can switch between the metadata view that we just created and the actual Athena table view (row_all_row_data), assuming it’s already imported (if not, we can use the same steps from earlier to import the new dataset).

In the navigation pane, choose Visualize.
Choose the pen icon to add, edit, replace, or remove datasets.
Choose Add dataset.
Add row_all_row_data.
Choose Select.

BI authors can now switch between data and metadata datasets.

They now have a metadata view along with the actual data view, so they can better understand the meaning of each column in the dataset they’re working on, and they can read any comment that can be passed from other teams within the organization without needing to do this manually.

Conclusion

In this post, we showed how to build a quick workflow using AWS Glue and Amazon AI/ML services to complement the AWS Glue crawler functionality. You can integrate this workflow into a typical AWS Glue data cataloging and processing pipeline to achieve alignment between cross-functional teams by simplifying and automating the process of adding data descriptions in the Data Catalog. This is an important step in data discovery, and the topic will be covered more in upcoming posts.

This solution is also a step towards implementing data privacy and protection regimes such as the Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR) by identifying sensitive data types like PII and enforcing access polices.

You can find the source code from this post on GitHub and use it to build your own solution. For more information about NER models, see Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend.

About the Authors

Karim Hammouda is a Specialist Solutions Architect for Analytics at AWS with a passion for data integration, data analysis, and BI. He works with AWS customers to design and build analytics solutions that contribute to their business growth. In his free time, he likes to watch TV documentaries and play video games with his son.

Ahmed Raafat is a Senior Solutions Architect at Amazon Web Services, with a passion for machine learning solutions. Ahmed acts as a trusted advisor for many AWS enterprise customers to support and accelerate their cloud journey.

Richard Browning | Taking on Gravity | Talks at Google

2021-08-23 Talks at Google

Post Syndicated from Talks at Google original https://www.youtube.com/watch?v=NMnzr7DAvNI

[The Lost Bots] Bonus Episode: Velociraptor Contributor Competition

2021-08-23 Rapid7

Post Syndicated from Rapid7 original https://blog.rapid7.com/2021/08/23/the-lost-bots-bonus-episode-velociraptor-contributor-competition/

[The Lost Bots] Bonus Episode: Velociraptor Contributor Competition

Welcome back for a special bonus edition of The Lost Bots, a vlog series where Rapid7 Detection and Response Practice Advisor Jeffrey Gardner talks all things security with fellow industry experts. In this extra installment, Jeffrey chats with Mike Cohen, Digital Paleontologist for Velociraptor, an open source endpoint visibility tool that Rapid7 acquired earlier this year.

Mike fills us in on Velociraptor’s very first Contributor Competition, a friendly hackathon-style event that invites entrants to get their hands dirty and build the best extension to the Velociraptor platform that they can. Check out the episode to hear more about the competition, who’s judging, what they’re looking for, and what’s coming your way if you win — spoiler: there’s a cool $5,000 waiting for you if you nab the No. 1 spot, plus a range of other monetary and merchandise prizes. Jeffrey himself even plans to put his name in the ring!

[The Lost Bots] Bonus Episode: Velociraptor Contributor Competition

Stay tuned for future episodes of The Lost Bots! And don’t forget to start working on your entry for the 2021 Velociraptor Contributor Competition.

Happy 15th Birthday Amazon EC2

2021-08-23 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/happy-15th-birthday-amazon-ec2/

Fifteen years ago today I wrote the blog post that launched the Amazon EC2 Beta. As I recall, the launch was imminent for quite some time as we worked to finalize the feature set, the pricing model, and innumerable other details. The launch date was finally chosen and it happened to fall in the middle of a long-planned family vacation to Cabo San Lucas, Mexico. Undaunted, I brought my laptop along on vacation, and had to cover it with a towel so that I could see the screen as I wrote. I am not 100% sure, but I believe that I actually clicked Publish while sitting on a lounge chair near the pool! I spent the remainder of the week offline, totally unaware of just how much excitement had been created by this launch.

Preparing for the Launch
When Andy Jassy formed the AWS group and began writing his narrative, he showed me a document that proposed the construction of something called the Amazon Execution Service and asked me if developers would find it useful, and if they would pay to use it. I read the document with great interest and responded with an enthusiastic “yes” to both of his questions. Earlier in my career I had built and run several projects hosted at various colo sites, and was all too familiar with the inflexible long-term commitments and the difficulties of scaling on demand; the proposed service would address both of these fundamental issues and make it easier for developers like me to address widely fluctuating changes in demand.

The EC2 team had to make a lot of decisions in order to build a service to meet the needs of developers, entrepreneurs, and larger organizations. While I was not part of the decision-making process, it seems to me that they had to make decisions in at least three principal areas: features, pricing, and level of detail.

Features – Let’s start by reviewing the features that EC2 launched with. There was one instance type, one region (US East (N. Virginia)), and we had not yet exposed the concept of Availability Zones. There was a small selection of prebuilt Linux kernels to choose from, and IP addresses were allocated as instances were launched. All storage was transient and had the same lifetime as the instance. There was no block storage and the root disk image (AMI) was stored in an S3 bundle. It would be easy to make the case that any or all of these features were must-haves for the launch, but none of them were, and our customers started to put EC2 to use right away. Over the years I have seen that this strategy of creating services that are minimal-yet-useful allows us to launch quickly and to iterate (and add new features) rapidly in response to customer feedback.

Pricing – While it was always obvious that we would charge for the use of EC2, we had to decide on the units that we would charge for, and ultimately settled on instance hours. This was a huge step forward when compared to the old model of buying a server outright and depreciating it over a 3 or 5 year term, or paying monthly as part of an annual commitment. Even so, our customers had use cases that could benefit from more fine-grained billing, and we launched per-second billing for EC2 and EBS back in 2017. Behind the scenes, the AWS team also had to build the infrastructure to measure, track, tabulate, and bill our customers for their usage.

Level of Detail – This might not be as obvious as the first two, but it is something that I regularly think about when I write my posts. At launch time I shared the fact that the EC2 instance (which we later called the m1.small) provided compute power equivalent to a 1.7 GHz Intel Xeon processor, but I did not share the actual model number or other details. I did share the fact that we built EC2 on Xen. Over the years, customers told us that they wanted to take advantage of specialized processor features and we began to share that information.

Some Memorable EC2 Launches
Looking back on the last 15 years, I think we got a lot of things right, and we also left a lot of room for the service to grow. While I don’t have one favorite launch, here are some of the more memorable ones:

EC2 Launch (2006) – This was the launch that started it all. One of our more notable early scaling successes took place in early 2008, when Animoto scaled their usage from less than 100 instances all the way up to 3400 in the course of a week (read Animoto – Scaling Through Viral Growth for the full story).

Amazon Elastic Block Store (2008) – This launch allowed our customers to make use of persistent block storage with EC2. If you take a look at the post, you can see some historic screen shots of the once-popular ElasticFox extension for Firefox.

Elastic Load Balancing / Auto Scaling / CloudWatch (2009) – This launch made it easier for our customers to build applications that were scalable and highly available. To quote myself, “Amazon CloudWatch monitors your Amazon EC2 capacity, Auto Scaling dynamically scales it based on demand, and Elastic Load Balancing distributes load across multiple instances in one or more Availability Zones.”

Virtual Private Cloud / VPC (2009) – This launch gave our customers the ability to create logically isolated sets of EC2 instances and to connect them to existing networks via an IPsec VPN connection. It gave our customers additional control over network addressing and routing, and opened the door to many additional networking features over the coming years.

Nitro System (2017) – This launch represented the culmination of many years of work to reimagine and rebuild our virtualization infrastructure in pursuit of higher performance and enhanced security (read more).

Graviton (2018) -This launch marked the launch of Amazon-built custom CPUs that were designed for cost-sensitive scale-out workloads. Since that launch we have continued this evolutionary line, launching general purpose, compute-optimized, memory-optimized, and burstable instances powered by Graviton2 processors.

Instance Types (2006 – Present) -We launched with one instance type and now have over four hundred, each one designed to empower our customers to address a particular use case.

Celebrate with Us
To celebrate the incredible innovation we’ve seen from our customers over the last 15 years, we’re hosting a 2-day live event on August 23rd and 24th covering a range of topics. We kick off the event with today at 9am PDT with Vice President of Amazon EC2 Dave Brown’s keynote “Lessons from 15 Years of Innovation.

Event Agenda

August 23	August 24
Lessons from 15 Years of Innovation	AWS Everywhere: A Fireside Chat on Hybrid Cloud
15 Years of AWS Silicon Innovation	Deep Dive on Real-World AWS Hybrid Examples
Choose the Right Instance for the Right Workload	AWS Outposts: Extending AWS On-Premises for a Truly Consistent Hybrid Experience
Optimize Compute for Cost and Capacity	Connect Your Network to AWS with Hybrid Connectivity Solutions
The Evolution of Amazon Virtual Private Cloud	Accelerating ADAS and Autonomous Vehicle Development on AWS
Accelerating AI/ML Innovation with AWS ML Infrastructure Services	Accelerate AI/ML Adoption with AWS ML Silicon
Using Machine Learning and HPC to Accelerate Product Design	Digital Twins: Connecting the Physical to the Digital World

— Jeff;

[$] The Btrfs inode-number epic (part 2: solutions)

2021-08-23

Post Syndicated from original https://lwn.net/Articles/866709/rss

The first installment in this two-part
series looked at the difficulties that arise when Btrfs filesystems
containing subvolumes are exported via NFS. Btrfs has a couple of quirks
that complicate life in this situation: the use of separate device numbers
for subvolumes and the lack of unique inode numbers across the filesystem
as a whole. Recently, Neil Brown set off on an effort to try
to solve these problems, only to discover that the situation was even
more difficult than expected and that many attempts would be required.

Adding resiliency to AWS CloudFormation custom resource deployments

2021-08-23 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/adding-resiliency-to-aws-cloudformation-custom-resource-deployments/

This post is written by Dathu Patil, Solutions Architect and Naomi Joshi, Cloud Application Architect.

AWS CloudFormation custom resources allow you to write custom provisioning logic in templates. These run anytime you create, update, or delete stacks. Using AWS Lambda-backed custom resources, you can associate a Lambda function with a CloudFormation custom resource. The function is invoked whenever the custom resource is created, updated, or deleted.

When CloudFormation asynchronously invokes the function, it passes the request data, such as the request type and resource properties to the function. The customizability of Lambda functions in combination with CloudFormation allow a wide range of scenarios. For example, you can dynamically look up Amazon Machine Image (AMI) IDs during stack creation or use utilities such as string reversal functions.

Unhandled exceptions or transient errors in the custom resource Lambda function can cause your code to exit without sending a response. CloudFormation requires an HTTPS response to confirm if the operation is successful or not. An unreported exception causes CloudFormation to wait until the operation times out before starting a stack rollback.

If the exception occurs again on rollback, CloudFormation waits for a timeout exception before ending in a rollback failure. During this time, your stack is unusable. You can learn more about this and best practices by reviewing Best Practices for CloudFormation Custom Resources.

In this blog, you learn how you can use Amazon SQS and Lambda to add resiliency to your Lambda-backed CloudFormation custom resource deployments. The example shows how to use CloudFormation custom resource to look up an AMI ID dynamically during Amazon EC2 creation.

Overview

CloudFormation templates that declare an EC2 instance must also specify an AMI ID. This includes an operating system and other software and configuration information used to launch the instance. The correct AMI ID depends on the instance type and Region in which you’re launching your stack. AMI ID can change regularly, such as when an AMI is updated with software updates.

Customers often implement a CloudFormation custom resource to look up an AMI ID while creating an EC2 instance. In this example, the lookup Lambda function calls the EC2 API. It fetches the available AMI IDs, uses the latest AMI ID, and checks for a compliance tag. This implementation assumes that there are separate processes for creating AMI and running compliance checks. The process that performs compliance and security checks creates a compliance tag on a successful scan.

This solution shows how you can use SQS and Lambda to add resiliency to handle an exception. In this case, the exception occurs in the AMI lookup custom resource due to a missing compliance tag. When the AMI lookup function fails processing, it uses the Lambda destination configuration to send the request to an SQS queue. The message is reprocessed using the SQS queue and Lambda function.

The CloudFormation custom resource asynchronously invokes the AMI lookup Lambda function to perform appropriate actions.
The AMI lookup Lambda function calls the EC2 API to fetch the list of AMIs and checks for a compliance tag. If the tag is missing, it throws an unhandled exception.
On failure, the Lambda destination configuration sends the request to the retry queue that is configured as a dead-letter queue (DLQ). SQS adds a custom delay between retry processing to support more than two retries.
The retry Lambda function processes messages in the retry queue using Lambda with SQS. Lambda polls the queue and invokes the retry Lambda function synchronously with an event that contains queue messages.
The retry function then synchronously invokes the AMI lookup function using the information from the request SQS message.

The AMI Lookup Lambda function

An AWS Serverless Application Model (AWS SAM) template is used to create the AMI lookup Lambda function. You can configure async event options such as number of retries on the Lambda function. The maximum retries allowed is 2 and there is no option to set a delay between the invocation attempts.

When a transient failure or unhandled error occurs, the request is forwarded to the retry queue. This part of the AWS SAM template creates AMI lookup Lambda function:

  AMILookupLambda:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: amilookup/
      Handler: app.lambda_handler
      Runtime: python3.8
      Timeout: 300
      EventInvokeConfig:
          MaximumEventAgeInSeconds: 60
          MaximumRetryAttempts: 2
          DestinationConfig:
            OnFailure:
              Type: SQS
              Destination: !GetAtt RetryQueue.Arn
      Policies:
        - AMIDescribePolicy: {}

This function calls the EC2 API using the boto3 AWS SDK for Python. It calls the describe_images method to get a list of images with given filter conditions. The Lambda function iterates through the AMI list and checks for compliance tags. If the tag is not present, it raises an exception:

ec2_client = boto3.client('ec2', region_name=region)
         # Get AMI IDs with the specified name pattern and owner
         describe_response = ec2_client.describe_images(
            Filters=[{'Name': "name", 'Values': architectures},
                     {'Name': "tag-key", 'Values': ['ami-compliance-check']}],
            Owners=["amazon"]
        )

The queue and the retry Lambda function

The retry queue adds a 60-second delay before a message is available for the processing. The time delay between retry processing attempts provides time for transient errors to be corrected. This is the AWS SAM template for creating these resources:

RetryQueue:
  Type: AWS::SQS::Queue
  Properties:
    VisibilityTimeout: 60
    DelaySeconds: 60
    MessageRetentionPeriod: 600

RetryFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: retry/
      Handler: app.lambda_handler
      Runtime: python3.8
      Timeout: 60
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RetryQueue.Arn
            BatchSize: 1
      Policies:
        - LambdaInvokePolicy:
            FunctionName: !Ref AMILookupFunction

The retry Lambda function periodically polls for new messages in the retry queue. The function synchronously invokes the AMI lookup Lambda function. On success, a response is sent to CloudFormation. This process runs until the AMI lookup function returns a successful response or the message is deleted from the SQS queue. The deletion is based on the MessageRetentionPeriod, which is set to 600 seconds in this case.

for record in event['Records']:
        body = json.loads(record['body'])
        response = client.invoke(
            FunctionName=body['requestContext']['functionArn'],
            InvocationType='RequestResponse',
            Payload=json.dumps(body['requestPayload']).encode()
        )

Deployment walkthrough

Prerequisites

To get started with this solution, you need:

AWS CLI and AWS SAM CLI installed to deploy the solution.
An existing Amazon EC2 public image. You can choose any of the AMIs from the AWS Management Console with Architecture = X86_64 and Owner = amazon for test purposes. Note the AMI ID.

Download the source code from the resilient-cfn-custom-resource GitHub repository. The template.yaml file is an AWS SAM template. It deploys the Lambda functions, SQS, and IAM roles required for the Lambda function. It uses Python 3.8 as the runtime and assigns 128 MB of memory for the Lambda functions.

To build and deploy this application using the AWS SAM CLI build and guided deploy:
```
sam build --use-container
sam deploy --guided
```

The custom resource stack creation invokes the AMI lookup Lambda function. This fetches the AMI ID from all public EC2 images available in your account with the tag ami-compliance-check. Typically, the compliance tags are created by a process that performs security scans.

In this example, the security scan process is not running and the tag is not yet added to any AMIs. As a result, the custom resource throws an exception, which goes to the retry queue. This is retried by the retry function until it is successfully processed.

Use the console or AWS CLI to add the tag to the chosen EC2 AMI. In this example, this is analogous to a separate governance process that checks for AMI compliance and adds the compliance tag if passed. Replace the $AMI-ID with the AMI ID captured in the prerequisites:
```
aws ec2 create-tags –-resources $AMI-ID --tags Key=ami-compliance-check,Value=True
```
After the tags are added, a response is sent successfully from the custom resource Lambda function to the CloudFormation stack. It includes your $AMI-ID and a test EC2 instance is created using that image. The stack creation completes successfully with all resources deployed.

Conclusion

This blog post demonstrates how to use SQS and Lambda to add resiliency to CloudFormation custom resources deployments. This solution can be customized for use cases where CloudFormation stacks have a dependency on a custom resource.

CloudFormation custom resource failures can happen due to unhandled exceptions. These are caused by issues with a dependent component, internal service, or transient system errors. Using this solution, you can handle the failures automatically without the need for manual intervention. To get started, download the code from the GitHub repo and start customizing.

For more serverless learning resources, visit Serverless Land.

Security updates for Monday

2021-08-23

Post Syndicated from original https://lwn.net/Articles/867149/rss

Security updates have been issued by Debian (ffmpeg, ircii, and scrollz), Fedora (kernel, krb5, libX11, and rust-actix-http), Mageia (kernel and kernel-linus), openSUSE (aspell, chromium, dbus-1, isync, java-1_8_0-openjdk, krb5, libass, libhts, libvirt, prosody, systemd, and tor), SUSE (cpio, dbus-1, libvirt, php7, qemu, and systemd), and Ubuntu (inetutils).

Rapid7 MDR Named a Market Leader, Again!

2021-08-23 Jake Godgart

Post Syndicated from Jake Godgart original https://blog.rapid7.com/2021/08/23/rapid7-mdr-named-a-market-leader-again/

Rapid7 MDR Named a Market Leader, Again!

New IDC MarketScape Names Rapid7 a Leader in U.S. Managed Detection and Response (MDR)

It’s a big year to be named a Leader.

Time magazine said the pandemic produced “the world’s largest work-from-home experiment.” Suddenly, everyone was accessing everything from everywhere. Control moved outside security’s four walls. More stuff moved to the cloud. And CEOs started wondering who’d be on the nightly news next explaining why they paid millions to EvilCorp hackers.

So this year, especially, Rapid7 is thrilled to be recognized as a Leader in the IDC MarketScape: Managed Detection and Response 2021 Vendor Assessment, (Doc #US48129921, August 2021).

Rapid7 MDR Named a Market Leader, Again!

This IDC MarketScape report shows an unbiased look at 15 MDR players in the U.S. market, evaluating each on capabilities. We feel this recognition reflects Rapid7’s mission to help our customers close the security achievement gap — because every company, regardless of their security team’s size, deserves a level playing field against attackers. Clearly we’re on the right path.

This recognition follows a slew of other accolades for Rapid7’s Detection and Response portfolio. In the last few months, Forrester Research recognized Rapid7 as a “Leader” (Mid-size MSSP Wave, Q3 2020) and “Strong Performer” (MDR Wave, Q1 2021). And Gartner recognized the underlying technology of the MDR service — InsightIDR — as a “Leader” for the third year in a row (SIEM Magic Quadrant, Q2 2021).

Why is this so important?

Nowadays, the MDR market is so noisy that all vendors can sound the same. When market reports like this are published, it proves there’s a difference between MDR providers and offerings delivering security outcomes versus promises.

Today, Rapid7 MDR security experts use our XDR technology to provide constant coverage across our customer’s modern environment — endpoints, users, network, and the cloud. Attackers can change their tactics, but Rapid7’s threat engine still lets us stay a step ahead.

IDC analysts like that Rapid7 MDR “applies proprietary threat intelligence and knowledge from the Metasploit and Velociraptor open-source communities.” This proprietary, community-infused threat intelligence, combined with our recent IntSights acquisition, will evolve our service with even more accurate detections across both internal and external attack surfaces. Attackers have nowhere to hide.

And unlike other MDR and MSSP services that rely on security generalists to simply manage technology and triage alerts, Rapid7’s expert specialists take the lead on threat detection, validation, and how to respond.

Your team can stop threats earlier and respond faster. You can have the confidence that your environment is monitored 24×7. And you’ll have time to focus on what matters most (even if some days it’s just getting around to taking lunch).

Learn more about Rapid7 MDR

Teams love Rapid7 MDR, and here’s why. We help you:

Build your cyber resilience: You can detect threats with confidence. Our team delivers the answers needed to find and stop attacks, not just deliver alerts. And we’ll partner with your team to strengthen your security program.
Enable you to scale with SecOps experts: 24×7 is table stakes now. But having continuous coverage by breach response analysts isn’t. Customers can collaborate with Rapid7 security advisors and get the incident response help needed if (or when) it’s needed most.
Provide full transparency into operations: You see what we see with full access to the technology our analysts use. Learn from our experts and community. Then prove out the ROI with comprehensive reporting that even your CFO would appreciate.
Catch attackers with 24×7 XDR technology: Unify and transform relevant security data from across endpoints, users, network traffic, and the cloud to detect and respond to attackers wherever they are.
Achieve a rapid time-to-value: Jumpstart detection and response from day one. We’ll provide you with the guidance and advice to move from risk to remediation and strengthen your cyber resilience.

Looking for a new MDR provider? Let’s talk.

Speak to an expert

View logs, just a click away

console.log(), and you’re all set

Advanced filters, from your terminal

Try it out!

1. The massive gap between average and median breach sales prices

2. The numerical dominance of tech and telecoms victims

3. The low proportion of retail and hospitality victims

Learn more about compromised network access sales

Get your copy of Wireframe issue 53

Architectural overview

Authenticating users with Amazon Cognito

Validating the JSON Web Token with API Gateway

Storing user data with DynamoDB

Generating the PDF and user-specific watermark

Alternate approach

Extending this solution

Conclusion

Architecture Overview

Comparing gRPC and JSON for microservices

gRPC for .NET developers

Configuring ALB for gRPC

Deploy the Tutorial

Prerequisites

Deploy the Track Order microservice

Deploy the Submit Order microservice

Data flow diagram

Test the microservices

Run the mobile app

Cleaning up

Conclusion

Matt Cline

Ulili Nhaga

Getting started

Solution

Ingestion layer

Processing layer

Storage layer

Analytics layer

Machine learning layer

Reinforcement layer

Conclusion

About the Authors

Design pattern for third-party integration with SQS

Reference implementation for third-party integration with SQS

Conclusion

Data democratization and the need for self-service BI

Solution overview

Solution details

Stage-one: Enrich the Data Catalog with a standard built-in Amazon Comprehend entity detector

Stage-two: Use Amazon Comprehend custom entity recognition

Train an Amazon Comprehend custom entity recognition model

Run custom entity recognition inference

Create an AWS Glue job to update the Data Catalog

Expose Data Catalog data to data consumers for search and discovery

Conclusion

About the Authors

Overview

The AMI Lookup Lambda function

The queue and the retry Lambda function

Deployment walkthrough

Prerequisites

Conclusion

Why is this so important?

Looking for a new MDR provider? Let’s talk.

The collective thoughts of the interwebz