Tag Archives: Architecture

Leveraging AWS Global Backbone for Data Center Migration and Global Expansion

Post Syndicated from Santiago Freitas original https://aws.amazon.com/blogs/architecture/leveraging-aws-global-backbone-for-data-center-migration-and-global-expansion/

Many companies run their applications in data centers, server rooms, or space rented from colocation providers in multiple countries. Those companies usually have a mixture of a small number of large, central data centers where their core systems are hosted and several smaller, regional data centers. Offices in those countries require access to applications running in the local data centers, usually in the same country, as well as to applications running in remote data centers. To enable connectivity between the different sites, companies have either established a self-managed, international wide area network (WAN) or contracted it as a service from a telecommunications provider. As customers migrate workloads to AWS Regions, they need to maintain connectivity between their offices, AWS Regions, and existing on-premises data centers.

This blog post discusses architectures applicable for international data center migrations as well as to customers expanding their business to new countries. The proposed architectures enable access to both AWS and on-premises hosted applications. These architectures leverage the AWS global backbone for connectivity between customer sites in different countries and even continents.

Let’s look into a use case where a customer has their central data center that hosts its core systems located in London, United Kingdom. The customer has rented space from a colocation provider in Mumbai to run applications required to be hosted in India. They have an office in India where users need access to the applications running in their Mumbai data center as well as the core systems running in their London data center. Those different sites are interconnected by a global WAN as illustrated on the diagram below.

Initial architecture with a global WAN interconnecting customer’s sites

Figure 1: Initial architecture with a global WAN interconnecting customer’s sites

The customer then migrates their applications from their Mumbai data center to the AWS Mumbai Region. Users in the customer’s offices in India require access to applications running in the AWS Mumbai Region as well as to the core systems running in their London data center. To enable access to the applications hosted in the AWS Mumbai Region, the customer establishes a connection from their India offices to the AWS Mumbai Region. These connections can leverage AWS Direct Connect (DX) or an AWS Site-to-Site VPN. We also use AWS Transit Gateway (TGW), which allows customer traffic to transit through AWS infrastructure. For customer sites using AWS Direct Connect, we attach the AWS Transit Gateway to a Direct Connect gateway (DXGW), which lets customers manage a single connection for multiple VPCs or VPNs in the same Region. To optimize their WAN cost, the customer leverages the AWS Transit Gateway inter-region peering capability to connect their Transit Gateway in the AWS Mumbai Region to their Transit Gateway in the AWS London Region. Traffic using inter-region Transit Gateway peering is always encrypted, stays on the AWS global network, and never traverses the public internet. Transit Gateway peering enables international, in this case intercontinental, communication. Once the traffic arrives at the London Region’s Transit Gateway, the customer routes it over AWS Direct Connect (or VPN) to the central data center, where the core systems are hosted.
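
If you prefer to script this step rather than use the console, a minimal sketch with the AWS SDK for JavaScript could look like the following. The Transit Gateway IDs and account ID are placeholders, and you would still need to add routes to each Transit Gateway route table before traffic can flow.

import AWS = require('aws-sdk');

const mumbaiEc2 = new AWS.EC2({ region: 'ap-south-1' });
const londonEc2 = new AWS.EC2({ region: 'eu-west-2' });

async function peerTransitGateways(): Promise<void> {
  // Request the peering attachment from the Mumbai side.
  const request = await mumbaiEc2.createTransitGatewayPeeringAttachment({
    TransitGatewayId: 'tgw-MUMBAI-PLACEHOLDER',
    PeerTransitGatewayId: 'tgw-LONDON-PLACEHOLDER',
    PeerAccountId: '111122223333',   // same or a different AWS account
    PeerRegion: 'eu-west-2'
  }).promise();

  const attachmentId =
    request.TransitGatewayPeeringAttachment!.TransitGatewayAttachmentId!;

  // In practice, wait for the attachment to reach the pendingAcceptance state,
  // then accept it from the London side.
  await londonEc2.acceptTransitGatewayPeeringAttachment({
    TransitGatewayAttachmentId: attachmentId
  }).promise();
}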

As applications are migrated from the central data center in London to the AWS London Region, users from the India office are able to seamlessly access applications hosted in the AWS London Region and on-premises. The architecture below demonstrates the traffic between the customer sites as well as from a customer site to a local and a remote AWS Region.

Access from customer sites to applications in AWS regions and on-premises via AWS Global Network

Figure 2: Access from customer sites to applications in AWS regions and on-premises via AWS Global Network

As the customer expands internationally, the architecture evolves to allow access from new international offices such as in Sydney and Singapore to the other customer sites as well as to AWS regions via the AWS Global Network. Depending on the bandwidth requirements, a customer can use AWS DX to the nearest AWS region and then leverage AWS Transit Gateway inter-region peering, as demonstrated on the diagram below for the Singapore site. For sites where a VPN-based connection meets the bandwidth and user experience requirements, the customer can leverage accelerated site-to-site VPN using AWS Global Accelerator, as illustrated for the Sydney office. This architecture allows thousands of sites to be interconnected and use the AWS global network to access applications running on-premises or in AWS.
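
For the VPN-based sites, acceleration is a property of the VPN connection itself. The sketch below shows one way this could be created with the AWS SDK, assuming a customer gateway for the Sydney office and a Transit Gateway in the Sydney Region already exist; the resource IDs are placeholders.

import AWS = require('aws-sdk');

const ec2 = new AWS.EC2({ region: 'ap-southeast-2' }); // Region closest to the Sydney office

async function createAcceleratedVpn(): Promise<void> {
  const result = await ec2.createVpnConnection({
    Type: 'ipsec.1',
    CustomerGatewayId: 'cgw-SYDNEY-PLACEHOLDER',
    TransitGatewayId: 'tgw-SYDNEY-PLACEHOLDER',
    Options: {
      // Terminates the VPN tunnels on AWS Global Accelerator edge locations.
      EnableAcceleration: true
    }
  }).promise();

  console.log('VPN connection:', result.VpnConnection?.VpnConnectionId);
}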

Global connectivity facilitated by AWS Global Network

Figure 3: Global connectivity facilitated by AWS Global Network

Considerations

The following are some of the characteristics customers should consider when adopting the architectures described in this blog post.

  • There is a fixed hourly cost for TGW attachments, VPN connections, and DX connections.
  • There is also a variable usage-based component that depends on the amount of traffic that flows through TGW, OUT of AWS, and inter-region.
  • In comparison, telecommunications providers often offer a fixed-price model for the entire network.

For customers with a high number of sites in the same geographical area, consider setting up a regional WAN. This could be done with SD-WAN technologies or private WAN connections. The regional WAN interconnects the different sites, with the nearest AWS Region also connected to it. Such a design uses the AWS global network for international connectivity and the regional WAN for regional connectivity between customer sites.

Conclusion

As customers migrate their applications to AWS, they can leverage the AWS global network to optimize their WAN architecture and associated costs. Leveraging TGW inter-region peering enables customers to build architectures that facilitate data center migration as well as international business expansion, while allowing access to workloads running either on-premises or in AWS Regions. For a list of AWS Regions where TGW inter-region peering is supported, please refer to the AWS Transit Gateway FAQ.

Serverless Architecture for a Web Scraping Solution

Post Syndicated from Dzidas Martinaitis original https://aws.amazon.com/blogs/architecture/serverless-architecture-for-a-web-scraping-solution/

If you are interested in serverless architecture, you may have read many contradictory articles and wonder if serverless architectures are cost effective or expensive. I would like to clear the air around the issue of effectiveness through an analysis of a web scraping solution. The use case is fairly simple: at certain times during the day, I want to run a Python script and scrape a website. The execution of the script takes less than 15 minutes. This is an important consideration, which we will come back to later. The project can be considered a standard extract, transform, load process without a user interface and can be packed into a self-contained function or library.

Subsequently, we need an environment to execute the script. We have at least two options to consider: on-premises (such as on your local machine, a Raspberry Pi server at home, a virtual machine in a data center, and so on) or deploying it to the cloud. At first glance, the former option may feel more appealing: you have the infrastructure available free of charge, so why not use it? The main concern with an on-premises solution is reliability: can you ensure its availability in case of a power outage or a hardware or network failure? Additionally, does your local infrastructure support continuous integration and continuous deployment (CI/CD) tools to eliminate any manual intervention? With these two constraints in mind, I will continue the analysis of the solutions in the cloud rather than on-premises.

Let’s start with the pricing of three cloud-based scenarios and go into details below.

Pricing table of three cloud-based scenarios

*The AWS Lambda free usage tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month. Review AWS Lambda pricing.

Option #1

The first option, an instance of a virtual machine in AWS (Amazon Elastic Compute Cloud, or EC2), is the most primitive one. However, it does not resemble a serverless architecture at all, so let’s consider it a reference point or baseline. This option is similar to an on-premises solution, giving you full control of the instance, but you would need to manually spin up an instance, install your environment, set up a scheduler to execute your script at a specific time, and keep it running 24/7. And don’t forget security (setting up a VPC, route tables, etc.). Additionally, you will need to monitor the health of the instance and perhaps run manual updates.

Option #2

The second option is to containerize the solution and deploy it on Amazon Elastic Container Service (ECS). The biggest advantage of this approach is platform independence. Having a Dockerfile (a text document that contains all the commands you could call on the command line to assemble an image) with a copy of your environment and the script enables you to reuse the solution locally, on the AWS platform, or somewhere else. A huge advantage of running it on AWS is that you can integrate it with other services, such as AWS CodeCommit, AWS CodeBuild, AWS Batch, etc. You can also benefit from discounted compute resources such as Amazon EC2 Spot Instances.

Architecture of CloudWatch, Batch, ECR

The architecture, seen in the diagram above, consists of Amazon CloudWatch, AWS Batch, and Amazon Elastic Container Registry (ECR). CloudWatch allows you to create a trigger (such as starting a job when a code update is committed to a code repository) or a scheduled event (such as executing a script every hour). We want the latter: executing a job based on a schedule. When triggered, AWS Batch will fetch a pre-built Docker image from Amazon ECR and execute it in a predefined environment. AWS Batch is a free-of-charge service and allows you to configure the environment and resources needed for a task execution. It relies on ECS, which manages resources at the execution time. You pay only for the compute resources consumed during the execution of a task.
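
As an illustration, the scheduled trigger could be wired up with two SDK calls like the sketch below; the job queue, job definition, and IAM role ARNs are placeholders for resources you would have created in AWS Batch and IAM beforehand.

import AWS = require('aws-sdk');

const events = new AWS.CloudWatchEvents({ region: 'eu-west-1' });

async function scheduleScraper(): Promise<void> {
  // A scheduled rule that fires once per hour.
  await events.putRule({
    Name: 'hourly-scraper',
    ScheduleExpression: 'rate(1 hour)'
  }).promise();

  // Point the rule at an AWS Batch job queue and job definition.
  await events.putTargets({
    Rule: 'hourly-scraper',
    Targets: [{
      Id: 'scraper-batch-job',
      Arn: 'arn:aws:batch:eu-west-1:111122223333:job-queue/scraper-queue',
      RoleArn: 'arn:aws:iam::111122223333:role/events-invoke-batch',
      BatchParameters: {
        JobDefinition: 'scraper-job-definition',
        JobName: 'hourly-scrape'
      }
    }]
  }).promise();
}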

You may wonder where the pre-built Docker image came from. It was pulled from Amazon ECR, and now you have two options to store your Docker image there:

  • You can build a Docker image locally and upload it to Amazon ECR.
  • You commit a few configuration files (such as a Dockerfile, buildspec.yml, etc.) to AWS CodeCommit (a code repository) and build the Docker image on the AWS platform. This option, shown in the image below, allows you to build a full CI/CD pipeline. After updating a script file locally and committing the changes to a code repository on AWS CodeCommit, a CloudWatch event is triggered and AWS CodeBuild builds a new Docker image and commits it to Amazon ECR. When a scheduler starts a new task, it fetches the new image with your updated script file. If you feel like exploring further or want to actually implement this approach, please take a look at the example project on GitHub.

CodeCommit. CodeBuild, ECR

Option #3

The third option is based on AWS Lambda, which allows you to build a very lean infrastructure on demand, scales continuously, and has a generous monthly free tier. The major constraint of Lambda is that execution time is capped at 15 minutes. If you have a task running longer than 15 minutes, you need to split it into subtasks and run them in parallel, or you can fall back to Option #2.

By default, Lambda gives you access to standard libraries (such as the Python Standard Library). In addition, you can build your own package to support the execution of your function or use Lambda Layers to gain access to external libraries or even external Linux based programs.

Lambda Layer

You can access AWS Lambda via the web console to create a new function, update your Lambda code, or execute it. However, if you go beyond the “Hello World” functionality, you may realize that online development is not sustainable. For example, if you want to access external libraries from your function, you need to archive them locally, upload the archive to Amazon Simple Storage Service (Amazon S3), and link it to your Lambda function.

One way to automate Lambda function development is to use the AWS Cloud Development Kit (AWS CDK), an open-source software development framework to model and provision your cloud application resources using familiar programming languages. Initially, the setup and learning might feel strenuous; however, the benefits are worth it. To give you an example, please take a look at this Python class on GitHub, which creates a Lambda function, a CloudWatch event, IAM policies, and Lambda layers.
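
The sketch below is a minimal CDK stack in the same spirit as that project (which uses Python), written here in TypeScript: a scheduled Lambda function with a layer for its external libraries. The asset paths, runtime, and schedule are illustrative assumptions, not the project’s actual values.

import core = require('@aws-cdk/core');
import lambda = require('@aws-cdk/aws-lambda');
import events = require('@aws-cdk/aws-events');
import targets = require('@aws-cdk/aws-events-targets');

export class ScraperStack extends core.Stack {
  constructor(scope: core.Construct, id: string, props?: core.StackProps) {
    super(scope, id, props);

    // External libraries packaged as a Lambda layer.
    const deps = new lambda.LayerVersion(this, 'ScraperDeps', {
      code: lambda.Code.fromAsset('layers/scraper-deps'),
      compatibleRuntimes: [lambda.Runtime.PYTHON_3_8]
    });

    const scraper = new lambda.Function(this, 'ScraperFunction', {
      runtime: lambda.Runtime.PYTHON_3_8,
      handler: 'scraper.main',
      code: lambda.Code.fromAsset('src'),
      timeout: core.Duration.minutes(15), // the hard Lambda limit discussed above
      layers: [deps]
    });

    // CloudWatch Events rule that runs the scraper on a schedule.
    new events.Rule(this, 'ScraperSchedule', {
      schedule: events.Schedule.rate(core.Duration.hours(1)),
      targets: [new targets.LambdaFunction(scraper)]
    });
  }
}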

In summary, the AWS CDK allows you to have infrastructure as code, and all changes will be stored in a code repository. For deployment, the AWS CDK builds an AWS CloudFormation template, which is a standard way to model infrastructure on AWS. Additionally, the AWS Serverless Application Model (SAM) allows you to test and debug your serverless code locally, meaning that you can indeed set up continuous integration.

See an example of a Lambda-based web scraper on GitHub.

Conclusion

In this blog post, we reviewed two serverless architectures for a web scraper on the AWS Cloud. Additionally, we explored ways to implement a CI/CD pipeline in order to avoid future manual intervention.

Architecture Monthly Magazine: Media & Entertainment

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/architecture-monthly-magazine-media-entertainment/

AWS Architecture Monthly - June 2020 - Media & Entertainment

AWS makes it fast and easy for the Media and Entertainment (M&E) industry to produce, process, and deliver broadcast and over-the-top video. These pay-as-you-go services and appliance products offer the video infrastructure you need to deliver great viewing experiences to any screen.

For June’s issue of AWS Architecture Monthly, WW Tech Leader Konstantin Wilms talks about industry and architectural pattern trends in the cloud-based M&E space, and explains why this industry is unique in that almost any business problem can be solved using many AWS services. He also offers advice on what prospective customers should be thinking about and considering when moving to AWS.

In June’s Media & Entertainment issue

  • Ask an Expert: Konstantin Wilms, WW Tech Leader, AWS M&E
  • Blog: Deploying Your Favorite Post Production Applications on AWS Virtual Desktop Infrastructure
  • Case Study: SL Benfica: Launching a Full-Featured VOD Platform in Just Four Weeks
  • Quick Start: Cloud Video Editing on AWS
  • Tutorial: Reimagine Your Studio
  • Whitepaper: Building Media & Entertainment Predictive Analytics Solutions on AWS

How to access the magazine

We hope you’re enjoying Architecture Monthly, and we’d like to hear from you—leave us a star rating and comment on the Amazon Kindle Newsstand page or contact us anytime at [email protected].

Getting Ready for Your AWS Certified Solutions Architect – Associate Exam

Post Syndicated from Ed Van Sickle original https://aws.amazon.com/blogs/architecture/getting-ready-for-your-aws-certified-solutions-architect-associate-exam/

AWS Training and Certification

So, you decided you want to earn your AWS Certified Solutions Architect – Associate certification. Good decision! Maybe you’ve read through our exam preparation recommendations (or maybe just opened that link to do so), explored our Architect Learning Path, and are now contemplating next steps.

You may be thinking:

  • I know myself well enough to know that if I’m going to get this done on time, I need a class with an instructor on my calendar.
  • Taking Architecting on AWS and an exam readiness course seems like a good idea, but can I schedule those together?
  • Note to self: take a practice exam or two.

If these resonate, our new five-day classroom training course, Exam Readiness Intensive Workshop: AWS Certified Solutions Architect – Associate, was built with you in mind. This course combines expert, AWS-accredited instructor-led training and exam readiness with AWS service deep dives and quizzes exclusive to this course. This intensive, focused approach has already helped APN Partners and customers get ready for their certification exams on their way to achieving their AWS Certification goals.

Dive deep with architectural best practices

The course starts with learning the fundamentals of building an IT infrastructure on AWS. You’ll spend three days on the Architecting on AWS course, which builds that architectural mindset and foundation. Throughout the course, you’ll get in a “getting ready for my exam” mindset with end-of-module quizzes to help reinforce learning and provide practice in answering test questions.

Complement architectural best practices with additional AWS services and exam taking skills

Day four provides additional deep dives on AWS services not covered in Architecting on AWS and a half day dedicated to the Exam Readiness: AWS Certified Solutions Architect – Associate course. You’ll review sample exam questions in each topic area and learn how to interpret the concepts being tested so that you can more easily eliminate incorrect responses. At the end of the day, you’ll put your knowledge to the test with the first of two quizzes, each followed by a guided review with your instructor.

Test your knowledge

Day five starts with a guided review of the practice exam. You’ll get to see if your answers are right, and if not, have an opportunity to engage the instructor to understand why your answers are wrong, and how you could have eliminated some of the options. You’ll finish day five with one more quiz to test your knowledge. With instructor-guided reviews of two quizzes and a practice exam, you can gain a boost of confidence at the end of the course before sitting for the exam.

The right course for the right candidate

This course is tailored for the individual who wants to combine learning architectural concepts and exam-taking skills from an instructor. If that’s you, find a class today delivered by AWS or one of our AWS APN Training Partners.

IoT All the Things (Ask Me Anything Edition)

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/iot-all-the-things-ask-me-anything-edition/

Join AWS Internet of Things (IoT) experts as they walk you through building IoT applications with AWS live on Twitch. Along the way, AWS IoT customers and partners will co-host, share best tips and tricks, and make sure all of your questions are answered. By the end of each episode, you’ll learn how to build IoT applications across industrial, connected home, and everything in between.

In this special episode, our hosts, Rudy Chetty and Wale Oladehin, go all out in an ask-me-anything style and dive into audience-prompted topics including IoT operations, IoT best practices, device management, mobile development, analytics, and more. From AWS IoT Greengrass to AWS IoT SiteWise to AWS IoT Analytics, get ready to IoT All the Things across a wide range of IoT software and services in our Season 2, Episode 4 “IoT All the Things, AMA Edition.”

When we asked Rudy about his experience filming this special episode, he had this to say:

“I love the spontaneity of live-streaming. It’s a really interesting space to be able to be challenged by customers through questions. You never know what you’ll be asked, and you have to be on your toes constantly, so to speak. Case in point: I never thought I’d be discussing weevils and how to detect them in rice silos but that’s exactly what a customer asked about. Furthermore, it’s fascinating to not only be able to dive into a solution with them, but also see what use cases they dream up for AWS IoT.”

Check out more IoT All the Things videos on Twitch.

Measure Effectiveness of Virtual Training in Real Time with AWS AI Services

Post Syndicated from Rajeswari Malladi original https://aws.amazon.com/blogs/architecture/measure-effectiveness-of-virtual-training-in-real-time-with-aws-ai-services/

According to International Data Corporation (IDC), worldwide spending on digital transformation will reach $2.3 trillion in 2023. As organizations adopt digital transformation, training becomes an important aspect of this journey. Whether it is internal training to upskill an existing workforce or packaged content for commercial use, training needs to be efficient and cost effective. With the advent of education technology, it is common practice to deliver training via digital platforms. This makes it accessible to a larger population and is cost effective, but it is important that the training is interactive and effective. According to a recent article published by Forbes, immersive education and data-driven insights are among the top five Education Technology (EdTech) innovations. These are key characteristics of an effective training experience.

An earlier blog series explored how to build a virtual trainer on AWS using Amazon Sumerian. That series illustrated how to easily build an immersive and highly engaging virtual training experience without needing additional devices or complex virtual reality platform management. These trainings are easy to maintain and cost effective.

In this blog post, we will further extend the architecture to gather real-time feedback about the virtual trainings and create data-driven insights to measure their effectiveness with the help of Amazon artificial intelligence (AI) services.

Architecture and its benefits

Virtual training on AWS and AI Services - Architecture

Virtual training on AWS and AI Services – Architecture

Consider a scenario where you are a vendor in the health care sector. You’ve developed a cutting-edge device, such as patient vital monitoring hardware, that goes through frequent software upgrades and is about to be rolled out across different U.S. hospitals. The nursing staff needs to be well trained before it can begin using the device. Let’s take a look at an architecture to solve this problem. We will first explain the architecture for building the training, and then we will show how to measure its effectiveness.

At the core of the architecture is Amazon Sumerian. Sumerian is a managed service that lets you create and run 3D, Augmented Reality (AR), and Virtual Reality (VR) applications. Within Sumerian, real-life scenes from a hospital environment can be created by importing assets from the asset library. Scenes include one or more hosts: AI-driven animated characters with built-in animation, speech, and behavior. The hosts act as virtual trainers that interact with the nursing staff. The speech component assigns text to the virtual trainer for playback with Amazon Polly. Polly converts training content from Sumerian to lifelike speech in real time and ensures the nursing staff receives the latest content related to the equipment on which it’s being trained.
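
Sumerian’s speech component makes the Polly call for you, but for illustration, the underlying API boils down to a request like the sketch below (the voice selection is an example, not a requirement).

import AWS = require('aws-sdk');

const polly = new AWS.Polly({ region: 'us-east-1' });

async function synthesizeTrainingStep(text: string): Promise<Buffer> {
  // Converts a piece of training content into lifelike speech.
  const result = await polly.synthesizeSpeech({
    OutputFormat: 'mp3',
    VoiceId: 'Joanna',
    Text: text
  }).promise();
  return result.AudioStream as Buffer;
}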

The nursing staff accesses the training via web browsers on iOS or Android mobile devices or laptops, and authenticates using Amazon Cognito. Cognito is a service that lets you easily add user sign-up and authentication to your mobile and web apps. Sumerian then uses the Cognito identity pool to create temporary credentials to access AWS services.

The flow of the interactions within Sumerian is controlled using a visual state machine in the Sumerian editor. Within the editor, the dialogue component assigns an Amazon Lex chatbot to an entity, in this case the virtual trainer or host. Lex is a service for building conversational interfaces with voice and text. It gives you the ability to have interactive conversations with the nursing staff, understand their areas of interest, and deliver appropriate training material. This is an important aspect of the architecture because you can customize the training to users’ needs.

Lex has native interoperability with AWS Lambda, a serverless compute offering where you just write and run your code as Lambda functions. Lambda can be used to validate user inputs or apply any business logic, such as fetching the user-selected training material from Amazon DynamoDB (or another database) in real time. This material is then delivered to Lex as a response to user queries.
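
As a sketch of that pattern, a Lex fulfillment Lambda function that looks up training material in DynamoDB might resemble the following; the table name, slot name, and attribute names are hypothetical.

import AWS = require('aws-sdk');

const dynamo = new AWS.DynamoDB.DocumentClient();

export const handler = async (event: any) => {
  // Hypothetical slot capturing which topic the user asked about.
  const topic = event.currentIntent.slots.TrainingTopic;

  const item = await dynamo.get({
    TableName: 'TrainingMaterial',   // hypothetical table
    Key: { topic }
  }).promise();

  const content = item.Item
    ? item.Item.content
    : 'Sorry, I could not find material on that topic.';

  // Response shape expected by an Amazon Lex fulfillment Lambda function.
  return {
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: { contentType: 'PlainText', content }
    }
  };
};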

You can extend the state machine within the Sumerian editor to introduce new interactive flows that collect user feedback. Amazon Lex collects the feedback, which is saved in Amazon Simple Storage Service (Amazon S3) and analyzed by Amazon Comprehend. Amazon Comprehend is a natural language processing service that uses AI to find meaning, insights, and sentiment in text. The insights derived from user feedback are stored in S3, a highly scalable, durable, and highly available object storage service.
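
A minimal sketch of that analysis step, assuming a hypothetical bucket and key layout, could look like this:

import AWS = require('aws-sdk');

const comprehend = new AWS.Comprehend({ region: 'us-east-1' });
const s3 = new AWS.S3();

async function analyzeFeedback(feedbackId: string, feedbackText: string): Promise<void> {
  // Detect the overall sentiment of a single piece of feedback.
  const sentiment = await comprehend.detectSentiment({
    Text: feedbackText,
    LanguageCode: 'en'
  }).promise();

  // Store the insight alongside the raw feedback for later querying.
  await s3.putObject({
    Bucket: 'training-feedback-insights',   // hypothetical bucket
    Key: `insights/${feedbackId}.json`,
    Body: JSON.stringify({
      feedbackText,
      sentiment: sentiment.Sentiment,
      scores: sentiment.SentimentScore
    }),
    ContentType: 'application/json'
  }).promise();
}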

You can analyze the insights from user feedback using Amazon Athena, an interactive query service which analyzes data in S3 using standard SQL. You can then easily build visualizations using Amazon QuickSight.
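
For example, a query over the stored insights could be submitted to Athena as in the sketch below; the database, table, and output location are placeholders you would define over the S3 data (for example, with an AWS Glue crawler).

import AWS = require('aws-sdk');

const athena = new AWS.Athena({ region: 'us-east-1' });

async function querySentimentBreakdown(): Promise<string> {
  const result = await athena.startQueryExecution({
    QueryString:
      'SELECT sentiment, COUNT(*) AS total FROM feedback_insights GROUP BY sentiment',
    QueryExecutionContext: { Database: 'training_feedback' },
    ResultConfiguration: {
      OutputLocation: 's3://training-feedback-insights/athena-results/'
    }
  }).promise();

  // Poll GetQueryExecution / GetQueryResults with this ID to read the output.
  return result.QueryExecutionId!;
}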

By using this architecture, you not only deliver the virtual training to your nursing staff in an immersive environment created by Amazon Sumerian, but you can also gather the feedback interactively. You can gain insights from this feedback and iterate over it to make the training experience more effective.

Conclusion and next steps

In this blog post we reviewed the architecture to build interactive trainings and measure their effectiveness. The serverless nature of this architecture makes it cost effective, agile, and easy to manage, and you can apply it to a number of use cases. For example, an educational institution can develop training content designed for multiple learning levels, and the training level can be adjusted in real time based on live interactions with the students. In a manufacturing scenario, you can build a digital twin of your process and train your staff to handle different scenarios with full interactions. You can integrate AWS services like Lego blocks, and you can further expand this architecture to integrate with Amazon Kendra to build an interactive FAQ or with Amazon Comprehend Medical to build trainings for the healthcare industry. Happy building!

Introducing the Well-Architected Framework for Machine Learning

Post Syndicated from Shelbee Eigenbrode original https://aws.amazon.com/blogs/architecture/introducing-the-well-architected-framework-for-machine-learning/

We have published a new whitepaper, Machine Learning Lens, to help you design your machine learning (ML) workloads following cloud best practices. This whitepaper gives you an overview of the iterative phases of ML and introduces you to the ML and artificial intelligence (AI) services available on AWS using scenarios and reference architectures.

How often are you asking yourself “Am I doing this right?” when building and running applications in the cloud? You are not alone. To help you answer this question, we released the AWS Well-Architected Framework in 2015. The Framework is a formal approach for comparing your workload against AWS best practices and getting guidance on how to improve it.

We added “lenses” in 2017 that extended the Framework beyond its general perspective to provide guidance for specific technology domains, such as the Serverless Applications Lens, High Performance Computing (HPC) Lens, and IoT (Internet of Things) Lens. The new Machine Learning Lens further extends that guidance to include best practices specific to machine learning workloads.

A typical question we often hear is: “Should I use both the Lens whitepaper and the AWS Well-Architected Framework whitepaper when reviewing a workload?” Yes! We purposely exclude topics covered by the Framework in the Lens whitepapers. To fully evaluate your workload, use the Framework along with any applicable Lens whitepapers.

Applying the Machine Learning Lens

In the Machine Learning Lens, we focus on how to design, deploy, and architect your machine learning workloads in the AWS Cloud. The whitepaper starts by describing the general design principles for ML workloads. We then discuss the design principles for each of the five pillars of the Framework—operational excellence, security, reliability, performance efficiency, and cost optimization—as they relate to ML workloads.

Although the Lens is written to provide guidance across all pillars, each pillar is designed to be consumable by itself. This design results in some intended redundancy across the pillars. The following figure shows the structure of the whitepaper.

ML Lens-Well Architected (1)

The primary components of the AWS Well-Architected Framework are Pillars, Design Principles, Questions, and Best Practices. These components are illustrated in the following figure and are outlined in the AWS Well-Architected Framework whitepaper.

WA-Failure Management (2)

The Machine Learning Lens follows this pattern, with Design Principles, Questions, and Best Practices tailored for machine learning workloads. Each pillar has a set of questions, mapped to the design principles, which drive best practices for ML workloads.

ML Lens-Well Architected (2)

To review your ML workloads, start by answering the questions in each pillar. Identify opportunities for improvement as well as critical items for remediation. Then make a prioritized plan to address the improvement opportunities and remediation items.

We recommend that you regularly evaluate your workloads. Use the Lens throughout the lifecycle of your ML workload—not just at the end when you are about to go into production. When used during the design and implementation phase, the Machine Learning Lens can help you identify potential issues early, so that you have more time to address them.

Available now

The Machine Learning Lens is available now. Start using it today to review your existing workloads or to guide you in building new ones. Use the Lens to ensure that your ML workloads are architected with operational excellence, security, reliability, performance efficiency, and cost optimization in mind.

Deploying a serverless application using AWS CDK

Post Syndicated from Georges Leschener original https://aws.amazon.com/blogs/devops/deploying-a-serverless-application-using-aws-cdk/

There are multiple ways to deploy API endpoints, such as this example, in which an application running on Amazon EC2 demonstrates how to integrate Amazon ElastiCache with Amazon DocumentDB (with MongoDB compatibility). While that approach helps achieve great performance and reliability through elasticity and the ability to scale the number of EC2 instances up or down to accommodate the load on the application, there is still some operational overhead: you have to manage the EC2 instances yourself. One way of addressing this operational overhead and the related costs is to transform the application into a serverless architecture.

The example in this blog post uses an application with a similar use case, but built on a serverless architecture, showcasing some of the tools leveraged by customers transitioning from lift-and-shift to building cloud-native applications. It uses Amazon API Gateway to provide the REST API endpoint connected to an AWS Lambda function that provides the business logic to read from and write to an Amazon Aurora Serverless database. It also showcases the deployment of most of the infrastructure with the AWS Cloud Development Kit, known as the CDK. By moving your applications to a cloud-native architecture like the example showcased in this blog post, you can realize a number of benefits, including:

  • Fast and clean deployment of your application, achieving faster time to market
  • Reduced operational costs through serverless and managed services

Architecture Diagram

By the end of this blog post, you will have an AWS Cloud9 environment containing a CDK project that deploys an API Gateway endpoint and a Lambda function. This Lambda function leverages a secret stored in AWS Secrets Manager to read from and write to your Aurora Serverless database through the Data API, as shown in the following diagram.

 

Architecture diagram for deploying a serverless application using AWS CDK

The architecture diagram above showcases the resources to be deployed in your AWS account.

Throughout this blog post, you will create the following resources:

  1. Deploy an Amazon Aurora Serverless database cluster
  2. Secure the cluster credentials in AWS Secrets Manager
  3. Create and populate your database in the AWS Console
  4. Deploy an AWS Cloud9 instance used as a development environment
  5. Initialize and configure an AWS Cloud Development Kit project including the definition of your Amazon API Gateway endpoint and AWS Lambda function
  6. Deploy an AWS CloudFormation template through the AWS Cloud Development Kit

Prerequisites

In order to deploy the CDK application, there are a few prerequisites that need to be met:

  1. Create an AWS account or use an existing account.
  2. Install Postman for testing purposes.

Amazon Aurora serverless cluster creation

To begin, navigate to the AWS console to create a new Amazon RDS database.

  1. Select Create Database from the Amazon RDS service.
  2. Select Standard Create under Choose a database creation method.
  3. Select Serverless under Database features.
  4. Select Amazon Aurora as the engine type under Engine options.
  5. Enter db-blog for your DB Cluster Identifier.
  6. Expand the Additional Connectivity section and select the Data API option. This functionality enables you to access Aurora Serverless with web services-based applications. It also allows you to use the query editor feature for Aurora Serverless in order to run SQL queries against your database instance.
  7. Leave the default selection for everything else and choose Create Database.

Your database instance is created in a single availability zone (AZ), but an Aurora Serverless database cluster has a capability known as automatic multi-AZ failover, which enables Aurora to recreate the database instance in a different AZ should the current database instance or the AZ become unavailable. The storage volume for the cluster is spread across multiple AZs, since Aurora separates computation capacity and storage. This allows for data to remain available even if the database instance or the associated AZ is affected by an outage.

Securing database credentials with AWS Secrets Manager

After creating the database instance, the next step is to store your secrets for your database in AWS Secrets Manager.

  • Navigate to AWS Secrets Manager, and select Store a New Secret.
  • Leave the default selection (Credentials for RDS database) for the secret type. Enter your database username and password and then select the radio button for the database you created in the previous step (in this example, db-blog), as shown in the following screenshot.

database search in aws secrets manager

  • Choose Next.
  • Enter a name and optionally a description. For the name, make sure to add the prefix rds-db-credentials/ as shown in the following screenshot.

AWS Secrets Manager Store a new secret window

  • Choose Next and leave the default selection.
  • Review your settings on the last page and choose Store to have your secrets created and stored in AWS Secrets Manager, which you can now use to connect to your database.

Creating and populating your Amazon Aurora Serverless database

After creating the DB cluster, create the database; create your tables and populate them; and finally, test a connection to ensure that you can query your database.

  • Navigate to the Amazon RDS service from the AWS console, and select your db-blog database cluster.
  • Select Query under Actions to open the Connect to database window as shown in the screenshot below. Enter your database connection details. You can copy your Secrets Manager secret ARN from the Secrets Manager service and paste it into the corresponding field in the database connection window.

Amazon RDS connect to database window

  • To create the database, run the following SQL query from the Query editor shown in the screenshot below: CREATE DATABASE recordstore;

 

Amazon RDS Query editor

  • Before you can run the following commands, make sure you are using the recordstore database you just created by running the command:
USE recordstore;
  • Create a records table using the following command:
CREATE TABLE IF NOT EXISTS records (recordid INT PRIMARY KEY, title VARCHAR(255) NOT NULL, release_date DATE);
  • Create a singers table using the following command:
CREATE TABLE IF NOT EXISTS singers (id INT PRIMARY KEY, name VARCHAR(255) NOT NULL, nationality VARCHAR(255) NOT NULL, recordid INT NOT NULL, FOREIGN KEY (recordid) REFERENCES records (recordid) ON UPDATE RESTRICT ON DELETE CASCADE);
  • Add a record to your records table and a singer to your singers table.
INSERT INTO records(recordid,title,release_date) VALUES(001,'Liberian Girl','2012-05-03');
INSERT INTO singers(id,name,nationality,recordid) VALUES(100,'Michael Jackson','American',001);

If you have the AWS CLI set up on your computer, you can connect to your database and retrieve records.

To test it, use the rds-data execute-statement command within the AWS CLI to connect to your database via the Data API web service and query the singers table, as shown below:

aws rds-data execute-statement --secret-arn "arn:aws:secretsmanager:REGION:xxxxxxxxxxx:secret:rds-db-credentials/xxxxxxxxxxxxxxx" --resource-arn "arn:aws:rds:us-east-1:xxxxxxxxxx:cluster:db-blog" --database recordstore --sql "select * from singers" --output json

You should see the following result:

    "numberOfRecordsUpdated": 0,
    "records": [
        [
            {
                "longValue": 100
            },
            {
                "stringValue": "Michael Jackson"
            },
            {
                "stringValue": "American"
            },
            {
                "longValue": 1
            }
        ]
    ]
}

Creating a Cloud9 instance

To create a Cloud9 instance:

  1. Navigate to the Cloud9 console and select Create Environment.
  2. Name your environment AuroraServerlessBlog.
  3. Keep the default values under the Environment Settings.

Once your instance is launched, you see the screen shown in the following screenshot:

AWS Cloud9

 

You can now install the CDK in your environment. Run the following command inside your bash terminal on the blue section at the bottom of your screen:

npm install -g aws-cdk

For the next section of this example, you mostly work on the command line of your Cloud9 terminal and on your file explorer.

Creating the CDK deployment

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to model and provision your cloud application resources using familiar programming languages. If you would like to familiarize yourself with it, the CDK Workshop is a great place to start.

First, create a working directory called RecordsApp and initialize a CDK project from a template.

Run the following commands:

mkdir RecordsApp
cd RecordsApp
cdk init app --language typescript
mkdir resources
npm install @aws-cdk/aws-apigateway @aws-cdk/aws-lambda @aws-cdk/aws-iam

Now your instance should look like the example shown in the following screenshot:

AWS Cloud9 shell

 

You are mainly working in two directories:

  • resources
  • lib

Your initial set up is ready, and you can move into creating specific services and deploying them to your account.

Creating AWS resources using the CDK

Follow these steps to create AWS resources using the CDK:

  1. Under the /lib folder, create a new file called records_service.ts.
  2. Inside your new file, paste the following code with these changes:
    • Replace the dbARN value with the ARN of your Aurora Serverless DB cluster from the previous steps.
    • Replace the dbSecretARN value with the ARN of your Secrets Manager secret from the previous steps.

import core = require("@aws-cdk/core");
import apigateway = require("@aws-cdk/aws-apigateway");
import lambda = require("@aws-cdk/aws-lambda");
import iam = require("@aws-cdk/aws-iam");

//REPLACE THIS
const dbARN = "arn:aws:rds:XXXX:XXXX:cluster:aurora-serverless-blog";
//REPLACE THIS
const dbSecretARN = "arn:aws:secretsmanager:XXXXX:XXXXX:secret:rds-db-credentials/XXXXX";

export class RecordsService extends core.Construct {
  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    const lambdaRole = new iam.Role(this, 'AuroraServerlessBlogLambdaRole', {
      assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com'),
      managedPolicies: [
            iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonRDSDataFullAccess'),
            iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole')
        ]
    });

    const handler = new lambda.Function(this, "RecordsHandler", {
     role: lambdaRole,
     runtime: lambda.Runtime.NODEJS_12_X, // Node.js 12.x lets us use async/await in records.js
     code: lambda.Code.asset("resources"),
     handler: "records.main",
     environment: {
       TABLE: dbARN,
       TABLESECRET: dbSecretARN,
       DATABASE: "recordstore"
     }
   });

    const api = new apigateway.RestApi(this, "records-api", {
      restApiName: "Records Service",
      description: "This service serves records."
   });

    const getRecordsIntegration = new apigateway.LambdaIntegration(handler, {
      requestTemplates: { "application/json": '{ "statusCode": 200 }' }
    });

    api.root.addMethod("GET", getRecordsIntegration); // GET /

    const record = api.root.addResource("{id}");
    const postRecordIntegration = new apigateway.LambdaIntegration(handler);
    const getRecordIntegration = new apigateway.LambdaIntegration(handler);

    record.addMethod("POST", postRecordIntegration); // POST /{id}
    record.addMethod("GET", getRecordIntegration); // GET/{id}
  }
}

This snippet of code will instruct the AWS CDK to create the following resources:

  • IAM role: AuroraServerlessBlogLambdaRole containing the following managed policies:
    • AmazonRDSDataFullAccess
    • service-role/AWSLambdaBasicExecutionRole
  • Lambda function: RecordsHandler, which uses a Node.js 12.x runtime and has three environment variables
  • API Gateway: Records Service, which has the following characteristics:
    • GET Method
      • GET /
    • { id } Resource
      • GET method
        • GET /{id}
      • POST method
        • POST /{id}

Now that you have a service, you need to add it to your stack under the /lib directory.

  1. Open the records_app-stack.ts file.
  2. Replace the contents of this file with the following:
import cdk = require('@aws-cdk/core');
import records_service = require('../lib/records_service');

export class RecordsAppStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new records_service.RecordsService(this, 'Records');
  }
}
  3. Create the Lambda code that is invoked from the API Gateway endpoint. Under the /resources directory, create a file called records.js and paste the following code in this file:
const AWS = require('aws-sdk');
var rdsdataservice = new AWS.RDSDataService();

exports.main = async function(event, context) {
  try {
    var method = event.httpMethod;
    var recordName = event.path.startsWith('/') ? event.path.substring(1) : event.path;
// Defining parameters for rdsdataservice
    var params = {
      resourceArn: process.env.TABLE,
      secretArn: process.env.TABLESECRET,
      database: process.env.DATABASE,
   }
   if (method === "GET") {
      if (event.path === "/") {
       //Here is where we are defining the SQL query that will be run at the DATA API
       params['sql'] = 'select * from records';
       const data = await rdsdataservice.executeStatement(params).promise();
       var body = {
           records: data
       };
       return {
         statusCode: 200,
         headers: {},
         body: JSON.stringify(body)
       };
     }
     else if (recordName) {
       params['sql'] = `SELECT singers.id, singers.name, singers.nationality, records.title FROM singers INNER JOIN records on records.recordid = singers.recordid WHERE records.title LIKE '${recordName}%';`
       const data = await rdsdataservice.executeStatement(params).promise();
       var body = {
           singer: data
       };
       return {
         statusCode: 200,
         headers: {},
         body: JSON.stringify(body)
       };
     }
   }
   else if (method === "POST") {
     var payload = JSON.parse(event.body);
     if (!payload) {
       return {
         statusCode: 400,
         headers: {},
         body: "The body is missing"
       };
     }

     //Generating random IDs
     var recordId = uuidv4();
     var singerId = uuidv4();

     //Parsing the payload from body
     var recordTitle = `${payload.recordTitle}`;
     var recordReleaseDate = `${payload.recordReleaseDate}`;
     var singerName = `${payload.singerName}`;
     var singerNationality = `${payload.singerNationality}`;

      //Making 2 calls to the data API to insert the new record and singer
      params['sql'] = `INSERT INTO records(recordid,title,release_date) VALUES(${recordId},"${recordTitle}","${recordReleaseDate}");`;
      const recordsWrite = await rdsdataservice.executeStatement(params).promise();
      params['sql'] = `INSERT INTO singers(recordid,id,name,nationality) VALUES(${recordId},${singerId},"${singerName}","${singerNationality}");`;
      const singersWrite = await rdsdataservice.executeStatement(params).promise();

      return {
        statusCode: 200,
        headers: {},
        body: JSON.stringify("Your record has been saved")
      };

    }
    // We got something besides a GET or POST
    return {
      statusCode: 400,
      headers: {},
      body: "We only accept GET, POST, and DELETE, not " + method
    };
  } catch(error) {
    var body = error.stack || JSON.stringify(error, null, 2);
    return {
      statusCode: 400,
      headers: {},
      body: body
    }
  }
}
// Generates a short pseudo-random numeric ID (not a true UUID) for new rows
function uuidv4() {
  return 'xxxx'.replace(/[xy]/g, function(c) {
    var r = Math.random() * 16 | 0, v = c == 'x' ? r : (r & 0x3 | 0x8);
    return v;
  });
}

Take a look at what this Lambda function is doing. You have two functions inside your Lambda code. The first is the exported handler, which is defined as an asynchronous function. The second is a small identifier function that generates short random numeric IDs used as keys for your database records. In your handler function, you handle the following actions based on the event you get from API Gateway:

  • Method GET with empty path /:
    • This calls the data API executeStatement method with the following SQL query:
SELECT * from records
  • Method GET with a record name in the path /{recordName}:
    • This calls the Data API executeStatement method with the following SQL query:
SELECT singers.id, singers.name, singers.nationality, records.title FROM singers INNER JOIN records on records.recordid = singers.recordid WHERE records.title LIKE '${recordName}%';
  • Method POST with a payload in the body:
    • This makes two calls to the data API executeStatement with the following SQL queries:
INSERT INTO records(recordid,title,release_date) VALUES(${recordId},"${recordTitle}","${recordReleaseDate}");
INSERT INTO singers(recordid,id,name,nationality) VALUES(${recordId},${singerId},"${singerName}","${singerNationality}");

Now you have all the pieces you need to deploy your endpoint and Lambda function by running the following commands:

npm run build
cdk synth
cdk bootstrap
cdk deploy

If you change the Lambda code or add additional AWS resources to your CDK deployment, you can redeploy the application by running all four commands in a single line:

npm run build; cdk synth; cdk bootstrap; cdk deploy

Testing with Postman

Once it’s done, you can test it using Postman:

GET = ‘RecordName’ in the path

  • example:
    • ENDPOINT/RecordName

POST = Payload in the body

  • example:
{
   "recordTitle" : "BlogTest",
   "recordReleaseDate" : "2020-01-01",
   "singerName" : "BlogSinger",
   "singerNationality" : "AWS"
}

Clean up

To clean up the resources created by the CDK, run the following command in your Cloud9 instance:

cdk destroy

To clean up the resources created manually, run the following commands:

aws rds delete-db-cluster --db-cluster-identifier db-blog --skip-final-snapshot
aws secretsmanager delete-secret --secret-id XXXXX --recovery-window-in-days 7

Conclusion

This blog post demonstrated how to transform an application running on Amazon EC2 from a previous blog post into a serverless architecture by leveraging services such as Amazon API Gateway, AWS Lambda, AWS Cloud9, the AWS CDK, and Aurora Serverless. The benefit of a serverless architecture is that it takes away the overhead of having to manage a server and helps reduce costs, as you only pay for the time in which your code executes.

This example used a record-store application written in Node.js that allows users to find their favorite singer’s record titles, as well as the dates when they were released. This example could be expanded, for instance, by adding a payment gateway and a shopping cart to allow users to shop and pay for their favorite records. You could then incorporate some machine learning into the application to predict user choice based on previous visits, purchases, or information provided through registration profiles.

 


 

About the Authors

Luis Lopez Soria is an AI/ML specialist solutions architect working with the AWS machine learning team. He works with AWS customers to help them with the adoption of Machine Learning on a large scale. He enjoys doing sports in addition to traveling around the world, exploring new foods and cultures.

 

 

 

Georges Leschener is a Partner Solutions Architect in the Global System Integrator (GSI) team at Amazon Web Services. He works with GSI partners to help migrate customers’ workloads to the AWS Cloud, and to design and architect innovative solutions on AWS by applying AWS recommended best practices.

 

AWS Architecture Monthly Magazine: Education

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/aws-architecture-monthly-magazine-education/

Young man sitting on a stack of books with his laptop

One of the missions of the education industry is to educate the next generation of the industry-ready workforce. Whether K-12, higher education, or continuing education, enabling teachers and professors to effectively deliver curriculum and improve student performance is a goal of Education Technology (EdTech) and learning companies. Two trends for AWS use cases in education are: 1) accessible remote learning; and 2) remote collaboration. For brevity, we didn’t focus on other important innovation areas in education in our “Ask an Expert” interview. Use cases around learning accessibility, student performance, and campus experience have taken advantage of Amazon Alexa, Amazon Lex, and a variety of AWS technology areas including artificial intelligence (AI) and machine learning, data lakes, analytics, and mobile development. To dive deeper into a wider range of education use cases, we invite everyone to look at our AWS Education blog.

In this month’s issue

For May’s Education issue, we asked our expert, Yuriko Horvath, about general architecture patterns in the education space as well as what education customers need to think about and ask themselves before considering AWS.

  • Ask an Expert: Yuriko Horvath, AWS Manager of Education for Solutions Architecture
  • Blog: How to Build a Chatbot for Your School in Less Than an Hour (with step-by-step video instructions)
  • Case Study: Virginia Tech: Building Modern Analytics on Amazon Web Services
  • Solution: Video on Demand on AWS
  • Whitepaper: Teaching Big Data Skills with Amazon EMR

How to access the magazine

We hope you’re enjoying Architecture Monthly, and we’d like to hear from you—leave us a star rating and comment on the Amazon Kindle Newsstand page or contact us anytime at [email protected].

BBVA: Helping Global Remote Working with Amazon AppStream 2.0

Post Syndicated from Joe Luis Prieto original https://aws.amazon.com/blogs/architecture/bbva-helping-global-remote-working-with-amazon-appstream-2-0/

This post was co-written with Javier Jose Pecete, Cloud Security Architect at BBVA, and Javier Sanz Enjuto, Head of Platform Protection – Security Architecture at BBVA.

Introduction

Speed and elasticity are key when you are faced with unexpected scenarios such as a massive employee workforce working from home or running more workloads on the public cloud if data centers face staffing reductions. AWS customers can instantly benefit from implementing a fully managed turnkey solution to help cope with these scenarios.

Companies not only need to use technology as the foundation to maintain business continuity and adjust their business model for the future, but they also must work to help their employees adapt to new situations.

About BBVA

BBVA is a customer-centric, global financial services group present in more than 30 countries throughout the world, with more than 126,000 employees serving more than 78 million customers.

Founded in 1857, BBVA is a leader in the Spanish market as well as the largest financial institution in Mexico. It has leading franchises in South America and the Sun Belt region of the United States and is the leading shareholder in Turkey’s Garanti BBVA.

The challenge

BBVA implemented a global remote working plan that protects customers and employees alike, including a significant reduction of the number of employees working in its branch offices. It also ensured continued and uninterrupted operations for both consumer and business customers by strengthening digital access to its full suite of services.

Following the company’s policies and adhering to new rules announced by national authorities in recent weeks, more than 86,000 employees from across BBVA’s international network of offices and its central service functions now work remotely.

BBVA, subject to a set of highly regulated requirements, was looking for a global architecture to accommodate remote work. The solution needed to be fast to implement, adaptable to scale out gradually in the various countries in which BBVA operates, and able to meet its operational, security, and regulatory requirements.

The architecture

BBVA selected Amazon AppStream 2.0 for particular use cases of applications that, due to their sensitivity, are not exposed to the internet (such as financial, employee, and security applications). Having had previous experience with the service, BBVA chose AppStream 2.0 to accommodate the remote work experience.

AppStream 2.0 is a fully managed application streaming service that provides users with instant access to their desktop applications from anywhere, regardless of what device they are using.

AppStream 2.0 integrates with existing IT environments, can be managed through the AWS SDK or console, automatically scales globally on demand, and is fully managed on AWS. This means there is no hardware or software to deploy, patch, or upgrade.

AppStream 2.0 can be managed through the AWS SDK (1)

  1. The streamed video and user inputs are sent over HTTPS and are SSL-encrypted between the AppStream 2.0 instance executing your applications and your end users.
  2. Security groups are used to control network access to the customer VPC.
  3. AppStream 2.0 streaming instance access to the internet is through the customer VPC.

AppStream 2.0 fleets are created by use case to apply security restrictions depending on data sensitivity. By setting clipboard, file transfer, or print-to-local-device options, the fleets control data movement to and from employees’ AppStream 2.0 streaming sessions.
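
In the AppStream 2.0 API, these data-movement controls are expressed as user settings on the stack that a fleet is associated with. A sketch of a restrictive configuration for a sensitive-data use case (the stack name is illustrative) could look like this:

import AWS = require('aws-sdk');

const appstream = new AWS.AppStream({ region: 'eu-west-1' });

async function createRestrictedStack(): Promise<void> {
  await appstream.createStack({
    Name: 'sensitive-apps-stack',
    UserSettings: [
      { Action: 'CLIPBOARD_COPY_FROM_LOCAL_DEVICE', Permission: 'DISABLED' },
      { Action: 'CLIPBOARD_COPY_TO_LOCAL_DEVICE', Permission: 'DISABLED' },
      { Action: 'FILE_UPLOAD', Permission: 'DISABLED' },
      { Action: 'FILE_DOWNLOAD', Permission: 'DISABLED' },
      { Action: 'PRINTING_TO_LOCAL_DEVICE', Permission: 'DISABLED' }
    ]
  }).promise();
}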

BBVA relies on a proprietary service called Heimdal to authenticate employees through the corporate identity provider. Heimdal calls the AppStream 2.0 CreateStreamingURL API operation to create a temporary URL to start a streaming session for the specified user (a sketch of this call follows the list below), and abstracts the service from the user by setting:

  • FleetName to connect the most appropriate fleet based on the user’s location (BBVA has fleets deployed in Europe and America to improve the user’s experience.)
  • ApplicationId to launch the proper application without having to use an intermediate portal
  • SessionContext in situations where, for instance, the authentication service generates a token and needs to be forwarded to a browser application and injected as a session cookie
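
A minimal sketch of that call with boto3 is shown below; the stack name, region, and validity period are hypothetical, and Heimdal itself remains BBVA’s proprietary layer on top of this API.

import boto3

appstream = boto3.client("appstream", region_name="eu-west-1")

def build_streaming_url(user_id, fleet_name, application_id, session_token):
    """Return a temporary AppStream 2.0 streaming URL for an authenticated user."""
    response = appstream.create_streaming_url(
        StackName="corporate-apps",      # hypothetical stack name
        FleetName=fleet_name,            # chosen by user location (Europe/America fleets)
        UserId=user_id,                  # identity already verified by the corporate IdP
        ApplicationId=application_id,    # launches the application directly, no portal
        SessionContext=session_token,    # e.g. a token injected as a session cookie
        Validity=300,                    # URL validity in seconds
    )
    return response["StreamingURL"]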

BBVA uses AWS Transit Gateway to build a hub-and-spoke network topology (2)

To simplify its overall network architecture, BBVA uses AWS Transit Gateway to build a hub-and-spoke network topology with full control over network routing and security.

There are situations where the application streamed in AppStream 2.0 needs to connect:

  1. On-premises, using AWS Direct Connect plus VPN providing an IPsec-encrypted private connection
  2. To the Internet through an outbound VPC proxy with domain whitelisting and content filtering, to control the information employees can access and the threats they may encounter while browsing

AppStream 2.0 activity is logged into a centralized repository supported by Amazon S3, both for detecting unusual behavior patterns and for meeting regulatory requirements.

Conclusion

BBVA built a global solution that reduced implementation time by 90% compared to on-premises projects and meets its operational and security requirements. The solution is also helping with the company’s top concern: protecting the health and safety of its employees.

Cloudflare Bot Management: machine learning and more

Post Syndicated from Alex Bocharov original https://blog.cloudflare.com/cloudflare-bot-management-machine-learning-and-more/

Introduction

Building the Cloudflare Bot Management platform is an exhilarating experience. It blends Distributed Systems, Web Development, Machine Learning, Security and Research (and every discipline in between) while fighting ever-adaptive and motivated adversaries at the same time.

This is the ongoing story of Bot Management at Cloudflare and also an introduction to a series of blog posts about the detection mechanisms powering it. I’ll start with several definitions from the Bot Management world, then introduce the product and technical requirements, leading to an overview of the platform we’ve built. Finally, I’ll share details about the detection mechanisms powering our platform.

Let’s start with Bot Management’s nomenclature.

Some Definitions

Bot – an autonomous program on a network that can interact with computer systems or users, imitating or replacing a human user’s behavior, performing repetitive tasks much faster than human users could.

Good bots – bots which are useful to businesses they interact with, e.g. search engine bots like Googlebot, Bingbot or bots that operate on social media platforms like Facebook Bot.

Bad bots – bots which are designed to perform malicious actions, ultimately hurting businesses, e.g. credential stuffing bots, third-party scraping bots, spam bots and sneakerbots.

Bot Management – blocking undesired or malicious Internet bot traffic while still allowing useful bots to access web properties by detecting bot activity, discerning between desirable and undesirable bot behavior, and identifying the sources of the undesirable activity.

WAF – a security system that monitors and controls network traffic based on a set of security rules.

Gathering requirements

Cloudflare has been stopping malicious bots from accessing websites or misusing APIs from the very beginning, at the same time helping the climate by offsetting the carbon costs from the bots. Over time it became clear that we needed a dedicated platform which would unite different bot fighting techniques and streamline the customer experience. In designing this new platform, we tried to fulfill the following key requirements.

  • Complete, not complex – customers can turn on/off Bot Management with a single click of a button, to protect their websites, mobile applications, or APIs.
  • Trustworthy – customers want to know whether they can trust that a website visitor is who they claim to be, and they want a certainty indicator for that trust level.
  • Flexible – customers should be able to define what subset of the traffic Bot Management mitigations should be applied to, e.g. only login URLs, pricing pages or sitewide.
  • Accurate – Bot Management detections should have a very small error rate, e.g. no or very few human visitors should ever be mistakenly identified as bots.
  • Recoverable – in case a wrong prediction is made, human visitors should still be able to access websites, and good bots should still be let through.

Moreover, the goal for the new Bot Management product was to make it work well across a broad range of customer use cases.

Technical requirements

In addition to the product requirements above, we engineers had a list of must-haves for the new Bot Management platform. The most critical were:

  • Scalability – the platform should be able to calculate a score on every request, even at over 10 million requests per second.
  • Low latency – detections must be performed extremely quickly, not slowing down request processing by more than 100 microseconds, and not requiring additional hardware.
  • Configurability – it should be possible to configure what detections are applied on what traffic, including on per domain/data center/server level.
  • Modifiability – the platform should be easily extensible with more detection mechanisms, different mitigation actions, richer analytics and logs.
  • Security – no sensitive information from one customer should be used to build models that protect another customer.
  • Explainability & debuggability – we should be able to explain and tune predictions in an intuitive way.

Equipped with these requirements, back in 2018, our small team of engineers got to work to design and build the next generation of Cloudflare Bot Management.

Meet the Score

“Simplicity is the ultimate sophistication.”
– Leonardo Da Vinci

Cloudflare operates on a vast scale. At the time of this writing, this means covering 26M+ Internet properties, processing on average 11M requests per second (with peaks over 14M), and examining more than 250 request attributes from different protocol levels. The key question is how to harness the power of such “gargantuan” data to protect all of our customers from modern day cyberthreats in a simple, reliable and explainable way?

Bot management is hard. Some bots are much harder to detect than others and require looking at multiple dimensions of request attributes over a long time, while sometimes a single request attribute can give them away. More signals may help, but are they generalizable?

When we classify traffic, should customers decide what to do with it or are there decisions we can make on behalf of the customer? What concept could possibly address all these uncertainty problems and also help us to deliver on the requirements from above?

As you might’ve guessed from the section title, we came up with the concept of a Trusted Score, or simply The Score – one thing to rule them all – an indicator between 0 and 100 of the likelihood that a request originated from a human (high score) versus an automated program (low score).

“One Ring to rule them all” by idreamlikecrazy, used under CC BY / Desaturated from original

Okay, let’s imagine that we are able to assign such a score to every incoming HTTP/HTTPS request. What are we, or the customer, supposed to do with it? Maybe it’s enough to provide such a score in the logs. Customers could then analyze them on their end, find the most frequent IPs with the lowest scores, and then use the Cloudflare Firewall to block those IPs. Although useful, such a process would be manual, prone to error, and most importantly could not be done in real time to protect the customer’s Internet property.

Fortunately, around the same time we started working on this system, our colleagues from the Firewall team had just announced Firewall Rules. This new capability provided customers the ability to control requests in a flexible and intuitive way, inspired by the widely known Wireshark® language. Firewall rules supported a variety of request fields, and we thought – why not have the score be one of these fields? Customers could then write granular rules to block very specific attack types. That’s how the cf.bot_management.score field was born.

Having a score in the heart of Cloudflare Bot Management addressed multiple product and technical requirements with one strike – it’s simple, flexible, configurable, and it provides customers with telemetry about bots on a per request basis. Customers can adjust the score threshold in firewall rules, depending on their sensitivity to false positives/negatives. Additionally, this intuitive score allows us to extend our detection capabilities under the hood without customers needing to adjust any configuration.

So how can we produce this score and how hard is it? Let’s explore it in the following section.

Architecture overview

What is powering the Bot Management score? The short answer is a set of microservices. Building this platform we tried to re-use as many pipelines, databases, and components as we could; however, many services had to be built from scratch. Let’s have a look at the overall architecture (this overly simplified version contains only the Bot Management related services):

Core Bot Management services

In a nutshell, our systems process data received from the edge data centers, then produce and store the data required for the bot detection mechanisms, using the following technologies:

  • Databases & data stores – Kafka, ClickHouse, Postgres, Redis, Ceph.
  • Programming languages – Go, Rust, Python, Java, Bash.
  • Configuration & schema management – Salt, Quicksilver, Cap’n Proto.
  • Containerization – Docker, Kubernetes, Helm, Mesos/Marathon.

Each of these services is built with resilience, performance, observability and security in mind.

Edge Bot Management module

All bot detection mechanisms are applied on every request in real-time during the request processing stage in the Bot Management module running on every machine at Cloudflare’s edge locations. When a request comes in we extract and transform the required request attributes and feed them to our detection mechanisms. The Bot Management module produces the following output:

Firewall fields – the Bot Management fields exposed to Firewall Rules:

  • cf.bot_management.score – an integer between 0 and 100 indicating the likelihood that a request originated from an automated program (low score) or from a human (high score).
  • cf.bot_management.verified_bot – a boolean indicating whether the request comes from a Cloudflare whitelisted bot.
  • cf.bot_management.static_resource – a boolean indicating whether the request matches file extensions for many types of static resources.

Cookies – most notably it produces cf_bm, which helps manage incoming traffic that matches criteria associated with bots.

JS challenges – for some of our detections and customers we inject invisible JavaScript challenges, providing us with more signals for bot detection.

Detection logs – through our data pipelines we log details about each applied detection, the features used, and flags to ClickHouse; some of these are used for analytics and customer logs, while others are used to debug and improve our models.

Once the Bot Management module has produced the required fields, the Firewall takes over the actual bot mitigation.

Firewall integration

The Cloudflare Firewall’s intuitive dashboard enables users to build powerful rules through easy clicks and also provides Terraform integration. Every request to the firewall is inspected against the rule engine. Suspicious requests can be blocked, challenged or logged as per the needs of the user while legitimate requests are routed to the destination, based on the score produced by the Bot Management module and the configured threshold.

Firewall rules provide the following bot mitigation actions:

  • Log – records matching requests in the Cloudflare Logs provided to customers.
  • Bypass – allows customers to dynamically disable Cloudflare security features for a request.
  • Allow – matching requests are exempt from challenge and block actions triggered by other Firewall Rules content.
  • Challenge (Captcha) – useful for ensuring that the visitor accessing the site is human, and not automated.
  • JS Challenge – useful for ensuring that bots and spam cannot access the requested resource; browsers, however, are free to satisfy the challenge automatically.
  • Block – matching requests are denied access to the site.

Our Firewall Analytics tool, powered by ClickHouse and a GraphQL API, enables customers to quickly identify and investigate security threats using an intuitive interface. In addition to analytics, we provide detailed logs on all bot-related activity using either the Logpull API or LogPush, which provides an easy way to get your logs to your cloud storage.

Cloudflare Workers integration

In case a customer wants more flexibility in what to do with requests based on the score – e.g. they might want to inject new, or change existing, HTML page content, serve incorrect data to bots, or stall certain requests – Cloudflare Workers provide an option to do that. For example, using this small code snippet, we can pass the score back to the origin server for more advanced real-time analysis or mitigation:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})
 
async function handleRequest(request) {
  request = new Request(request);
 
  request.headers.set("Cf-Bot-Score", request.cf.bot_management.score)
 
  return fetch(request);
}
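
On the origin side, that header can then drive custom logic. The snippet below is a hedged sketch of a Python (Flask) origin reading the Cf-Bot-Score header set by the Worker above; the route, threshold, and response behavior are illustrative assumptions, not a Cloudflare-prescribed pattern.

from flask import Flask, abort, request

app = Flask(__name__)

BOT_SCORE_THRESHOLD = 30  # illustrative threshold; tune to your false-positive tolerance

@app.route("/api/orders", methods=["POST"])
def create_order():
    # Header set by the Worker above; treat a missing header as untrusted.
    score = int(request.headers.get("Cf-Bot-Score", 0))
    if score < BOT_SCORE_THRESHOLD:
        abort(403)  # or serve decoy data, stall the request, etc.
    return {"status": "accepted"}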

Now let’s have a look into how a single score is produced using multiple detection mechanisms.

Detection mechanisms

The Cloudflare Bot Management platform currently uses five complementary detection mechanisms, producing their own scores, which we combine to form the single score going to the Firewall. Most of the detection mechanisms are applied on every request, while some are enabled on a per customer basis to better fit their needs.

Having a score on every request for every customer has the following benefits:

  • Ease of onboarding – even before we enable Bot Management in active mode, we’re able to tell how well it’s going to work for the specific customer, including providing historical trends about bot activity.
  • Feedback loop – availability of the score on every request along with all features has tremendous value for continuous improvement of our detection mechanisms.
  • Ensures scaling – if we can compute the score for every request and every customer, it means that every Internet property behind Cloudflare is a potential Bot Management customer.
  • Global bot insights – Cloudflare sits in front of more than 26 million Internet properties, which allows us to understand and react to the tectonic shifts happening in security and threat intelligence over time.

Overall, more than a third of the Internet traffic visible to Cloudflare comes from bad bots, while Bot Management customers see an even higher ratio of bad bots at ~43%!

Let’s dive into specific detection mechanisms in chronological order of their integration with Cloudflare Bot Management.

Machine learning

The majority of decisions about the score are made using our machine learning models. These were also the first detection mechanisms to produce a score and to onboard customers back in 2018. The successful application of machine learning requires data high in Quantity, Diversity, and Quality, and thanks to both free and paid customers, Cloudflare has all three, enabling continuous learning and improvement of our models for all of our customers.

At the core of the machine learning detection mechanism is CatBoost – a high-performance open source library for gradient boosting on decision trees. The choice of CatBoost was driven by the library’s outstanding capabilities (a minimal training sketch follows the list):

  • Categorical features support – allowing us to train on even very high cardinality features.
  • Superior accuracy – allowing us to reduce overfitting by using a novel gradient-boosting scheme.
  • Inference speed – in our case it takes less than 50 microseconds to apply any of our models, making sure request processing stays extremely fast.
  • C and Rust API – most of our business logic on the edge is written using Lua, more specifically LuaJIT, so having a compatible FFI interface to be able to apply models is fantastic.
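
To make that concrete, here is a minimal, self-contained CatBoost training and scoring sketch. The feature names, toy data, and hyperparameters are purely illustrative assumptions and bear no relation to Cloudflare’s actual models or feature set; the point is only to show categorical-feature support and probability output.

from catboost import CatBoostClassifier, Pool
import pandas as pd

# Toy request attributes: a mix of categorical and numerical features.
X = pd.DataFrame({
    "user_agent": ["curl/7.64", "Mozilla/5.0", "python-requests", "Mozilla/5.0"],
    "country": ["US", "DE", "US", "GB"],
    "header_count": [4, 17, 5, 19],
})
y = [1, 0, 1, 0]  # 1 = bot, 0 = human

train_pool = Pool(X, y, cat_features=["user_agent", "country"])

model = CatBoostClassifier(iterations=100, depth=4, learning_rate=0.1, verbose=False)
model.fit(train_pool)

# Probability of "bot" for a new request; such a probability can be mapped onto a 0-100 style score.
new_request = pd.DataFrame({"user_agent": ["curl/7.64"], "country": ["FR"], "header_count": [3]})
print(model.predict_proba(new_request)[:, 1])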

There are multiple CatBoost models running on Cloudflare’s edge in shadow mode on every request on every machine. One of the models runs in active mode, which influences the final score going to the Firewall. All ML detection results and features are logged and recorded in ClickHouse for further analysis, model improvement, analytics, and customer-facing logs. We feed both categorical and numerical features into our models, extracted from request attributes and from inter-request features built using those attributes, calculated and delivered by the Gagarin inter-request features platform.

We’re able to deploy new ML models in a matter of seconds using an extremely reliable and performant Quicksilver configuration database. The same mechanism can be used to configure which version of an ML model should be run in active mode for a specific customer.

A deep dive into our machine learning detection mechanism deserves a blog post of its own. It will cover how we train and validate our models on trillions of requests using GPUs, how model feature delivery and extraction works, and how we explain and debug model predictions both internally and externally.

Heuristics engine

Not all problems in the world are best solved with machine learning. We can tweak the ML models in various ways, but in certain cases they will likely underperform basic heuristics. Often the problems machine learning is trying to solve are not entirely new. When building the Bot Management solution it became apparent that sometimes a single attribute of the request could give a bot away. This means that we can create a set of simple rules capturing bots in a straightforward way, while also ensuring the lowest false-positive rate.

The heuristics engine was the second detection mechanism integrated into the Cloudflare Bot Management platform in 2019, and it’s also applied on every request. We have multiple heuristic types and hundreds of specific rules based on certain attributes of the request, some of which are very hard to spoof. When a request matches any of the heuristics, we assign the lowest possible score of 1.
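
As a purely illustrative sketch (these are not Cloudflare’s actual rules, which remain proprietary), a heuristic check can be as simple as a few conditions over request headers:

from typing import Optional

LOWEST_SCORE = 1  # matches the "lowest possible score of 1" described above

def heuristic_score(headers: dict) -> Optional[int]:
    """Toy heuristics over request headers; return a score only on a confident match."""
    ua = headers.get("User-Agent", "")
    # Known automation frameworks identify themselves outright.
    if ua.startswith(("python-requests", "curl/", "Go-http-client")):
        return LOWEST_SCORE
    # A UA claiming a modern browser but missing a header every real browser sends.
    if "Mozilla" in ua and "Accept-Language" not in headers:
        return LOWEST_SCORE
    return None  # no heuristic matched; defer to the other detection mechanisms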

The engine has the following properties:

  • Speed – if ML model inference takes less than 50 microseconds per model, hundreds of heuristics can be applied in just under 20 microseconds!
  • Deployability – the heuristics engine allows us to add a new heuristic in a matter of seconds using Quicksilver, and it will be applied on every request.
  • Vast coverage – using a set of simple heuristics allows us to classify ~15% of global traffic and ~30% of Bot Management customers’ traffic as bots. Not too bad for a few if conditions, right?
  • Lowest false positives – because we are confident in and conservative about the heuristics we add, this detection mechanism has the lowest FP rate among all detection mechanisms.
  • Labels for ML – because of this high certainty, we use requests classified with heuristics to train our ML models, which can then generalize the behavior learnt from heuristics and improve detection accuracy.

So heuristics, combined with machine learning, gave us a lift and captured a lot of our intuition about bots, which helped to advance the Cloudflare Bot Management platform and allowed us to onboard more customers.

Behavioral analysis

Machine learning and heuristics detections provide tremendous value, but both of them require human input on the labels, or basically a teacher to distinguish between right and wrong. While our supervised ML models can generalize well enough even on novel threats similar to what we taught them on, we decided to go further. What if there was an approach which doesn’t require a teacher, but rather can learn to distinguish bad behavior from the normal behavior?

Enter the behavioral analysis detection mechanism, initially developed in 2018 and integrated with the Bot Management platform in 2019. This is an unsupervised machine learning approach, which has the following properties:

  • Fitting specific customer needs – it’s automatically enabled for all Bot Management customers, calculating and analyzing normal visitor behavior over an extended period of time.
  • Detects bots never seen before – as it doesn’t use known bot labels, it can detect bots and anomalous deviations from normal behavior on a specific customer’s website.
  • Harder to evade – anomalous behavior is often a direct result of the bot’s specific goal.

Please stay tuned for a more detailed blog about behavioral analysis models and the platform powering this incredible detection mechanism, protecting many of our customers from unseen attacks.

Verified bots

So far we’ve discussed how to detect bad bots and humans. What about good bots, some of which are extremely useful for the customer’s website? Is there a need for a dedicated detection mechanism, or is there something we could reuse from the previously described mechanisms? While the majority of good bot requests (e.g. Googlebot, Bingbot, LinkedInbot) already have a low score produced by other detection mechanisms, we also need a way to avoid accidental blocks of useful bots. That’s how the Firewall field cf.bot_management.verified_bot came into existence in 2019, allowing customers to decide for themselves whether they want to let all of the good bots through or restrict access to certain parts of the website.

The actual platform calculating the Verified Bot flag deserves a detailed blog of its own, but in a nutshell it has the following properties:

  • Validator based approach – we support multiple validation mechanisms, each of them allowing us to reliably confirm good bot identity by clustering a set of IPs.
  • Reverse DNS validator – performs a reverse DNS check to determine whether or not a bot’s IP address matches its alleged hostname (sketched below).
  • ASN Block validator – similar to rDNS check, but performed on ASN block.
  • Downloader validator – collects good bot IPs from either text files or HTML pages hosted on bot owner sites.
  • Machine learning validator – uses an unsupervised learning algorithm, clustering good bot IPs which are not possible to validate through other means.
  • Bots Directory – a database with UI that stores and manages bots that pass through the Cloudflare network.
Bots directory UI sample

Using multiple validation methods listed above, the Verified Bots detection mechanism identifies hundreds of unique good bot identities, belonging to different companies and categories.
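
As an example of the reverse DNS validator idea, a forward-confirmed reverse DNS check can be expressed in a few lines of Python; the allowed hostname suffixes and the sample IP below are illustrative, and the production validator is of course far more involved.

import socket

def verify_by_reverse_dns(ip, allowed_suffixes=(".googlebot.com", ".google.com")):
    """Forward-confirmed reverse DNS: rDNS hostname must match and resolve back to the IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse lookup
        if not hostname.endswith(allowed_suffixes):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward lookup
        return ip in forward_ips                             # must point back to the same IP
    except (socket.herror, socket.gaierror):
        return False

print(verify_by_reverse_dns("66.249.66.1"))  # an address published in Googlebot's ranges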

JS fingerprinting

When it comes to Bot Management detection quality it’s all about the signal quality and quantity. All previously described detections use request attributes sent over the network and analyzed on the server side using different techniques. Are there more signals available, which can be extracted from the client to improve our detections?

As a matter of fact there are plenty, as every browser has unique implementation quirks. A web browser’s graphics output, such as canvas rendering, depends on multiple layers such as hardware (GPU) and software (drivers, operating system rendering). This highly unique output allows precise differentiation between different browser/device types. Moreover, this is achievable without sacrificing website visitor privacy: it’s not a supercookie, and it cannot be used to track and identify individual users, but only to confirm that a request’s user agent matches the other telemetry gathered through the browser canvas API.

This detection mechanism is implemented as a challenge-response system, with the challenge injected into the webpage at Cloudflare’s edge. The challenge is then rendered in the background using the provided graphics instructions, and the result is sent back to Cloudflare for validation and further action, such as producing the score. There is a lot going on behind the scenes to make sure we get reliable results without sacrificing users’ privacy while remaining tamper resistant to replay attacks. The system is currently in private beta and being evaluated for its effectiveness, and we already see very promising results. Stay tuned for this new detection mechanism becoming widely available and the blog on how we’ve built it.

This concludes an overview of the five detection mechanisms we’ve built so far. It’s time to sum it all up!

Summary

Cloudflare has the unique ability to collect data from trillions of requests flowing through its network every week. With this data, Cloudflare is able to identify likely bot activity with Machine Learning, Heuristics, Behavioral Analysis, and other detection mechanisms. Cloudflare Bot Management integrates seamlessly with other Cloudflare products, such as WAF  and Workers.

All this could not be possible without hard work across multiple teams! First of all thanks to everybody on the Bots Team for their tremendous efforts to make this platform come to life. Other Cloudflare teams, most notably: Firewall, Data, Solutions Engineering, Performance, SRE, helped us a lot to design, build and support this incredible platform.

Bots team during Austin team summit 2019 hunting bots with axes 🙂

Lastly, there are more blogs from the Bots series coming soon, diving into internals of our detection mechanisms, so stay tuned for more exciting stories about Cloudflare Bot Management!

Building a Scalable Document Pre-Processing Pipeline

Post Syndicated from Joel Knight original https://aws.amazon.com/blogs/architecture/building-a-scalable-document-pre-processing-pipeline/

In a recent customer engagement, Quantiphi, Inc., a member of the Amazon Web Services Partner Network, built a solution capable of pre-processing tens of millions of PDF documents before sending them for inference by a machine learning (ML) model. While the customer’s use case—and hence the ML model—was very specific to their needs, the pipeline that does the pre-processing of documents is reusable for a wide array of document processing workloads. This post will walk you through the pre-processing pipeline architecture.

Pre-processing pipeline architecture

Architectural goals

Quantiphi established the following goals prior to starting:

  • Loose coupling to enable independent scaling of compute components, flexible selection of compute services, and agility as the customer’s requirements evolved.
  • Work backwards from business requirements when making decisions affecting scale and throughput and not simply because “fastest is best.” Scale components only where it makes sense and for maximum impact.
  •  Log everything at every stage to enable troubleshooting when something goes wrong, provide a detailed audit trail, and facilitate cost optimization exercises by identifying usage and load of every compute component in the architecture.

Document ingestion

The documents are initially stored in a staging bucket in Amazon Simple Storage Service (Amazon S3). The processing pipeline is kicked off when the “trigger” Amazon Lambda function is called. This Lambda function passes parameters such as the name of the staging S3 bucket and the path(s) within the bucket which are to be processed to the “ingestion app.”

The ingestion app is a simple application that runs a web service to enable triggering a batch and lists documents from the S3 bucket path(s) received via the web service. As the app processes the list of documents, it feeds the document path, S3 bucket name, and some additional metadata to the “ingest” Amazon Simple Queue Service (Amazon SQS) queue. The ingestion app also starts the audit trail for the document by writing a record to the Amazon Aurora database. As the document moves downstream, additional records are added to the database. Records are joined together by a unique ID that the ingestion app assigns to each document and that is passed along throughout the pipeline.

Chunking the documents

In order to maximize grip and control, the architecture is built to submit single-page files to the ML model. This enables correlating an inference failure to a specific page instead of a whole document (which may be many pages long). It also makes identifying the location of features within the inference results an easier task. Since the documents being processed can have varied sizes, resolutions, and page count, a big part of the pre-processing pipeline is to chunk a document up into its component pages prior to sending it for inference.

The “chunking orchestrator” app repeatedly pulls a message from the ingest queue and retrieves the document named therein from the S3 bucket. The PDF document is then classified along two metrics:

  • File size
  • Number of pages

We use these metrics to determine which chunking queue the document is sent to (a sketch of this routing logic follows the list):

  • Large: Greater than 10MB in size or greater than 10 pages
  • Small: Less than or equal to 10MB and less than or equal to 10 pages
  • Single page: Less than or equal to 10MB and exactly one page
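
The sketch below shows one way to express this routing in Python with boto3; the queue URLs are placeholders, and the single-page check runs first because its range overlaps with the “small” category.

import json
import boto3

sqs = boto3.client("sqs")

# Placeholder queue URLs for the three chunking paths.
QUEUES = {
    "large": "https://sqs.us-east-1.amazonaws.com/111122223333/chunk-large",
    "small": "https://sqs.us-east-1.amazonaws.com/111122223333/chunk-small",
    "single": "https://sqs.us-east-1.amazonaws.com/111122223333/chunk-single-page",
}

MAX_SIZE_BYTES = 10 * 1024 * 1024  # 10 MB
MAX_PAGES = 10

def route_document(bucket, key, size_bytes, page_count):
    """Pick the chunking queue using the size and page-count thresholds above."""
    if size_bytes > MAX_SIZE_BYTES or page_count > MAX_PAGES:
        queue = "large"
    elif page_count == 1:
        queue = "single"   # checked before "small" because the two ranges overlap
    else:
        queue = "small"
    sqs.send_message(
        QueueUrl=QUEUES[queue],
        MessageBody=json.dumps({"bucket": bucket, "key": key}),
    )
    return queue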

Each of these queues is serviced by an appropriately sized compute service that breaks the document down into smaller pieces, and ultimately, into individual pages.

  • Amazon Elastic Compute Cloud (Amazon EC2) instances process large documents, primarily because of the high memory footprint needed to read large, multi-gigabyte PDF files into memory. The output from these workers is a set of smaller PDF documents that are stored in Amazon S3. The name and location of these smaller documents is submitted to the “small documents” queue.
  • Small documents are processed by a Lambda function that decomposes the document into single pages that are stored in Amazon S3. The name and location of these single page files is sent to the “single page” queue.

The Dead Letter Queues (DLQs) are used to hold messages from their respective size queue which are not successfully processed. If messages start landing in the DLQs, it’s an indication that there is a problem in the pipeline. For example, if messages start landing in the “small” or “single page” DLQ, it could indicate that the Lambda function processing those respective queues has reached its maximum run time.

An Amazon CloudWatch Alarm monitors the depth of each DLQ. Upon seeing DLQ activity, a notification is sent via Amazon Simple Notification Service (Amazon SNS) so an administrator can then investigate and make adjustments such as tuning the sizing thresholds to ensure the Lambda functions can finish before reaching their maximum run time.

In order to ensure no documents are left behind in the active run, there is a failsafe in the form of an Amazon EC2 worker that retrieves and processes messages from the DLQs. This failsafe app breaks a PDF all the way down into individual pages and then does image conversion.

Documents that don’t fall into a DLQ make it to the “single page” queue. This queue drives each page through the “image conversion” Lambda function, which converts the single-page file from PDF to PNG format. These PNG files are stored in Amazon S3.

Sending for inference

At this point, the documents have been chunked up and are ready for inference.

When the single-page image files land in Amazon S3, an S3 Event Notification is fired which places a message in a “converted image” SQS queue which in turn triggers the “model endpoint” Lambda function. This function calls an API endpoint on an Amazon API Gateway that is fronting the Amazon SageMaker inference endpoint. Using API Gateway with SageMaker endpoints avoided throttling during Lambda function execution due to high volumes of concurrent calls to the Amazon SageMaker API. This pattern also resulted in a 2x inference throughput speedup. The Lambda function passes the document’s S3 bucket name and path to the API which in turn passes it to the auto scaling SageMaker endpoint. The function reads the inference results that are passed back from API Gateway and stores them in Amazon Aurora.

The inference results as well as all the telemetry collected as the document was processed can be queried from the Amazon Aurora database to build reports showing number of documents processed, number of documents with failures, and number of documents with or without whatever feature(s) the ML model is trained to look for.

Summary

This architecture is able to take PDF documents that range in size from single page up to thousands of pages or gigabytes in size, pre-process them into single page image files, and then send them for inference by a machine learning model. Once triggered, the pipeline is completely automated and is able to scale to tens of millions of pages per batch.

In keeping with the architectural goals of the project, Amazon SQS is used throughout in order to build a loosely coupled system which promotes agility, scalability, and resiliency. Loose coupling also enables a high degree of grip and control over the system, making it easier to respond to changes in business needs as well as focusing tuning efforts for maximum impact. And with every compute component logging everything it does, the system provides a high degree of auditability and introspection, which facilitates performance monitoring and detailed cost optimization.

AWS Architecture Monthly Magazine: Automotive

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/aws-architecture-monthly-magazine-automotive/

Connected, autonomous, shared, and electric vehicle trends are converging to revolutionize the automotive industry. In this unprecedented age of innovation, automotive companies rely on AWS to fuel their digital transformation efforts and get their products to market faster, while retaining ownership and control of their data and brand experience.

AWS provides the broadest and deepest set of capabilities, including artificial intelligence (AI), Internet of Things (IoT), HPC, and data analytics, the highest performance and security, the largest customer and partner community, and a relentless pace of innovation.

In this month’s issue:

For April’s Automotive issue, we spoke with Dean Phillips, AWS Automotive Tech Leader, about some of the architecture pattern trends of the industry as well as what customers should ask themselves before considering AWS. Dean also talks about different trends within the industry in cloud versus on-premises.

We also look back at the Amazon Automotive exhibit at the Consumer Electronics Show (CES 2020), review a Toyota case study, and provide information on the AWS Connected Vehicle Solution.

  • Ask an Expert: Dean Phillips, AWS Tech Leader, Automotive
  • Blog: 5 Automotive Trends at CES 2020
  • Case Study: Toyota Research Institute
  •  Solution: AWS Connected Vehicle Solution
  •  Whitepaper: Connected Vehicles and the Cloud

How to Access the Magazine

We hope you’re enjoying Architecture Monthly, and we’d like to hear from you—leave us a star rating and comment on the Amazon Kindle Newsstand page or contact us anytime at [email protected].

Serving Billions of Ads in Just 100 ms Using Amazon Elasticache for Redis

Post Syndicated from Rodrigo Asensio original https://aws.amazon.com/blogs/architecture/serving-billions-of-ads-with-amazon-elasticache-for-redis/

This post was co-written with Lucas Ceballos, CTO of Smadex

Introduction

Showing ads may seem to be a simple task, but it’s not. Showing the right ad to the right user is an incredibly complex challenge that involves multiple disciplines such as artificial intelligence, data science, and software engineering. Doing it one million times per second with a 100-ms constraint is even harder.

In the ad-tech business, speed and infrastructure costs are the keys to success. The less the final user waits for an ad, the higher the probability of that user clicking on the ad. Doing that while keeping infrastructure costs under control is crucial for business profitability.

About Smadex

Smadex is the leading mobile-first programmatic advertising platform, specifically built to deliver the best user-acquisition performance and complete transparency.

Its state-of-the-art digital signal processing (DSP) technology provides advertisers with the tools they need to achieve their goals and ROI, with measurable results from web forms, post-app install events, store visits, and sales.

Smadex advertising architecture

What does showing ads look like under the hood? At Smadex, our technology works based on the OpenRTB (Real-Time Bidding) protocol.

RTB is a means by which advertising inventory is bought and sold on a per-impression basis, via programmatic instantaneous auction, which is similar to financial markets.

To show ads, we participate in auctions deciding in real time which ad to show and how much to bid trying to optimize the cost of every impression.

High level diagram

  1. The final user browses the publisher’s website or app.
  2. Ad-exchange is called to start a new auction.
  3. Smadex receives the bid request and has to decide which ad to show and how much to offer in just 100 ms (and this is happening one million times per second).
  4. If Smadex won the auction, the ad must be sent and rendered on the publisher’s website or app.
  5. In the end, the user interacts with the ad sending new requests and events to Smadex platform.

Flow of data

As you can see in the previous diagram, showing ads is just one part of the challenge. After the ad is shown, the final user interacts with it in multiple ways, such as clicking it, installing an application, subscribing to a service, etc. This happens during a determined period that we call the “attribution window.” All of those interactions must be tracked and linked to the original bid transaction (using the request_id parameter).

Doing this is complicated: billions of bid transactions must be stored and available so that they can be quickly accessed every time the user interacts with the ad. The longer we store the transactions, the longer we can “wait” for an interaction to take place, and the better for our business and our clients, too.

Detailed diagram

Challenge #1: Cost

The challenge is: What kind of database can store billions of records per day, with at least a 30-day retention capacity (attribution window), be accessed by key-value, and all by spending as little as possible?

The answer is…none! Based on our research, all the available options that met the technical requirements were way out of our budget.

So…how to solve it? Here is when creativity and the combination of different AWS services comes into place.

We started to analyze the time dispersion of the events trying to find some clues. The interesting thing we spotted was that 90% of what we call “post-bid events” (impression, click, install, etc.) happened within one hour after the auction took place.

That means that we can process 90% of post-bid events by storing just one hour of bids.

Under our current workload, in one hour we participate in approximately 3.7 billion auctions generating 100 million bid records of an average 600 bytes each. This adds up to 55 gigabytes per hour, an easier amount of data to process.

Instead of thinking about one single database to store all the bid requests, we decided to split bids into two different categories:

  • Hot Bid: A request that took place within the last hour (small amount and frequently accessed)
  • Cold Bid: A request that took place more than one hour ago (huge amount and infrequently accessed)

Amazon ElastiCache for Redis is the best option to store 55 GB of data in memory, which gives us the ability to query in a key-value way with the lowest possible latency.

Hot Bids flow

Hot Bids flow diagram

  1. Every new bid is a hot bid by definition so it’s going to be stored in the hot bids Redis cluster.
  2. At the moment of the user interaction with the ad, the Smadex tracker component receives an HTTPS notification, including the bid request UUID that originated it.
  3. Based on the date of occurrence extracted from the received UUID, the tracker component can determine whether it’s looking for a hot bid or not. If it’s a hot bid, the tracker reads it directly from Redis by performing a key-value lookup query (a minimal sketch follows this list).
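
A minimal sketch of the hot-bid store and lookup with redis-py is shown below; the one-hour TTL mirrors the hot window described above, while the endpoint name, key layout, and JSON serialization are assumptions.

import json
from typing import Optional

import redis

# Placeholder ElastiCache for Redis endpoint.
r = redis.Redis(host="hot-bids.example.cache.amazonaws.com", port=6379)

HOT_WINDOW_SECONDS = 3600  # bids stay "hot" for one hour

def store_hot_bid(request_id, bid):
    # SETEX writes the bid and expires it automatically after one hour.
    r.setex(request_id, HOT_WINDOW_SECONDS, json.dumps(bid))

def lookup_hot_bid(request_id) -> Optional[dict]:
    raw = r.get(request_id)
    return json.loads(raw) if raw else None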

It’s been easy so far but what to do with the other 29 days and 23 hours we need to store?

Challenge #2: Performance

As we previously mentioned, cold bids are a huge number of infrequently accessed records, with only 10% of post-bid events pointing to them. That sounds like a good use case for an inexpensive and slower data store like Amazon S3.

Thanks to S3’s low storage prices combined with the ability to query S3 objects directly using Amazon Athena, we were able to optimize our costs by storing and querying cold bids with a serverless architecture.

Cold Bids Flow

Cold Bids flow diagram

  1. Incoming bids are buffered by Fluentd and flushed to S3 every one minute in JSON format. Every single file flushed to S3 contains all the bids processed by a specific EC2 instance for one minute.
  2. An AWS Lambda function is automatically triggered on every new PutObject event from S3. This function transforms the JSON records to Parquet format and saves them back to the S3 bucket, but this time into a specific partition folder based on the file creation timestamp.
  3. As seen in the hot bids flow, the tracker component determines whether it’s looking for a hot or a cold bid based on the timestamp extracted from the request UUID. In this case, the cold bid is retrieved by running an Amazon Athena lookup query, leveraging partitions and the Parquet format to reduce as much as possible the latency and the amount of data that needs to be scanned (sketched below).
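
The cold-bid lookup sketched below uses boto3 to start an Athena query scoped to the partition derived from the UUID timestamp; the table, database, and result-bucket names are placeholders.

import boto3

athena = boto3.client("athena")

def lookup_cold_bid(request_id, event_date):
    """Query only the partition for the bid's date to minimize scanned data."""
    response = athena.start_query_execution(
        QueryString=(
            "SELECT * FROM cold_bids "              # hypothetical table over the Parquet files
            f"WHERE dt = '{event_date}' "           # partition pruning keeps the scan small
            f"AND request_id = '{request_id}'"
        ),
        QueryExecutionContext={"Database": "bids_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    return response["QueryExecutionId"]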

Conclusion

Thanks to this combined approach using different technologies and a variety of AWS services we were able to extend our attribution window from 30 to 90 days while reducing the infrastructure costs by 45%.

Serverless Stream-Based Processing for Real-Time Insights

Post Syndicated from Justin Pirtle original https://aws.amazon.com/blogs/architecture/serverless-stream-based-processing-for-real-time-insights/

Building on our previous posts regarding messaging patterns and queue-based processing, we now explore stream-based processing and how it helps you achieve low-latency, near real-time data processing in your applications. AWS offers two managed services for streaming, Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (Amazon MSK).

What is streaming data?

At AWS, we define streaming data as data that is emitted at high volume in a continuous, incremental manner with the goal of low-latency processing. Whereas traditional batch-oriented business intelligence would offer insights in retrospect after months, days, or hours have passed, stream-based processing can offer actionable insights in real time. Stream-based processing is commonly used to respond to clickstream events, rapidly ingest various types of logs, and extract, transform, and load (ETL) data in real-time into data lakes and data warehouses.

Amazon Kinesis is the AWS service that makes it easy to collect, process, and analyze such real-time streaming data, with four different capabilities: Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, and Kinesis Video Streams.

For this blog post, we focus on Kinesis Data Streams and Kinesis Data Firehose, since both of these services are foundational for streaming, ingestion, buffering, and processing in your streaming data pipeline.

Kinesis Data Streams

Amazon Kinesis Data Streams is a massively scalable service that can continuously capture gigabytes of data per second from hundreds of thousands of sources. Like many distributed systems, Kinesis Data Streams achieves this level of scalability by partitioning or sharding your data where records are simultaneously written to and read from different shards in parallel. All Kinesis Data Streams require allocation of at least one shard and you choose how many shards you want to allocate to a given stream.

When writing to a shard in a Kinesis Data Stream, each shard supports ingestion of up to 1 MB of data per second or 1,000 records written per second. When reading from a shard, each shard supports output of 2 MB of data per second. You choose an initial number of shards to allocate for your Kinesis Data Stream, then can update your shard allocation over time. Increasing your shard allocation enables your application to easily scale from thousands of records to millions of records written per second.

Producing streaming data

Streaming data producers are processes that put records onto a Kinesis stream by calling the putRecord API to write a single record or the putRecords API to write multiple records in a single invocation. Common approaches for producing messages include direct use of AWS tools, including:

  • AWS SDK, which simplifies authentication and other semantics of invoking AWS service APIs
  • Amazon Kinesis Agent, which enables local file/log monitoring and rotation sending in real time
  • Amazon Kinesis Producer Library, which simplifies aggregating records into larger payloads to improve throughput.

Additionally, several AWS services natively integrate with Amazon Kinesis as a data producer:

There are also several third-party services that offer native integration as data producers, including:

Regardless of the producer service/tool of choice, all data producers put records onto a stream by providing a partition key, stream name, and the data itself, which altogether must not exceed 1 MB in size. The partition key provided is used to determine which shard the data should be written to on the stream. Amazon Kinesis Data Streams offers ordering guarantees and maintains message ordering within a given shard in a stream using sequence numbers to track the unique position of each message sent.
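
For example, a minimal boto3 producer putting a single record onto a stream might look like the sketch below; the stream name and event shape are assumptions.

import json
import boto3

kinesis = boto3.client("kinesis")

# Hypothetical clickstream event; the partition key determines the target shard.
event = {"user_id": "u-1234", "action": "click", "page": "/pricing"}

kinesis.put_record(
    StreamName="clickstream",                 # hypothetical stream name
    Data=json.dumps(event).encode("utf-8"),   # payload, at most 1 MB including the key
    PartitionKey=event["user_id"],            # same key -> same shard -> per-user ordering
)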

Consuming streaming data

Once records are written to a Kinesis Data Stream, they are buffered in their respective shards for consumption. Unlike queue-based processing, the records are buffered until the data retention period set on the stream elapses, enabling one or more consumers to replay all the messages in the shards of the stream. If your application must deliver your records to a data lake, data warehouse, Elasticsearch Service cluster, or Splunk, Kinesis Data Firehose can natively deliver your records to the following without needing to write any custom code:

  • Amazon S3
  • Amazon Redshift
  • Amazon Elasticsearch Service
  • Splunk

You simply indicate the desired delivery destination and configure how to batch and deliver the messages. Kinesis Data Firehose can also use your desired S3 object naming, Amazon Redshift table name, Amazon Elasticsearch index name, and more.

For custom processing or destinations outside of the Amazon Kinesis Data Firehose supported services above, you will need to write and execute custom code to consume data from the stream. Though you can use the Kinesis Client Library (KCL) to run your own custom processing application on persistent virtual machines or container instances, AWS Lambda offers serverless computing with native event source integration with Amazon Kinesis Data Streams. AWS Lambda as a stream consumer takes care of the operational overhead of reading shards, maintaining record order, checkpointing as records are processed, and parallelizing processing.

Serverless stream processing with AWS Lambda

When configured with a Kinesis Stream as its event source, AWS Lambda continuously polls every shard in your stream at no extra charge and only invokes your Lambda code if and when there are messages in the stream. It additionally scales up the number of concurrent executions to parallelize reading all shards of a stream at the same time (and can have multiple executions reading the same shard simultaneously for a higher parallelization factor, if desired). AWS Lambda automatically checkpoints which records were successfully processed and handles retries and any failures automatically according to your desired configuration.

Best of all, there is no additional cost for the Lambda service handling all of these operational needs for you. You only pay for compute time when your function is invoked and messages are available on the stream for processing. You’re able to focus on processing your data with your business logic directly in your code, since your records are sent as an array to your Lambda code (see the sketch below). There is no additional code to author or manage regarding checkpointing, shard splits/merges, or other complexities.
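
A minimal Kinesis-triggered Lambda handler in Python is sketched below; the business logic is a placeholder, but the event shape (base64-encoded payloads under Records[].kinesis) matches the Kinesis event source.

import base64
import json

def handler(event, context):
    """Process a batch of Kinesis records delivered by the Lambda event source mapping."""
    for record in event["Records"]:
        # Kinesis record payloads arrive base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        partition_key = record["kinesis"]["partitionKey"]
        # Placeholder business logic; an unhandled exception triggers the configured retry behavior.
        print(f"{partition_key}: {payload}")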

Conclusion

In this blog, we defined streaming data and explored the Amazon Kinesis service and its various capabilities. We then reviewed the various options available for producing and consuming real-time streaming data with Amazon Kinesis, including using AWS Lambda for serverless streaming data processing. Please refer to the following resources for further learning on AWS streaming data processing:

Using VPC Sharing for a Cost-Effective Multi-Account Microservice Architecture

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/architecture/using-vpc-sharing-for-a-cost-effective-multi-account-microservice-architecture/

Introduction

Many cloud-native organizations building modern applications have adopted a microservice architecture because of its flexibility, performance, and scalability. Even customers with legacy and monolithic application stacks are embarking on an application modernization journey and opting for this type of architecture. A microservice architecture allows applications to be composed of several loosely coupled, discrete services that are independently deployable, scalable, and maintainable. These applications can comprise a large number of microservices, which often span multiple business units within an organization. These customers typically have a multi-account AWS environment with each AWS account belonging to an individual business unit. Their microservice implementations reside in the Virtual Private Clouds (VPCs) of their respective AWS accounts. You can set up a multi-account AWS environment incorporating best practices using AWS Landing Zone or AWS Control Tower.

This type of multi-account, multi-VPC architecture provides a good boundary and isolation for individual microservices and achieves a highly available, scalable, and secure architecture. However, for microservices that require a high degree of interconnectivity and are within the same trust boundaries, you can use other AWS capabilities to optimize cost and network management complexity.

This blog presents a cost-effective approach that requires less VPC management while still using separate accounts for billing and access control. This approach does not sacrifice scalability, high availability, fault tolerance, and security. To achieve a similar microservice architecture, you can share a VPC across AWS accounts using AWS Resource Access Manager (AWS RAM) and Network Load Balancer (NLB) support in a shared Amazon Virtual Private Cloud (VPC). This allows multiple microservices to coexist in the same VPC, even though they are developed by different business units.

Microservices architecture in a multi-VPC approach

In this architecture, microservices deployed across multiple VPCs use privately exposed endpoints for better security posture instead of going over the internet. This requires the customers to enable inter-VPC communication using the various networking capabilities of AWS as shown below:

In the above reference architecture, we created a VPC in Account A, which is hosting the front end of the application across a fleet of Amazon Elastic Compute Cloud (Amazon EC2) instances using an AWS Auto Scaling group. For simplicity, we’ve illustrated a single public and private subnet for the application front end. In reality, this spans across multiple subnets across multiple Availability Zones (AZ) to support a highly available and fault-tolerant configuration.

To ensure security, the application must communicate privately to microservices mS1 and mS2 deployed in VPC of Account B and Account C respectively. For high availability, these microservices are also implemented using a fleet of Amazon EC2 instances with the Auto Scaling group spanning across multiple subnets/availability zones. For high-performance load balancing, they are fronted by a Network Load Balancer.

While this architecture shows an implementation using Amazon EC2, it can also use containerized services deployed using Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). These microservices may have interdependencies and invoke each other’s APIs to service requests from the application layer. This application-to-microservice and microservice-to-microservice communication can be achieved using the following connectivity options:

When only a few VPC interconnections are required, Amazon VPC peering and AWS PrivateLink may be viable options. For a higher number of VPC interconnections, we recommend AWS Transit Gateway for better manageability of connections and routing through a centralized resource. However, depending on the amount of traffic, this can introduce significant costs to your architecture.

Alternative approach to microservice architecture using Network Load Balancers in a shared VPC

This architecture pattern allows your individual microservice teams to continue to own the AWS resources that host their microservice implementations, but to deploy them in a shared VPC owned by a central account, eliminating the need for inter-VPC network connections. You can share Amazon VPCs to use the implicit routing within a VPC for applications that require a high degree of interconnectivity and are within the same trust boundaries.

This architecture uses AWS RAM, which allows you to share the VPC Subnets from AWS Account A to participating AWS accounts within your AWS organization. When the subnets are shared, participant AWS accounts (Account B and Account C) can see the shared subnets in their own environment. They can then deploy their Amazon EC2 instances in those subnets. This is depicted in the diagram where the visibility of the shared subnets (SS1 and SS2) is extended to the participating accounts (Account B and Account C).

You can also deploy the NLB in these shared subnets. Then, each participant account owns all the AWS resources for their microservice stack, but it’s deployed in the VPC of Account A.

This allows your individual microservice teams to maintain control over load balancer configurations and Auto Scaling policies based on their specific microservices’ needs. At the same time, using AWS RAM, they are able to effectively use the existing VPC environment of Account A.

This architecture presents several benefits over the multi-VPC architecture discussed earlier:

  • You can deploy the entire application, including the individual microservices, into a single shared VPC. This is while still allowing individual microservice teams control over their AWS resources deployed in that VPC.
  • Since the entire architecture now resides in a single VPC, it doesn’t require other networking connectivity features. It can rely on intra-VPC traffic for communication between the application (API) layer and microservices.
  • This leads to reduction in cost of the architecture. While the AWS RAM functionality is free of charge, this also reduces the data transfer and per-connection costs incurred by other options such as VPC peering, AWS PrivateLink, and AWS Transit Gateway.
  • This maintains the isolation across the individual microservices and the application layer.  Participants can’t view, modify, or delete resources that belong to others or the VPC owner.
  • This also leads to effective utilization of your VPC CIDR block resources.
  • Since multiple subnets belonging to different Availability Zones are shared, the application and the individual microservices continue to take advantage of scalability, availability, and fault tolerance.

The following illustration shows how you can configure AWS RAM to set up the VPC subnet resource shares between owner Account A and participating Account B. The example below shows the sharing of private subnet SS1 using this method:


Accounts A and B Resource Share
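A minimal Boto3 sketch of the share pictured above, assuming a hypothetical subnet ARN, account IDs, and region, could look like this:

    import boto3

    ram = boto3.client("ram", region_name="ap-south-1")

    # Hypothetical values for illustration only.
    SHARED_SUBNET_SS1_ARN = "arn:aws:ec2:ap-south-1:111111111111:subnet/subnet-0ss1aaaa"
    ACCOUNT_B_ID = "222222222222"

    # Account A (the VPC owner) shares subnet SS1 with participating Account B.
    share = ram.create_resource_share(
        name="shared-vpc-subnet-ss1",
        resourceArns=[SHARED_SUBNET_SS1_ARN],
        principals=[ACCOUNT_B_ID],
        allowExternalPrincipals=False,  # restrict the share to your AWS organization
    )
    print(share["resourceShare"]["resourceShareArn"])

Setting allowExternalPrincipals to False keeps the share within your AWS organization; with resource sharing enabled for the organization, the subnet becomes visible to Account B without a separate invitation acceptance step.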

Once this subnet is shared, participating Account B can launch the Network Load Balancer for its microservice mS1 in the shared VPC subnet, as shown below:

Account B launches the Network Load Balancer for its microservice mS1 in the shared VPC subnet
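A minimal Boto3 sketch of that step, run with Account B credentials and hypothetical shared subnet IDs, could look like the following (the target group and listener for the mS1 instances are omitted for brevity):

    import boto3

    elbv2 = boto3.client("elbv2", region_name="ap-south-1")  # Account B credentials

    # Hypothetical shared subnet IDs as seen from Account B after the RAM share.
    SHARED_SUBNET_IDS = ["subnet-0ss1aaaa", "subnet-0ss2bbbb"]

    # Account B creates an internal Network Load Balancer for mS1 in the shared subnets.
    nlb = elbv2.create_load_balancer(
        Name="ms1-nlb",
        Type="network",
        Scheme="internal",
        Subnets=SHARED_SUBNET_IDS,
    )
    print(nlb["LoadBalancers"][0]["LoadBalancerArn"])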

While this architecture has many advantages, there are important considerations:

  • This style of architecture is suitable when you are certain that the number of microservices is small enough for them to coexist in a single VPC without depleting the CIDR blocks of the VPC's shared subnets.
  • If the traffic between these microservices is insignificant, the cost benefit of this architecture over other options may not be substantial, because data transfer costs depend on the volume of traffic flowing between them.

Conclusion

The AWS Cloud provides several options for building a microservices architecture. It is important to look at the characteristics of your application to determine which architectural choice to opt for. AWS RAM and the ability to deploy AWS resources (including Network Load Balancers) in a shared VPC help you eliminate inter-VPC traffic and the associated networking costs, without sacrificing high availability, scalability, fault tolerance, or security for your application.

Formula 1: Using Amazon SageMaker to Deliver Real-Time Insights to Fans

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/formula-1-using-amazon-sagemaker-to-deliver-real-time-insights-to-fans-live/

The Formula One Group (F1) is responsible for the promotion of the FIA Formula One World Championship, a series of auto racing events in 21 countries where professional drivers race single-seat cars on custom tracks or through city courses in pursuit of the World Championship title.

Formula 1 works with AWS to enhance its race strategies, data tracking systems, and digital broadcasts through a wide variety of AWS services—including Amazon SageMaker, AWS Lambda, and AWS analytics services—to deliver new race metrics that change the way fans and teams experience racing.

In this special live segment of This is My Architecture, you’ll get a look at what’s under the hood of Formula 1’s F1 Insights. Hear about the machine learning algorithms the company trains on Amazon SageMaker and how inferences are made during races to deliver insights to fans.

For more content like this, subscribe to our YouTube channels This is My Architecture, This is My Code, and This is My Model, or visit the This is My Architecture AWS website, which has search functionality and the ability to filter by industry, language, and service.

Delve into the Formula 1 case study to learn more about how AWS fuels analytics through machine learning.

AWS Architecture Monthly Magazine: Data Lakes

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/aws-architecture-monthly-magazine-data-lakes/

A data lake is the fastest way to get answers from all your data to all your users. It’s a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data, and run different types of analytics—from dashboards and visualizations to big data processing, real-time analytics, and machine learning—to guide better decisions.

In This Issue

In this month’s Architecture Monthly, we speak to AWS Analytics Tech Leader, Taz Sayed, about general architecture trends in data lakes, the questions customers need to ask themselves before considering a data lake, and we get his outlook on the role the cloud will play in future development efforts.

We also introduce you to two companies that are utilizing data lakes for deep analytics, point you to an AWS managed solution, provide some real-world videos, and more.

  • Ask an Expert: Taz Sayed, Tech Leader, AWS Analytics
  • Blog: Kayo Sports builds real-time view of the customer on AWS
  • Case Study: Yulu Uses a Data Lake on AWS to Pedal a Change
  • Solution: Data Lake on AWS
  • Managed Solution: AWS Lake Formation
  • Whitepaper: Building Big Data Storage Solutions (Data Lakes) for Maximum Flexibility

How to Access the Magazine

We hope you're enjoying Architecture Monthly, and we'd like to hear from you. Leave us a star rating and comment on the Amazon Kindle Newsstand page, or contact us anytime at [email protected].

Introduction to Messaging for Modern Cloud Architecture

Post Syndicated from Sam Dengler original https://aws.amazon.com/blogs/architecture/introduction-to-messaging-for-modern-cloud-architecture/

We hope you’ve enjoyed reading our posts on best practices for your serverless applications. The posts in the following series will focus on best practices when introducing messaging patterns into your applications. Let’s review some core messaging concepts and see how they can be used to address challenges when designing modern cloud architectures.

Introduction

Applications can communicate information with each other using messages, a mechanism for packaging a data payload and associated metadata. The application that sends a message is called the producer and the application that receives the message is called the consumer. Producers and consumers exchange messages using a variety of transportation channels, for example point-to-point requests, message queues, subscription topics, or event buses. These transportation channels have different characteristics that make them useful for implementing different message communication patterns. The dependencies that emerge when producers and consumers exchange messages are called coupling.

Synchronous Communication


Message communication is called synchronous when the producer sends a message to the consumer and waits for a response before the producer continues its processing logic. An example of synchronous communication over a point-to-point channel is when an HTTP client makes a request to an HTTP service, waits for the service to process the request, and then applies logic to the HTTP response to determine how to proceed.
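A minimal Python sketch of this pattern, assuming a hypothetical service endpoint, shows the blocking nature of the call: the client cannot proceed until the service responds, the request fails, or the timeout expires.

    import urllib.request
    import urllib.error

    # Hypothetical endpoint for illustration only.
    URL = "http://orders.internal.example.com/orders/42"

    try:
        # The producer (HTTP client) blocks here until the consumer (HTTP service)
        # returns a response or the timeout expires.
        with urllib.request.urlopen(URL, timeout=5) as response:
            body = response.read()
        print("order details:", body)
    except urllib.error.URLError as err:
        # A failure or slow response in the downstream service is felt directly
        # by the caller; this is the tight coupling discussed below.
        print("request failed:", err)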

Synchronous communication patterns are more straightforward to implement; however, they create tight coupling between producers and consumers. Tight coupling can cause problems because traffic spikes and failures propagate directly throughout the application. For example, in a three-tier architecture, when the application experiences a spike in client traffic, the web tier directly translates the traffic spike as pressure on downstream resources (the logic and data tiers), which may not scale to meet the sudden demand. Likewise, a downstream resource failure in the logic or data tier directly impacts the web tier's ability to respond to client requests. Applications can mimic a synchronous experience, for example a status spinner, using asynchronous communication with a polling or push notification strategy.

Asynchronous Communication


Message communication is called asynchronous when the producer sends a message to the consumer and proceeds without waiting for the response. An example of asynchronous communication over a message queue channel is when a client publishes a message to a queue, and after the queue acknowledges receipt of the message, the publisher proceeds without waiting for the consumer to process the message.

Asynchronous communication patterns are implemented using transportation channels such as queues, topics, and event buses to create loose coupling between producers and consumers. Loose coupling increases an architecture’s resiliency to failure and ability to handle traffic spikes because it creates an indirection between producer and consumer communication, enabling them to operate independently of each other. Using the three-tier architecture example, a message queue can be introduced between the web, logic, and data tiers to enable each to scale independently of each other. When the application experiences a spike in client traffic, the web tier translates the traffic spike as more messages to the queue for processing, however the logic tier may continue to process messages off the queue without being directly impacted.
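A minimal Boto3 sketch of this pattern with Amazon SQS, using a hypothetical queue URL and message body, shows the producer returning as soon as the queue acknowledges the message, while a separate consumer polls the queue and processes messages at its own pace:

    import boto3

    sqs = boto3.client("sqs", region_name="us-east-1")

    # Hypothetical queue URL for illustration only.
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/111111111111/orders-queue"

    # Producer: send the message and proceed as soon as SQS acknowledges receipt.
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody='{"orderId": 42}')

    # Consumer (typically a separate process in the logic tier): long-poll the
    # queue, process each message, then delete it once it has been handled.
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling
    )
    for message in response.get("Messages", []):
        print("processing:", message["Body"])  # placeholder for real processing logic
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])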

Considerations and Next Steps

Although asynchronous communication patterns can benefit modern cloud architectures, there are tradeoffs to consider. Asynchronous messaging adds latency to end-to-end processing time due to the addition of middleware. Producers and consumers take a dependency on the middleware stack, which must also scale to meet demand and be resilient to failure. Care must be taken to appropriately configure producers, consumers, and middleware to handle errors so that messages are not lost, more monitoring is required to ensure proper operations, and multiple logs must be correlated to troubleshoot and diagnose problems.

Amazon MQ, Amazon Kinesis, Amazon Simple Queue Service (SQS), Amazon Simple Notification Service (SNS), and Amazon EventBridge are highly available, large-scale, failure-resistant managed services that can be used to implement asynchronous messaging patterns. You can explore these services on the AWS Messaging page and see their integration into serverless architectures in the free new digital course, Architecting Serverless Solutions. You can also visit the AWS Event-Driven Architecture page to learn how to apply messaging patterns to build event-driven solutions. The upcoming posts in this series will explore these AWS services to help ensure messaging patterns are implemented using best practices when applied to modern cloud architectures.