Tag Archives: Amazon RDS

How ZS created a multi-tenant self-service data orchestration platform using Amazon MWAA

Post Syndicated from Manish Mehra original https://aws.amazon.com/blogs/big-data/how-zs-created-a-multi-tenant-self-service-data-orchestration-platform-using-amazon-mwaa/

This is post is co-authored by Manish Mehra, Anirudh Vohra, Sidrah Sayyad, and Abhishek I S (from ZS), and Parnab Basak (from AWS). The team at ZS collaborated closely with AWS to build a modern, cloud-native data orchestration platform.

ZS is a management consulting and technology firm focused on transforming global healthcare and beyond. We leverage our leading-edge analytics, plus the power of data, science, and products, to help our clients make more intelligent decisions, deliver innovative solutions, and improve outcomes for all. Founded in 1983, ZS has more than 12,000 employees in 35 offices worldwide.

ZAIDYNTM by ZS is an intelligent, cloud-native platform that helps life sciences organizations shape the future. Its analytics, algorithms, and workflows empower people, transform processes, and unlock real value. Designed to learn and grow with our clients, the platform is modular, future-ready, and fueled by global connectivity. And as more people engage, share, and build, our platform gets smarter—helping organizations fuel discovery, connect with customers, deliver treatments, and improve lives. ZAIDYN is helping companies of all sizes gain fluency in the full spectrum of life sciences so they can move faster, together through its Data & Analytics, Customer Engagement, Field Performance and Clinical Development offerings.

ZAIDYN Data & Analytics apps provide business users with self-service tools to innovate and scale insights delivery across the enterprise. ZAIDYN Data Hub (a part of the Data & Analytics product category) provides self-service options for guided workflows, data connectors, quality checks, and more. The elastic data processing offered by AWS helps prioritize processing speeds.

Data Hub customers wanted a one-stop solution for managing their data pipelines. A solution that does not require end users to gain additional knowledge about the nitty-gritties of the tool, one which is easy for users to get onboarded on, thereby increasing the demand for data orchestration capabilities within the application. A few of the sophisticated asks like start and stop of workflows, maintaining history of past runs, and providing real-time status updates for individual tasks of the workflow became increasingly important for end clients. We needed a mature orchestration tool, which led us to Amazon Managed Workflows for Apache Airflow (Amazon MWAA).

Amazon MWAA is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale.

In this post, we share how ZS created a multi-tenant self-service data orchestration platform using Amazon MWAA.

Why we chose Amazon MWAA

Choosing the right orchestration tool was critical for us because we had to ensure that the service was operationally efficient and cost-effective, provided high availability, had extensive features to support our business cases, and yet was easy to adapt for our end-users (data engineers). We evaluated and experimented among Amazon MWAA, Azkaban on Amazon EMR, and AWS Step Functions before project initiation.

The following benefits of Amazon MWAA convinced us to adopt it:

  • AWS managed service – With Amazon MWAA, we don’t have to manage the underlying infrastructure for scalability and availability to maintain quality of service. The built-in autoscaling mechanism of Amazon MWAA automatically increases the number of Apache Airflow workers in response to running and queued tasks, and disposes of extra workers when there are no more tasks queued or running. The default environment is already built for high availability with multiple Airflow schedulers and workers, and the metadata database distributed across multiple Availability Zones. We also evaluated hosting open-source Airflow on our ZS infrastructure. However, due to infrastructure maintenance overhead and the high investment needed to make and maintain it at production grade, we decided to drop that option.
  • Security – With Amazon MWAA, our data is secure by default because workloads run in our own isolated and secure cloud environment using Amazon Virtual Private Cloud (Amazon VPC), and data is automatically encrypted using AWS Key Management Service (AWS KMS). We can control role-based authentication and authorization for Apache Airflow’s user interface via AWS Identity and Access Management (IAM), providing users single sign-on (SSO) access for scheduling and viewing workflow runs.
  • Compatibility and active community support – Amazon MWAA hosts the same open-source Apache Airflow version without any forks. The open-source community for Apache Airflow is very active with multiple commits, files changes, issue resolutions, and community advice.
  • Language and connector support – The flow definitions for Apache Airflow are based on Python, which is easy for our engineers to adapt. An extensive list of features and connectors is available out of the box in Amazon MWAA, including connectors for Hive, Amazon EMR, Livy, and Kubernetes. We needed to run all our Data Hub jobs (ingestion, applying custom rules and quality checks, or exporting data to third-party systems) on Amazon EMR. The necessary Amazon EMR operators are already available as a part of the Amazon-provided package for Airflow (apache-airflow-providers-amazon), which we could supplement rather than construct one from the ground up.
  • Cost – Cost was the most important aspect for us when adopting Amazon MWAA. Amazon MWAA is useful for those who are running thousands of tasks in the prod environment, which is why we decided to the make the Amazon MWAA environment multi-tenant such that the cost can be shared among clients. With our large Amazon MWAA environment, we only pay for what we use, with no minimum fees or upfront commitments. We estimated paying less than $1,000 per month, combined for our environment usage and additional worker instance pricing, yet achieve the scale of being able to run 200 concurrent tasks running 3 hours per day over 10 concurrent workers. This meant reduced operational costs and engineering overhead while meeting the on-demand monitoring needs of end-to-end data pipeline orchestration.

Solution overview

The following diagram illustrates the solution architecture.

We have a common control tier account where we host our software as a service application (Data Hub) on Amazon Elastic Compute Cloud (Amazon EC2) instances. Each client has their own version of this application deployed on this shared infrastructure. Amazon MWAA is also hosted in the same common control tier account. The control tier account has connectivity with tenant-specific AWS accounts. This is to maintain strong physical isolation of client data by segregating the AWS accounts for each client. Each client-specific account hosts EMR clusters where data processing takes place. When a processing job is complete, data may reside on Amazon EMR (an HDFS cluster) or on Amazon Simple Storage Service (Amazon S3), an EMRFS cluster, depending on configuration. The DAG files generated by our Data Hub application contain metadata of the processes, and don’t contain any sensitive client information. When a job is submitted from Data Hub, the API request contains tenant-specific information needed to pull up the corresponding AWS connection details, which are stored as Airflow connection objects. These connection details are consumed by our custom implementation of Airflow EMR step operators (add and watch) to perform operations on the tenant EMR clusters.

Because the data orchestration capability is an application offering, the client teams create their processes on the Data Hub UI and don’t have access to the underlying Amazon MWAA environment.

The following screenshot shows how an end-user can configure Data Hub process on the application UI.

How Data Hub processes map to Amazon MWAA DAGs

Data Hub processes map to Amazon MWAA DAGs as follows:

  • Each process in Data Hub corresponds to a DAG in Amazon MWAA, and each component is a task (denoted by Sn​) that is submitted as a step on the client EMR clusters.
  • The application generates the DAG file dynamically and updates it on the S3 bucket linked to the Amazon MWAA environment.
  • Parsing dedicated structures representing a given process and submitting or tracking the Amazon EMR steps is abstracted from the end-user. Dynamic DAG generation is responsible for using the latest version of the underlying components and helps in managing the DAG schedule.
  • Some Airflow tasks are created as a part of the DAG, which take care of interacting with the application APIs to ensure that the required metadata is captured in a separate Amazon Relational Database Service (Amazon RDS) database instance.

A user can trigger a given process to run from the Data Hub UI or can schedule it to run at a specified time. Because a single Amazon MWAA environment is responsible for the data orchestration needs of multiple clients, our DAG decode logic ensures that the correct EMR cluster ID and Airflow connection ID are picked up at runtime. The configs responsible for storing these details are placed and updated on the S3 buckets via an automated deployment pipeline. A dedicated connection ID is created per client in Airflow, which is then utilized in our custom implementation of EmrAddStepsOperator. The connection ID captures the Region and role ARN to be assumed to interact with the EMR cluster in the client account. These cross-account roles have access to limited resources in each client account, following the principle of least privilege.

Generating a DAG from a process defined on Data Hub UI

Our front-end application is built using Angular (version 11) and uses a third-party library that facilitates drag-and-drop of components from the left pane on a canvas. Components are stitched together with connections defining dependencies to form a process. This process is translated by our custom engine to generate a dynamic Airflow DAG. A sample DAG generated from the preceding example process defined on the UI looks like the following figure.

We wrap the DAG by PEntry and PExit Python operators, and for each of the components on the Data Hub UI, we create two tasks: Cn and Wn.

The relevant terms for this solution are as follows:

  • PEntry​ – The Python operator used to insert an entry in the RDS database that the process run has started via API call.​
  • Cn– The ZS custom implementation of EMRAddStepsOperator used to submit a job (Data Hub component) on a running EMR cluster.​ This is followed by an API call to insert an entry in the database that the component job has started.​
  • Wn– The custom implementation of Airflow Watcher (EmrStepSensor), which checks the status of the step from our metadata database.​
  • PExit​ – The Python operator used to update an entry in the RDS database (more of a finally block) via API call.​

Lessons learned during the implementation

When implementing this solution, we learned the following:

  • We faced challenges in being able to consistently predict when a DAG will be parsed and made available in the Airflow UI in Amazon MWAA after the DAG file is synced to the linked S3 bucket. Depending on how complex the DAG is, it could happen within seconds or several minutes. Due to the lack of availability of an API or AWS Command Line Interface (AWS CLI) command to ascertain this, we put in some blanket restrictions (delay) on user operations from our UI to overcome this limitation.
  • Within Airflow, data pipelines are represented by DAGs, and these DAGs change over time as business needs evolve. A key challenge faced by Airflow users is looking at how a DAG was run in the past, and when it was replaced by a newer version of the DAG. This is because within Airflow (as of this writing), only the current (latest) version of the DAG is represented within the user interface, without any reference to prior versions of the DAG. To overcome this limitation, we implemented a backend way of generating a DAG from the available metadata, and use it to version control over runs.
  • Airflow CLI commands when invoked in DAGs always return an HTTP 200 response. You can’t solely rely on the HTTP response code to ascertain the status of commands. We applied additional parsing logic (particularly to analyze the errors on failure) to determine the true status of commands.
  • Airflow doesn’t have a command to gracefully stop a DAG that is currently running. You can stop a DAG (unmark as running) and clear the task’s state or even delete it in the UI. The actual running tasks in the executor won’t stop, but might be stopped if the executor realizes that it’s not in the database anymore.

Conclusion

Amazon MWAA sets up Apache Airflow for you using the same Apache Airflow user interface and open-source code. With Amazon MWAA, you can use Airflow and Python to create workflows without having to manage the underlying infrastructure for scalability, availability, and security. Amazon MWAA automatically scales its workflow run capacity to meet your needs, and is integrated with AWS security services to help provide you with fast and secure access to your data. In this post, we discussed how you can build a bridge tenancy isolation model with a central Amazon MWAA orchestrating task against independent infrastructure stacks in dedicated accounts deployed for each of your tenants. Through a custom UI, you can enable self-service workflow runs via Airflow dynamic DAGs using the power and flexibility of Python. This enables you to achieve economies of scale and operational efficiency while meeting your regulatory, security, and cost considerations.


About the Authors

Manish Mehra is a Software Architect, working with the SD group in ZS. He has more than 11 years of experience working in banking, gaming, and life science domains. He is currently looking into the architecture of the Data & Analytics product category of the ZAIDYN Platform. He has expertise in full-stack application development and building robust, scalable, enterprise-grade big data applications.

Anirudh Vohra is a Director of Cloud Architecture, working within the Cloud Center of Excellence space at ZS. He is passionate about being a developer advocate for internal engineering teams, also designing and building cloud platforms and abstractions to empower developers and troubleshoot complex systems.

Abhishek I S is Associate Cloud Architect at ZS Associates working within the Cloud Centre of Excellence space. He has diverse experience ranging from application development to cloud engineering. Currently, he is primarily focusing on architecture design and automation for the cloud-native solutions of various ZS products.

Sidrah Sayyad is an Associate Software Architect at ZS working within the Software Development (SD) group. She has 9 years of experience, which includes working on identity management, infrastructure management, and ETL applications. She is passionate about coding and helps architect and build applications to achieve business outcomes.

Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab was closely involved with the engagement with ZS, providing architectural guidance as well as helping the team overcome technical challenges during the implementation.

AWS Week in Review – August 29, 2022

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-week-in-review-august-29-2022/

I’ve just returned from data and machine learning (ML) conferences in Los Angeles and San Francisco, California. It’s been great to chat with customers and developers about the latest technology trends and use cases. This past week has also been packed with launches at AWS.

Last Week’s Launches
Here are some launches that got my attention during the previous week:

Amazon QuickSight announces fine-grained visual embedding. You can now embed individual visuals from QuickSight dashboards in applications and portals to provide key insights to users where they’re needed most. Check out Donnie’s blog post to learn more, and tune into this week’s The Official AWS Podcast episode.

Sample Web App with a Visual

Sample Web App with a Visual

Amazon SageMaker Automatic Model Tuning is now available in the Europe (Milan), Africa (Cape Town), Asia Pacific (Osaka), and Asia Pacific (Jakarta) Regions. In addition, SageMaker Automatic Model Tuning now reuses SageMaker Training instances to reduce start-up overheads by 20x. In scenarios where you have a large number of hyperparameter evaluations, the reuse of training instances can cumulatively save 2 hours for every 50 sequential evaluations.

Amazon RDS now supports setting up connectivity between your RDS database and EC2 compute instance in one click. Amazon RDS automatically sets up your VPC and related network settings during database creation to enable a secure connection between the EC2 instance and the RDS database.

In addition, Amazon RDS for Oracle now supports managed Oracle Data Guard Switchover and Automated Backups for replicas. With the Oracle Data Guard Switchover feature, you can reverse the roles between the primary database and one of its standby databases (replicas) with no data loss and a brief outage. You can also now create Automated Backups and manual DB snapshots of an RDS for Oracle replica, which reduces the time spent taking backups following a role transition.

Amazon Forecast now supports what-if analyses. Amazon Forecast is a fully managed service that uses ML algorithms to deliver highly accurate time series forecasts.  You can now use what-if analyses to quantify the potential impact of business scenarios on your demand forecasts.

AWS Asia Pacific (Jakarta) Region now supports additional AWS services and EC2 instance types – Amazon SageMaker, AWS Application Migration Service, AWS Glue, Red Hat OpenShift Service on AWS (ROSA), and Amazon EC2 X2idn and X2iedn instances are now available in the Asia Pacific (Jakarta) Region.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional news, blog posts, and fun code competitions you may find interesting:

Scaling AI and Machine Learning Workloads with Ray on AWS – This past week, I attended Ray Summit in San Francisco, California, and had great conversations with the community. Check out this blog post to learn more about AWS contributions to the scalability and operational efficiency of Ray on AWS.

Ray on AWS

New AWS Heroes – It’s great to see both new and familiar faces joining the AWS Heroes program, a worldwide initiative that acknowledges individuals who have truly gone above and beyond to share knowledge in technical communities. Get to know them in the blog post!

DFL Bundesliga Data ShootoutDFL Deutsche Fußball Liga launched a code competition, powered by AWS: the Bundesliga Data Shootout. The task: Develop a computer vision model to classify events on the pitch. Join the competition as an individual or in a team and win prizes.

Become an AWS GameDay World Champion – AWS GameDay is an interactive, team-based learning experience designed to put your AWS skills to the test by solving real-world problems in a gamified, risk-free environment. Developers of all skill levels can get in on the action, to compete for worldwide glory, as well as a chance to claim the top prize: an all-expenses-paid trip to AWS re:Invent Las Vegas 2022!

Learn more about the AWS Impact Accelerator for Black Founders from one of the inaugural members of the program in this blog post. The AWS Impact Accelerator is a series of programs designed to help high-potential, pre-seed start-ups led by underrepresented founders succeed.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS SummitAWS Global Summits – AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS.

Registration is open for the following in-person AWS Summits that might be close to you in August and September: Canberra (August 31), Ottawa (September 8), New Delhi (September 9), and Mexico City (September 21–22), Bogotá (October 4), and Singapore (October 6).

AWS Community DayAWS Community DaysAWS Community Day events are community-led conferences that deliver a peer-to-peer learning experience, providing developers with a venue for them to acquire AWS knowledge in their preferred way: from one another.

In September, the AWS community will host events in the Bay Area, California (September 9) and in Arlington, Virginia (September 30). In October, you can join Community Days in Amersfoort, Netherlands (October 3), in Warsaw, Poland (October 14), and in Dresden, Germany (October 19).

That’s all for this week. Check back next Monday for another Week in Review! And maybe I’ll see you at the AWS Community Day here in the Bay Area!

Antje

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Week in Review – August 8, 2022

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/aws-week-in-review-august-8-2022/

As an ex-.NET developer, and now Developer Advocate for .NET at AWS, I’m excited to bring you this week’s Week in Review post, for reasons that will quickly become apparent! There are several updates, customer stories, and events I want to bring to your attention, so let’s dive straight in!

Last Week’s launches
.NET developers, here are two new updates to be aware of—and be sure to check out the events section below for another big announcement:

Tiered pricing for AWS Lambda will interest customers running large workloads on Lambda. The tiers, based on compute duration (measured in GB-seconds), help you save on monthly costs—automatically. Find out more about the new tiers, and see some worked examples showing just how they can help reduce costs, in this AWS Compute Blog post by Heeki Park, a Principal Solutions Architect for Serverless.

Amazon Relational Database Service (RDS) released updates for several popular database engines:

  • RDS for Oracle now supports the April 2022 patch.
  • RDS for PostgreSQL now supports new minor versions. Besides the version upgrades, there are also updates for the PostgreSQL extensions pglogical, pg_hint_plan, and hll.
  • RDS for MySQL can now enforce SSL/TLS for client connections to your databases to help enhance transport layer security. You can enforce SSL/TLS by simply enabling the require_secure_transport parameter (disabled by default) via the Amazon RDS Management console, the AWS Command Line Interface (AWS CLI), AWS Tools for PowerShell, or using the API. When you enable this parameter, clients will only be able to connect if an encrypted connection can be established.

Amazon Elastic Compute Cloud (Amazon EC2) expanded availability of the latest generation storage-optimized Is4gen and Im4gn instances to the Asia Pacific (Sydney), Canada (Central), Europe (Frankfurt), and Europe (London) Regions. Built on the AWS Nitro System and powered by AWS Graviton2 processors, these instance types feature up to 30 TB of storage using the new custom-designed AWS Nitro System SSDs. They’re ideal for maximizing the storage performance of I/O intensive workloads that continuously read and write from the SSDs in a sustained manner, for example SQL/NoSQL databases, search engines, distributed file systems, and data analytics.

Lastly, there’s a new URL from AWS Support API to use when you need to access the AWS Support Center console. I recommend bookmarking the new URL, https://support.console.aws.amazon.com/, which the team built using the latest architectural standards for high availability and Region redundancy to ensure you’re always able to contact AWS Support via the console.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here’s some other news items and customer stories that you may find interesting:

AWS Open Source News and Updates – Catch up on all the latest open-source projects, tools, and demos from the AWS community in installment #123 of the weekly open source newsletter.

In one recent AWS on Air livestream segment from AWS re:MARS, discussing the increasing scale of machine learning (ML) models, our guests mentioned billion-parameter ML models which quite intrigued me. As an ex-developer, my mental model of parameters is a handful of values, if that, supplied to methods or functions—not billions. Of course, I’ve since learned they’re not the same thing! As I continue my own ML learning journey I was particularly interested in reading this Amazon Science blog on 20B-parameter Alexa Teacher Models (AlexaTM). These large-scale multilingual language models can learn new concepts and transfer knowledge from one language or task to another with minimal human input, given only a few examples of a task in a new language.

When developing games intended to run fully in the cloud, what benefits might there be in going fully cloud-native and moving the entire process into the cloud? Find out in this customer story from Return Entertainment, who did just that to build a cloud-native gaming infrastructure in a few months, reducing time and cost with AWS services.

Upcoming events
Check your calendar and sign up for these online and in-person AWS events:

AWS Storage Day: On August 10, tune into this virtual event on twitch.tv/aws, 9:00 AM–4.30 PM PT, where we’ll be diving into building data resiliency into your organization, and how to put data to work to gain insights and realize its potential, while also optimizing your storage costs. Register for the event here.

AWS SummitAWS Global Summits: These free events bring the cloud computing community together to connect, collaborate, and learn about AWS. Registration is open for the following AWS Summits in August:

AWS .NET Enterprise Developer Days 2022 – North America: Registration for this free, 2-day, in-person event and follow-up 2-day virtual event opened this past week. The in-person event runs September 7–8, at the Palmer Events Center in Austin, Texas. The virtual event runs September 13–14. AWS .NET Enterprise Developer Days (.NET EDD) runs as a mini-conference within the DeveloperWeek Cloud conference (also in-person and virtual). Anyone registering for .NET EDD is eligible for a free pass to DeveloperWeek Cloud, and vice versa! I’m super excited to be helping organize this third .NET event from AWS, our first that has an in-person version. If you’re a .NET developer working with AWS, I encourage you to check it out!

That’s all for this week. Be sure to check back next Monday for another Week in Review roundup!

— Steve
This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Graviton Fast Start – A New Program to Help Move Your Workloads to AWS Graviton

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/graviton-fast-start-a-new-program-to-help-move-your-workloads-to-aws-graviton/

With the Graviton Challenge last year, we helped customers migrate to Graviton-based EC2 instances and get up to 40 percent price performance benefit in as little as 4 days. Tens of thousands of customers, including 48 of the top 50 Amazon Elastic Compute Cloud (Amazon EC2) customers, use AWS Graviton processors for their workloads. In addition to EC2, many AWS managed services can run their workloads on Graviton. For most customers, adoption is easy, requiring minimal code changes. However, the effort and time required to move workloads to Graviton depends on a few factors including your software development environment and the technology stack on which your application is built.

This year, we want to take it a step further and make it even easier for customers to adopt Graviton not only through EC2, but also through managed services. Today, we are launching AWS Graviton Fast Start, a new program that makes it even easier to move your workloads to AWS Graviton by providing step-by-step directions for EC2 and other managed services that support the Graviton platform:

  • Amazon Elastic Compute Cloud (Amazon EC2) – EC2 provides the most flexible environment for a migration and can support many kinds of workloads, such as web apps, custom databases, or analytics. You have full control over the interpreted or compiled code running in the EC2 instance. You can also use many open-source and commercial software products that support the Arm64 architecture.
  • AWS Lambda – Migrating your serverless functions can be really easy, especially if you use an interpreted runtime such as Node.js or Python. Most of the time, you only have to check the compatibility of your software dependencies. I have shown a few examples in this blog post.
  • AWS Fargate – Fargate works best if your applications are already running in containers or if you are planning to containerize them. By using multi-architecture container images or images that have Arm64 in their image manifest, you get the serverless benefits of Fargate and the price-performance advantages of Graviton.
  • Amazon Aurora – Relational databases are at the core of many applications. If you need a database compatible with PostgreSQL or MySQL, you can use Amazon Aurora to have a highly performant and globally available database powered by Graviton.
  • Amazon Relational Database Service (RDS) – Similarly to Aurora, Amazon RDS engines such as PostgreSQL, MySQL, and MariaDB can provide a fully managed relational database service using Graviton-based instances.
  • Amazon ElastiCache – When your workload requires ultra-low latency and high throughput, you can speed up your applications with ElastiCache and have a fully managed in-memory cache running on Graviton and compatible with Redis or Memcached.
  • Amazon EMR – With Amazon EMR, you can run large-scale distributed data processing jobs, interactive SQL queries, and machine learning applications on Graviton using open-source analytics frameworks such as Apache SparkApache Hive, and Presto.

Here’s some feedback we got from customers running their workloads on Graviton:

  • Formula 1 racing told us that Graviton2-based C6gn instances provided the best price performance benefits for some of their computational fluid dynamics (CFD) workloads. More recently, they found that Graviton3 C7g instances are 40 percent faster for the same simulations and expect Graviton3-based instances to become the optimal choice to run all of their CFD workloads.
  • Honeycomb has 100 percent of their production workloads running on Graviton using EC2 and Lambda. They have tested the high-throughput telemetry ingestion workload they use for their observability platform against early preview instances of Graviton3 and have seen a 35 percent performance increase for their workload over Graviton2. They were able to run 30 percent fewer instances of C7g than C6g serving the same workload and with 30 percent reduced latency. With these instances in production, they expect over 50 percent price performance improvement over x86 instances.
  • Twitter is working on a multi-year project to leverage Graviton-based EC2 instances to deliver Twitter timelines. As part of their ongoing effort to drive further efficiencies, they tested the new Graviton3-based C7g instances. Across a number of benchmarks representative of their workloads, they found Graviton3-based C7g instances deliver 20-80 percent higher performance compared to Graviton2-based C6g instances, while also reducing tail latencies by as much as 35 percent. They are excited to utilize Graviton3-based instances in the future to realize significant price performance benefits.

With all these options, getting the benefits of running all or part of your workload on AWS Graviton can be easier than you expect. To help you get started, there’s also a free trial on the Graviton-based T4g instances for up to 750 hours per month through December 31st, 2022.

Visit AWS Graviton Fast Start to get step-by-step directions on how to move your workloads to AWS Graviton.

Danilo

How Munich Re Automation Solutions Ltd built a digital insurance platform on AWS

Post Syndicated from Sid Singh original https://aws.amazon.com/blogs/architecture/how-munich-re-automation-solutions-ltd-built-a-digital-insurance-platform-on-aws/

Underwriting for life insurance can be quite manual and often time-intensive with lots of re-keying by advisers before underwriting decisions can be made and policies finally issued. In the digital age, people purchasing life insurance want self-service interactions with their prospective insurer. People want speed of transaction with time to cover reduced from days to minutes. While this has been achieved in the general insurance space with online car and home insurance journeys, this is not always the case in the life insurance space. This is where Munich Re Automation Solutions Ltd (MRAS) offers its customers, a competitive edge to shrink the quote-to-fulfilment process using their ALLFINANZ solution.

ALLFINANZ is a cloud-based life insurance and analytics solution to underwrite new life insurance business. It is designed to transform the end consumer’s journey, delivering everything they need to become a policyholder. The core digital services offered to all ALLFINANZ customers include Rulebook Hub, Risk Assessment Interview delivery, Decision Engine, deep analytics (including predictive modeling capabilities), and technical integration services—for example, API integration and SSO integration.

Current state architecture

The ALLFINANZ application began as a traditional three-tier architecture deployed within a datacenter. As MRAS migrated their workload to the AWS cloud, they looked at their regulatory requirements and the technology stack, and decided on the silo model of the multi-tenant SaaS system. Each tenant is provided a dedicated Amazon Virtual Private Cloud (VPC) that holds network and application components, fully isolated from other primary insurers.

As an entry point into the ALLFINANZ environment, MRAS uses Amazon Route 53 to route incoming traffic to the appropriate Amazon VPC. The routing relies on a model where subdomains are assigned to each tenant, for example the subdomain allfinanz.tenant1.munichre.cloud is the subdomain for tenant 1. The diagram below shows the ALLFINANZ architecture. Note: not all links between components are shown here for simplicity.

Current high-level solution architecture for the ALLFINANZ solution

Figure 1. Current high-level solution architecture for the ALLFINANZ solution

  1. The solution uses Route 53 as the DNS service, which provides two entry points to the SaaS solution for MRAS customers:
    • The URL allfinanz.<tenant-id>.munichre.cloud allows user access to the ALLFINANZ Interview Screen (AIS). The AIS can exist as a standalone application, or can be integrated with a customer’s wider digital point-of -sale process.
    • The URL api.allfinanz.<tenant-id>.munichre.cloud is used for accessing the application’s Web services and REST APIs.
  2. Traffic from both entry points flows through the load balancers. While HTTP/S traffic from the application user access entry point flows through an Application Load Balancer (ALB), TCP traffic from the REST API clients flows through a Network Load Balancer (NLB). Transport Layer Security (TLS) termination for user traffic happens at the ALB using certificates provided by the AWS Certificate Manager.  Secure communication over the public network is enforced through TLS validation of the server’s identity.
  3. Unlike application user access traffic, REST API clients use mutual TLS authentication to authenticate a customer’s server. Since NLB doesn’t support mutual TLS, MRAS opted for a solution to pass this traffic to a backend NGINX server for the TLS termination. Mutual TLS is enforced by using self-signed client and server certificates issued by a certificate authority that both the client and the server trust.
  4. Authenticated traffic from ALB and NGINX servers is routed to EC2 instances hosting the application logic. These EC2 instances are hosted in an auto-scaling group spanning two Availability Zones (AZs) to provide high availability and elasticity, therefore, allowing the application to scale to meet fluctuating demand.
  5. Application transactions are persisted in the backend Amazon Relational Database Service MySQL instances. This database layer is configured across multi-AZs, providing high availability and automatic failover.
  6. The application requires the capability to integrate evidence from data sources external to the ALLFINANZ service. This message sharing is enabled through the Amazon MQ managed message broker service for Apache Active MQ.
  7. Amazon CloudWatch is used for end-to-end platform monitoring through logs collection and application and infrastructure metrics and alerts to support ongoing visibility of the health of the application.
  8. Software deployment and associated infrastructure provisioning is automated through infrastructure as code using a combination of Git, Amazon CodeCommit, Ansible, and Terraform.
  9. Amazon GuardDuty continuously monitors the application for malicious activity and delivers detailed security findings for visibility and remediation. GuardDuty also allows MRAS to provide evidence of the application’s strong security posture to meet audit and regulatory requirements.

High availability, resiliency, and security

MRAS deploys their solution across multiple AWS AZs to meet high-availability requirements and ensure operational resiliency. If one AZ has an ongoing event, the solution will remain operational, as there are instances receiving production traffic in another AZ. As described above, this is achieved using ALBs and NLBs to distribute requests to the application subnets across AZs.

The ALLFINANZ solution uses private subnets to segregate core application components and the database storage platform. Security groups provide networking security measures at the elastic network interface level. MRAS restrict access from incoming connection requests to ranges of IP addresses by attaching security groups to the ALBs. Amazon Inspector monitors workloads for software vulnerabilities and unintended network exposure. AWS WAF is integrated with the ALB to protect from SQL injection or cross-site scripting attacks on the application.

Optimizing the existing workload

One of the key benefits of this architecture is that now MRAS can standardize the infrastructure configuration and ensure consistent versioning of the workload across tenants. This makes onboarding new tenants as simple as provisioning another VPC with the same infrastructure footprint.

MRAS are continuing to optimize their architecture iteratively, examining components to modernize to cloud-native components and evolving towards the pool model of multi-tenant SaaS architecture wherever possible. For example, MRAS centralized their per-tenant NAT gateway deployment to a centralized outbound Internet routing design using AWS Transit Gateway, saving approximately 30% on their overall NAT gateway spend.

Conclusion

The AWS global infrastructure has allowed MRAS to serve more than 40 customers in five AWS regions around the world. This solution improves customers’ experience and workload maintainability by standardizing and automating the infrastructure and workload configuration within a SaaS model, compared with multiple versions for the on-premise deployments. SaaS customers are also freed up from the undifferentiated heavy lifting of infrastructure operations, allowing them to focus on their business of underwriting for life insurance.

MRAS used the AWS Well-Architected Framework to assess their architecture and list key recommendations. AWS also offers Well-Architected SaaS Lens and AWS SaaS Factory Program, with a collection of resources to empower and enable insurers at any stage of their SaaS on AWS journey.

AWS Week In Review – July 11, 2022

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-week-in-review-july-11/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

In France, we know summer has started when you see the Tour de France bike race on TV or in a city nearby. This year, the tour stopped in the city where I live, and I was blocked on my way back home from a customer conference to let the race pass through.

It’s Monday today, so let’s make another tour—a tour of the AWS news, announcements, or blog posts that captured my attention last week. I selected these as being of interest to IT professionals and developers: the doers, the builders that spend their time on the AWS Management Console or in code.

Last Week’s Launches
Here are some launches that got my attention during the previous week:

Amazon EC2 Mac M1 instances are generally available – this new EC2 instance type allows you to deploy Mac mini computers with M1 Apple Silicon running macOS using the same console, API, SDK, or CLI you are used to for interacting with EC2 instances. You can start, stop them, assign a security group or an IAM role, snapshot their EBS volume, and recreate an AMI from it, just like with Linux-based or Windows-based instances. It lets iOS developers create full CI/CD pipelines in the cloud without requiring someone in your team to reinstall various combinations of macOS and Xcode versions on on-prem machines. Some of you had the chance the enter the preview program for EC2 Mac M1 instances when we announced it last December. EC2 Mac M1 instances are now generally available.

AWS IAM Roles Anywhere – this is one of those incremental changes that has the potential to unlock new use cases on the edge or on-prem. AWS IAM Roles Anywhere enables you to use IAM roles for your applications outside of AWS to access AWS APIs securely, the same way that you use IAM roles for workloads on AWS. With IAM Roles Anywhere, you can deliver short-term credentials to your on-premises servers, containers, or other compute platforms. It requires an on-prem Certificate Authority registered as a trusted source in IAM. IAM Roles Anywhere exchanges certificates issued by this CA for a set of short-term AWS credentials limited in scope by the IAM role associated to the session. To make it easy to use, we do provide a CLI-based signing helper tool that can be integrated in your CLI configuration.

A streamlined deployment experience for .NET applications – the new deployment experience focuses on the type of application you want to deploy instead of individual AWS services by providing intelligent compute recommendations. You can find it in the AWS Toolkit for Visual Studio using the new “Publish to AWS” wizard. It is also available via the .NET CLI by installing AWS Deploy Tool for .NET. Together, they help easily transition from a prototyping phase in Visual Studio to automated deployments. The new deployment experience supports ASP.NET Core, Blazor WebAssembly, console applications (such as long-lived message processing services), and tasks that need to run on a schedule.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
This week, I also learned from these blog posts:

TLS 1.2 to become the minimum TLS protocol level for all AWS API endpointsthis article was published at the end of June, and it deserves more exposure. Starting in June 2022, we will progressively transition all our API endpoints to TLS 1.2 only. The good news is that 95 percent of the API calls we observe are already using TLS 1.2, and only five percent of the applications are impacted. If you have applications developed before 2014 (using a Java JDK before version 8 or .NET before version 4.6.2), it is worth checking your app and updating them to use TLS 1.2. When we detect your application is still using TLS 1.0 or TLS 1.1, we inform you by email and in the AWS Health Dashboard. The blog article goes into detail about how to analyze AWS CloudTrail logs to detect any API call that would not use TLS 1.2.

How to implement automated appointment reminders using Amazon Connect and Amazon Pinpoint this blog post guides you through the steps to implement a system to automatically call your customers to remind them of their appointments. This automated outbound campaign for appointment reminders checked the campaign list against a “do not call” list before making an outbound call. Your customers are able to confirm automatically or reschedule by speaking to an agent. You monitor the results of the calls on a dashboard in near real time using Amazon QuickSight. It provides you with AWS CloudFormation templates for the parts that can be automated and detailed instructions for the manual steps.

Using Amazon CloudWatch metrics math to monitor and scale resources AWS Auto Scaling is one of those capabilities that may look like magic at first glance. It uses metrics to take scale-out or scale-in decisions. Most customers I talk with struggle a bit at first to define the correct combination of metrics that allow them to scale at the right moment. Scaling out too late impacts your customer experience while scaling out too early impacts your budget. This article explains how to use metric math, a way to query multiple Amazon CloudWatch metrics, and use math expressions to create new time series based on these metrics. These math metrics may, in turn, be used to trigger scaling decisions. The typical use case would be to mathematically combine CPU, memory, and network utilization metrics to decide when to scale in or to scale out.

How to use Amazon RDS and Amazon Aurora with a static IP address – in the cloud, it is better to access network resources by referencing their DNS name instead of IP addresses. IP addresses come and go as resources are stopped, restarted, scaled out, or scaled in. However, when integrating with older, more rigid environments, it might happen, for a limited period of time, to authorize access through a static IP address. You have probably heard that scary phrase: “I have to authorize your IP address in my firewall configuration.” This new blog post explains how to do so for Amazon Relational Database Service (Amazon RDS) database. It uses a Network Load Balancer and traffic forwarding at the Linux-kernel level to proxy your actual database server.

Amazon S3 Intelligent-Tiering significantly reduces storage costs – we estimate our customers saved up to $250 millions in storage costs since we launched S3 Intelligent-Tiering in 2018. A recent blog post describes how Amazon Photo, a service that provides unlimited photo storage and 5 GB of video storage to Amazon Prime members in eight marketplaces world-wide, uses S3 Intelligent-Tiering to significantly save on storage costs while storing hundreds of petabytes of content and billions of images and videos on S3.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS re:Inforce is the premier cloud security conference, July 26-27. This year it is hosted at the Boston Convention and Exhibition Center, Massachusetts, USA. The conference agenda is available and there is still time to register.

AWS Summit Chicago, August 25, at McCormick Place, Chicago, Illinois, USA. You may register now.

AWS Summit Canberra, August 31, at the National Convention Center, Canberra, Australia. Registrations are already open.

That’s all for this week. Check back next Monday for another tour of AWS news and launches!

— seb

Data warehouse and business intelligence technology consolidation using AWS

Post Syndicated from Bappaditya Datta original https://aws.amazon.com/blogs/architecture/data-warehouse-and-business-intelligence-technology-consolidation-using-aws/

Organizations have been using data warehouse and business intelligence (DWBI) workloads to support business decision making for many years. These workloads are brought to the Amazon Web Services (AWS) platform to utilize the benefit of AWS cloud. However, these workloads are built using multiple vendor tools and technologies, and the customer faces the burden of administrative overhead.

This post provides architectural guidance to consolidate multiple DWBI technologies to AWS Managed Services to help reduce the administrative overhead, bring operational ease, and business efficiency. Two scenarios are explored:

  1. Upstream transactional databases are already on AWS
  2. Upstream transactional databases are present at on-premise datacenter

Challenges faced by an organization

Organizations are engaged in managing multiple DWBI technologies due to acquisitions, mergers, and the lift-and-shift of workloads. These workloads use extract, transform, and load (ETL) tools to read relational data from upstream transactional databases, process it, and store it in a data warehouse. Thereafter, these workloads use business intelligence tools to generate valuable insight and present it to users in form of reports and dashboards.

These DWBI technologies are generally installed and maintained on their own server. Figure 1 demonstrates the increased the administrative overhead for the organization but also creates challenges in maintaining the team’s overall knowledge.

DWBI workload with multiple tools

Figure 1. DWBI workload with multiple tools

Therefore, organizations are looking to consolidate technology usage and continue supporting important business functions.

Scenario 1

As we know, three major functions of DWBI workstream are:

  • ETL data using a tool
  • Store/manage the data in a data warehouse
  • Generate information from the data using business intelligence

Each of these functions can be performed efficiently using an AWS service. For example, AWS Glue can be used for ETL, Amazon Redshift for data warehouse, and Amazon QuickSight for business intelligence.

With the use of mentioned AWS services, organizations will be able to consolidate their DWBI technology usage. Organizations also will be able to quickly adapt to these services, as their engineering team can more easily use their DWBI knowledge with these services. For example, using SQL knowledge in AWS Glue jobs with SprakSQL, in Amazon Redshift queries, and in Amazon QuickSight dashboards.

Figure 2 demonstrates the redesigned the architecture of Figure 1 using AWS services. In this architecture, ETL functions are consolidated in AWS Glue. An AWS Glue crawler is used to auto-catalogue the source and target table metadata; then, AWS Glue ETL jobs use these catalogues to read data from source and write to target (data warehouse). AWS Glue jobs also apply necessary transformations (such as join, filter, and aggregate) to the data before writing. Additionally, an AWS Glue trigger is used to schedule the job executions. Alternatively, AWS Managed Workflows for Apache Airflow can be used to schedule jobs.

Consolidated workload with source on AWS

Figure 2. Consolidated workload with source on AWS

Similarly, data warehousing function is consolidated with Amazon Redshift. Amazon Redshift is used to store and organize enriched data and also enforce appropriate data access control for both workloads and users.

Lastly, business intelligence functions are consolidated using Amazon QuickSight. It used to create necessary dashboards that source data from Amazon Redshift and apply complex business logic to produce necessary charts and graphs needed for business insights. It is also used to implement necessary access restrictions to dashboards and data.

Scenario 2

In situation where source databases are in on-premises datacenter, the overall solution will be similar to Scenario 1, with an additional step to move the data continually from on-premise database to an Amazon Simple Storage Service (Amazon S3) bucket. The data movement can be efficiently handled by AWS Database Migration Service (AWS DMS).

To make the source database accessible to AWS DMS, a connection needs to established between the AWS cloud and on-premise network. Based on performance and throughput needs, the organization can choose either AWS Direct Connect service or AWS Site-to-Site VPN service to securely move the data. For the purpose of this discussion, we are considering AWS Direct Connect.

In Figure 3, AWS DMS task is used to perform a full-load followed by change data capture to continuously move the data to an S3 bucket. In this scenario, AWS Glue is used to catalogue and read the data from S3 bucket. The remaining portion of the dataflow is the same as the one mentioned in Scenario 1.

Consolidated workload with source at datacenter

Figure 3. Consolidated workload with source at datacenter

Scaling

Both of the updated architectures provide necessary scaling:

  • Auto scaling feature can be used to scale-up or -down AWS Glue ETL job resources
  • Concurrency scaling feature can be used to support virtually unlimited concurrent users and queries in Amazon Redshift
  • Amazon QuickSight resources (web server, Amazon QuickSight engine, and SPICE) are auto scaled by design

Security, monitoring, and auditing

Also, the updated architectures provide necessary security by using access control, data encryption at-rest and in transit, monitoring, and auditing.

Additionally, both Amazon Redshift and Amazon QuickSight provides their own authentication and access controls. Therefore, a user can be a local user or a federated one. With the help of these authentications, an organization will be able to control access to data in Amazon Redshift and also access to the dashboard in Amazon QuickSight.

Conclusion

In this blog post, we discussed how AWS Glue, Amazon Redshift, and Amazon QuickSight can be used to consolidate DWBI technologies. We also have discussed how an architecture can help an organization build a scalable, secure workload with auto scaling, access control, log monitoring and activity auditing.

Ready to get started?

AWS Week in Review – June 27, 2022

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-week-in-review-june-27-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

It’s the beginning of a new week, and I’d like to start with a recap of the most significant AWS news from the previous 7 days. Last week was special because I had the privilege to be at the very first EMEA AWS Heroes Summit in Milan, Italy. It was a great opportunity of mutual learning as this community of experts shared their thoughts with AWS developer advocates, product managers, and technologists on topics such as containers, serverless, and machine learning.

Participants at the EMEA AWS Heroes Summit 2022

Last Week’s Launches
Here are the launches that got my attention last week:

Amazon Connect Cases (available in preview) – This new capability of Amazon Connect provides built-in case management for your contact center agents to create, collaborate on, and resolve customer issues. Learn more in this blog post that shows how to simplify case management in your contact center.

Many updates for Amazon RDS and Amazon AuroraAmazon RDS Custom for Oracle now supports Oracle database 12.2 and 18c, and Amazon RDS Multi-AZ deployments with one primary and two readable standby database instances now supports M5d and R5d instances and is available in more Regions. There is also a Regional expansion for RDS Custom. Finally, PostgreSQL 14, a new major version, is now supported by Amazon Aurora PostgreSQL-Compatible Edition.

AWS WAF Captcha is now generally available – You can use AWS WAF Captcha to block unwanted bot traffic by requiring users to successfully complete challenges before their web requests are allowed to reach resources.

Private IP VPNs with AWS Site-to-Site VPN – You can now deploy AWS Site-to-Site VPN connections over AWS Direct Connect using private IP addresses. This way, you can encrypt traffic between on-premises networks and AWS via Direct Connect connections without the need for public IP addresses.

AWS Center for Quantum Networking – Research and development of quantum computers have the potential to revolutionize science and technology. To address fundamental scientific and engineering challenges and develop new hardware, software, and applications for quantum networks, we announced the AWS Center for Quantum Networking.

Simpler access to sustainability data, plus a global hackathon – The Amazon Sustainability Data Initiative catalog of datasets is now searchable and discoverable through AWS Data Exchange. As part of a new collaboration with the International Research Centre in Artificial Intelligence, under the auspices of UNESCO, you can use the power of the cloud to help the world become sustainable by participating to the Amazon Sustainability Data Initiative Global Hackathon.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
A couple of takeaways from the Amazon re:MARS conference:

Amazon CodeWhisperer (preview) – Amazon CodeWhisperer is a coding companion powered by machine learning with support for multiple IDEs and languages.

Synthetic data generation with Amazon SageMaker Ground TruthGenerate labeled synthetic image data that you can combine with real-world data to create more complete training datasets for your ML models.

Some other updates you might have missed:

AstraZeneca’s drug design program built using AWS wins innovation award – AstraZeneca received the BioIT World Innovative Practice Award at the 20th anniversary of the Bio-IT World Conference for its novel augmented drug design platform built on AWS. More in this blog post.

Large object storage strategies for Amazon DynamoDB – A blog post showing different options for handling large objects within DynamoDB and the benefits and disadvantages of each approach.

Amazon DevOps Guru for RDS under the hoodSome details of how DevOps Guru for RDS works, with a specific focus on its scalability, security, and availability.

AWS open-source news and updates – A newsletter curated by my colleague Ricardo to bring you the latest open-source projects, posts, events, and more.

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

On June 30, the AWS User Group Ukraine is running an AWS Tech Conference to discuss digital transformation with AWS. Join to learn from many sessions including a fireside chat with Dr. Werner Vogels, CTO at Amazon.com.

That’s all from me for this week. Come back next Monday for another Week in Review!

Danilo

Continually assessing application resilience with AWS Resilience Hub and AWS CodePipeline

Post Syndicated from Scott Bryen original https://aws.amazon.com/blogs/architecture/continually-assessing-application-resilience-with-aws-resilience-hub-and-aws-codepipeline/

As customers commit to a DevOps mindset and embrace a nearly continuous integration/continuous delivery model to implement change with a higher velocity, assessing every change impact on an application resilience is key. This blog shows an architecture pattern for automating resiliency assessments as part of your CI/CD pipeline. Automatically running a resiliency assessment within CI/CD pipelines, development teams can fail fast and understand quickly if a change negatively impacts an applications resilience. The pipeline can stop the deployment into further environments, such as QA/UAT and Production, until the resilience issues have been improved.

AWS Resilience Hub is a managed service that gives you a central place to define, validate and track the resiliency of your AWS applications. It is integrated with AWS Fault Injection Simulator (FIS), a chaos engineering service, to provide fault-injection simulations of real-world failures. Using AWS Resilience Hub, you can assess your applications to uncover potential resilience enhancements. This will allow you to validate your applications recovery time (RTO), recovery point (RPO) objectives and optimize business continuity while reducing recovery costs. Resilience Hub also provides APIs for you to integrate its assessment and testing into your CI/CD pipelines for ongoing resilience validation.

AWS CodePipeline is a fully managed continuous delivery service for fast and reliable application and infrastructure updates. You can use AWS CodePipeline to model and automate your software release processes. This enables you to increase the speed and quality of your software updates by running all new changes through a consistent set of quality checks.

Continuous resilience assessments

Figure 1 shows the resilience assessments automation architecture in a multi-account setup. AWS CodePipeline, AWS Step Functions, and AWS Resilience Hub are defined in your deployment account while the application AWS CloudFormation stacks are imported from your workload account. This pattern relies on AWS Resilience Hub ability to import CloudFormation stacks from a different accounts, regions, or both, when discovering an application structure.

High-level architecture pattern for automating resilience assessments

Figure 1. High-level architecture pattern for automating resilience assessments

Add application to AWS Resilience Hub

Begin by adding your application to AWS Resilience Hub and assigning a resilience policy. This can be done via the AWS Management Console or using CloudFormation. In this instance, the application has been created through the AWS Management Console. Sebastien Stormacq’s post, Measure and Improve Your Application Resilience with AWS Resilience Hub, walks you through how to add your application to AWS Resilience Hub.

In a multi-account environment, customers typically have dedicated AWS workload account per environment and we recommend you separate CI/CD capabilities into another account. In this post, the AWS Resilience Hub application has been created in the deployment account and the resources have been discovered using an CloudFormation stack from the workload account. Proper permissions are required to use AWS Resilience Hub to manage application in multiple accounts.

Adding application to AWS Resilience Hub

Figure 2. Adding application to AWS Resilience Hub

Create AWS Step Function to run resilience assessment

Whenever you make a change to your application CloudFormation, you need to update and publish the latest version in AWS Resilience Hub to ensure you are assessing the latest changes. Now that AWS Step Functions SDK integrations support AWS Resilience Hub, you can build a state machine to coordinate the process, which will be triggered from AWS Code Pipeline.

AWS Step Functions is a low-code, visual workflow service that developers use to build distributed applications, automate IT and business processes, and build data and machine learning pipelines using AWS services. Workflows manage failures, retries, parallelization, service integrations, and observability so developers can focus on higher-value business logic.

AWS Step Function for orchestrating AWS SDK calls

Figure 3. AWS Step Function for orchestrating AWS SDK calls

  1. The first step in the workflow is to update the resources associated with the application defined in AWS Resilience Hub by calling ImportResourcesToDraftApplication.
  2. Check for the import process to complete using a wait state, a call to DescribeDraftAppVersionResourcesImportStatus and then a choice state to decide whether to progress or continue waiting.
  3. Once complete, publish the draft application by calling PublishAppVersion to ensure we are assessing the latest version.
  4. Once published, call StartAppAssessment to kick-off a resilience assessment.
  5. Check for the assessment to complete using a wait state, a call to DescribeAppAssessment and then a choice state to decide whether to progress or continue waiting.
  6. In the choice state, use assessment status from the response to determine if the assessment is pending, in progress or successful.
  7. If successful, use the compliance status from the response to determine whether to progress to success or fail.
    • Compliance status will be either “PolicyMet” or “PolicyBreached”.
  8. If policy breached, publish onto SNS to alert the development team before moving to fail.

Create stage within code pipeline

Now that we have the AWS Step Function created, we need to integrate it into our pipeline. The post Fine-grained Continuous Delivery With CodePipeline and AWS Step Functions demonstrates how you can trigger a step function from AWS Code Pipeline.

When adding the stage, you need to pass the ARN of the stack which was deployed in the previous stage as well as the ARN of the application in AWS Resilience Hub. These will be required on the AWS SDK calls and you can pass this in as a literal.

AWS CodePipeline stage step function input

Figure 4. AWS CodePipeline stage step function input

Example state using the input from AWS CodePipeline stage

Figure 5. Example state using the input from AWS CodePipeline stage

For more information about these AWS SDK calls, please refer to the AWS Resilience Hub API Reference documents.

Customers often run their workloads in lower environments in a less resilient way to save on cost. It’s important to add the assessment stage at the appropriate point of your pipeline. We recommend adding this to your pipeline after the deployment to a test environment which mirrors production but before deploying to production. By doing this you can fail fast and halt changes which will lower resilience in production.

A note on service quotas: AWS Resilience Hub allows you to run 20 assessments per month per application. If you need to increase this quota, please raise a ticket with AWS Support.

Conclusion

In this post, we have seen an approach to continuously assessing resilience as part of your CI/CD pipeline using AWS Resilience Hub, AWS CodePipeline and AWS Step Functions. This approach will enable you to understand fast if a change will weaken resilience.

AWS Resilience Hub also generates recommended AWS FIS Experiments that you can deploy and use to test the resilience of your application. As well as assessing the resilience, we also recommend you integrate running these tests into your pipeline. The post Chaos Testing with AWS Fault Injection Simulator and AWS CodePipeline demonstrates how you can active this.

Optimize Federated Query Performance using EXPLAIN and EXPLAIN ANALYZE in Amazon Athena

Post Syndicated from Nishchai JM original https://aws.amazon.com/blogs/big-data/optimize-federated-query-performance-using-explain-and-explain-analyze-in-amazon-athena/

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. In 2019, Athena added support for federated queries to run SQL queries across data stored in relational, non-relational, object, and custom data sources.

In 2021, Athena added support for the EXPLAIN statement, which can help you understand and improve the efficiency of your queries. The EXPLAIN statement provides a detailed breakdown of a query’s run plan. You can analyze the plan to identify and reduce query complexity and improve its runtime. You can also use EXPLAIN to validate SQL syntax prior to running the query. Doing so helps prevent errors that would have occurred while running the query.

Athena also added EXPLAIN ANALYZE, which displays the computational cost of your queries alongside their run plans. Administrators can benefit from using EXPLAIN ANALYZE because it provides a scanned data count, which helps you reduce financial impact due to user queries and apply optimizations for better cost control.

In this post, we demonstrate how to use and interpret EXPLAIN and EXPLAIN ANALYZE statements to improve Athena query performance when querying multiple data sources.

Solution overview

To demonstrate using EXPLAIN and EXPLAIN ANALYZE statements, we use the following services and resources:

Athena uses the AWS Glue Data Catalog to store and retrieve table metadata for the Amazon S3 data in your AWS account. The table metadata lets the Athena query engine know how to find, read, and process the data that you want to query. We use Athena data source connectors to connect to data sources external to Amazon S3.

Prerequisites

To deploy the CloudFormation template, you must have the following:

Provision resources with AWS CloudFormation

To deploy the CloudFormation template, complete the following steps:

  1. Choose Launch Stack:

  1. Follow the prompts on the AWS CloudFormation console to create the stack.
  2. Note the key-value pairs on the stack’s Outputs tab.

You use these values when configuring the Athena data source connectors.

The CloudFormation template creates the following resources:

  • S3 buckets to store data and act as temporary spill buckets for Lambda
  • AWS Glue Data Catalog tables for the data in the S3 buckets
  • A DynamoDB table and Amazon RDS for MySQL tables, which are used to join multiple tables from different sources
  • A VPC, subnets, and endpoints, which are needed for Amazon RDS for MySQL and DynamoDB

The following figure shows the high-level data model for the data load.

Create the DynamoDB data source connector

To create the DynamoDB connector for Athena, complete the following steps:

  1. On the Athena console, choose Data sources in the navigation pane.
  2. Choose Create data source.
  3. For Data sources, select Amazon DynamoDB.
  4. Choose Next.

  1. For Data source name, enter DDB.

  1. For Lambda function, choose Create Lambda function.

This opens a new tab in your browser.

  1. For Application name, enter AthenaDynamoDBConnector.
  2. For SpillBucket, enter the value from the CloudFormation stack for AthenaSpillBucket.
  3. For AthenaCatalogName, enter dynamodb-lambda-func.
  4. Leave the remaining values at their defaults.
  5. Select I acknowledge that this app creates custom IAM roles and resource policies.
  6. Choose Deploy.

You’re returned to the Connect data sources section on the Athena console.

  1. Choose the refresh icon next to Lambda function.
  2. Choose the Lambda function you just created (dynamodb-lambda-func).

  1. Choose Next.
  2. Review the settings and choose Create data source.
  3. If you haven’t already set up the Athena query results location, choose View settings on the Athena query editor page.

  1. Choose Manage.
  2. For Location of query result, browse to the S3 bucket specified for the Athena spill bucket in the CloudFormation template.
  3. Add Athena-query to the S3 path.
  4. Choose Save.

  1. In the Athena query editor, for Data source, choose DDB.
  2. For Database, choose default.

You can now explore the schema for the sportseventinfo table; the data is the same in DynamoDB.

  1. Choose the options icon for the sportseventinfo table and choose Preview Table.

Create the Amazon RDS for MySQL data source connector

Now let’s create the connector for Amazon RDS for MySQL.

  1. On the Athena console, choose Data sources in the navigation pane.
  2. Choose Create data source.
  3. For Data sources, select MySQL.
  4. Choose Next.

  1. For Data source name, enter MySQL.

  1. For Lambda function, choose Create Lambda function.

  1. For Application name, enter AthenaMySQLConnector.
  2. For SecretNamePrefix, enter AthenaMySQLFederation.
  3. For SpillBucket, enter the value from the CloudFormation stack for AthenaSpillBucket.
  4. For DefaultConnectionString, enter the value from the CloudFormation stack for MySQLConnection.
  5. For LambdaFunctionName, enter mysql-lambda-func.
  6. For SecurityGroupIds, enter the value from the CloudFormation stack for RDSSecurityGroup.
  7. For SubnetIds, enter the value from the CloudFormation stack for RDSSubnets.
  8. Select I acknowledge that this app creates custom IAM roles and resource policies.
  9. Choose Deploy.

  1. On the Lambda console, open the function you created (mysql-lambda-func).
  2. On the Configuration tab, under Environment variables, choose Edit.

  1. Choose Add environment variable.
  2. Enter a new key-value pair:
    • For Key, enter MYSQL_connection_string.
    • For Value, enter the value from the CloudFormation stack for MySQLConnection.
  3. Choose Save.

  1. Return to the Connect data sources section on the Athena console.
  2. Choose the refresh icon next to Lambda function.
  3. Choose the Lambda function you created (mysql-lamdba-function).

  1. Choose Next.
  2. Review the settings and choose Create data source.
  3. In the Athena query editor, for Data Source, choose MYSQL.
  4. For Database, choose sportsdata.

  1. Choose the options icon by the tables and choose Preview Table to examine the data and schema.

In the following sections, we demonstrate different ways to optimize our queries.

Optimal join order using EXPLAIN plan

A join is a basic SQL operation to query data on multiple tables using relations on matching columns. Join operations affect how much data is read from a table, how much data is transferred to the intermediate stages through networks, and how much memory is needed to build up a hash table to facilitate a join.

If you have multiple join operations and these join tables aren’t in the correct order, you may experience performance issues. To demonstrate this, we use the following tables from difference sources and join them in a certain order. Then we observe the query runtime and improve performance by using the EXPLAIN feature from Athena, which provides some suggestions for optimizing the query.

The CloudFormation template you ran earlier loaded data into the following services:

AWS Storage Table Name Number of Rows
Amazon DynamoDB sportseventinfo 657
Amazon S3 person 7,025,585
Amazon S3 ticketinfo 2,488

Let’s construct a query to find all those who participated in the event by type of tickets. The query runtime with the following join took approximately 7 mins to complete:

SELECT t.id AS ticket_id, 
e.eventid, 
p.first_name 
FROM 
"DDB"."default"."sportseventinfo" e, 
"AwsDataCatalog"."athenablog"."person" p, 
"AwsDataCatalog"."athenablog"."ticketinfo" t 
WHERE 
t.sporting_event_id = cast(e.eventid as double) 
AND t.ticketholder_id = p.id

Now let’s use EXPLAIN on the query to see its run plan. We use the same query as before, but add explain (TYPE DISTRIBUTED):

EXPLAIN (TYPE DISTRIBUTED)
SELECT t.id AS ticket_id, 
e.eventid, 
p.first_name 
FROM 
"DDB"."default"."sportseventinfo" e, 
"AwsDataCatalog"."athenablog"."person" p, 
"AwsDataCatalog"."athenablog"."ticketinfo" t 
WHERE 
t.sporting_event_id = cast(e.eventid as double) 
AND t.ticketholder_id = p.id

The following screenshot shows our output

Notice the cross-join in Fragment 1. The joins are converted to a Cartesian product for each table, where every record in a table is compared to every record in another table. Therefore, this query takes a significant amount of time to complete.

To optimize our query, we can rewrite it by reordering the joining tables as sportseventinfo first, ticketinfo second, and person last. The reason for this is because the WHERE clause, which is being converted to a JOIN ON clause during the query plan stage, doesn’t have the join relationship between the person table and sportseventinfo table. Therefore, the query plan generator converted the join type to cross-joins (a Cartesian product), which less efficient. Reordering the tables aligns the WHERE clause to the INNER JOIN type, which satisfies the JOIN ON clause and runtime is reduced from 7 minutes to 10 seconds.

The code for our optimized query is as follows:

SELECT t.id AS ticket_id, 
e.eventid, 
p.first_name 
FROM 
"DDB"."default"."sportseventinfo" e, 
"AwsDataCatalog"."athenablog"."ticketinfo" t, 
"AwsDataCatalog"."athenablog"."person" p 
WHERE 
t.sporting_event_id = cast(e.eventid as double) 
AND t.ticketholder_id = p.id

The following is the EXPLAIN output of our query after reordering the join clause:

EXPLAIN (TYPE DISTRIBUTED) 
SELECT t.id AS ticket_id, 
e.eventid, 
p.first_name 
FROM 
"DDB"."default"."sportseventinfo" e, 
"AwsDataCatalog"."athenablog"."ticketinfo" t, 
"AwsDataCatalog"."athenablog"."person" p 
WHERE t.sporting_event_id = cast(e.eventid as double) 
AND t.ticketholder_id = p.id

The following screenshot shows our output.

The cross-join changed to INNER JOIN with join on columns (eventid, id, ticketholder_id), which results in the query running faster. Joins between the ticketinfo and person tables converted to the PARTITION distribution type, where both left and right tables are hash-partitioned across all worker nodes due to the size of the person table. The join between the sportseventinfo table and ticketinfo are converted to the REPLICATED distribution type, where one table is hash-partitioned across all worker nodes and the other table is replicated to all worker nodes to perform the join operation.

For more information about how to analyze these results, refer to Understanding Athena EXPLAIN statement results.

As a best practice, we recommend having a JOIN statement along with an ON clause, as shown in the following code:

SELECT t.id AS ticket_id, 
e.eventid, 
p.first_name 
FROM 
"AwsDataCatalog"."athenablog"."person" p 
JOIN "AwsDataCatalog"."athenablog"."ticketinfo" t ON t.ticketholder_id = p.id 
JOIN "ddb"."default"."sportseventinfo" e ON t.sporting_event_id = cast(e.eventid as double)

Also as a best practice when you join two tables, specify the larger table on the left side of join and the smaller table on the right side of the join. Athena distributes the table on the right to worker nodes, and then streams the table on the left to do the join. If the table on the right is smaller, then less memory is used and the query runs faster.

In the following sections, we present examples of how to optimize pushdowns for filter predicates and projection filter operations for the Athena data source using EXPLAIN ANALYZE.

Pushdown optimization for the Athena connector for Amazon RDS for MySQL

A pushdown is an optimization to improve the performance of a SQL query by moving its processing as close to the data as possible. Pushdowns can drastically reduce SQL statement processing time by filtering data before transferring it over the network and filtering data before loading it into memory. The Athena connector for Amazon RDS for MySQL supports pushdowns for filter predicates and projection pushdowns.

The following table summarizes the services and tables we use to demonstrate a pushdown using Aurora MySQL.

Table Name Number of Rows Size in KB
player_partitioned 5,157 318.86
sport_team_partitioned 62 5.32

We use the following query as an example of a filtering predicate and projection filter:

SELECT full_name,
name 
FROM "sportsdata"."player_partitioned" a 
JOIN "sportsdata"."sport_team_partitioned" b ON a.sport_team_id=b.id 
WHERE a.id='1.0'

This query selects the players and their team based on their ID. It serves as an example of both filter operations in the WHERE clause and projection because it selects only two columns.

We use EXPLAIN ANALYZE to get the cost for the running this query:

EXPLAIN ANALYZE 
SELECT full_name,
name 
FROM "MYSQL"."sportsdata"."player_partitioned" a 
JOIN "MYSQL"."sportsdata"."sport_team_partitioned" b ON a.sport_team_id=b.id 
WHERE a.id='1.0'

The following screenshot shows the output in Fragment 2 for the table player_partitioned, in which we observe that the connector has a successful pushdown filter on the source side, so it tries to scan only one record out of the 5,157 records in the table. The output also shows that the query scan has only two columns (full_name as the projection column and sport_team_id and the join column), and uses SELECT and JOIN, which indicates the projection pushdown is successful. This helps reduce the data scan when using Athena data source connectors.

Now let’s look at the conditions in which a filter predicate pushdown doesn’t work with Athena connectors.

LIKE statement in filter predicates

We start with the following example query to demonstrate using the LIKE statement in filter predicates:

SELECT * 
FROM "MYSQL"."sportsdata"."player_partitioned" 
WHERE first_name LIKE '%Aar%'

We then add EXPLAIN ANALYZE:

EXPLAIN ANALYZE 
SELECT * 
FROM "MYSQL"."sportsdata"."player_partitioned" 
WHERE first_name LIKE '%Aar%'

The EXPLAIN ANALYZE output shows that the query performs the table scan (scanning the table player_partitioned, which contains 5,157 records) for all the records even though the WHERE clause only has 30 records matching the condition %Aar%. Therefore, the data scan shows the complete table size even with the WHERE clause.

We can optimize the same query by selecting only the required columns:

EXPLAIN ANALYZE 
SELECT sport_team_id,
full_name 
FROM "MYSQL"."sportsdata"."player_partitioned" 
WHERE first_name LIKE '%Aar%'

From the EXPLAIN ANALYZE output, we can observe that the connector supports the projection filter pushdown, because we select only two columns. This brought the data scan size down to half of the table size.

OR statement in filter predicates

We start with the following query to demonstrate using the OR statement in filter predicates:

SELECT id,
first_name 
FROM "MYSQL"."sportsdata"."player_partitioned" 
WHERE first_name = 'Aaron' OR id ='1.0'

We use EXPLAIN ANALYZE with the preceding query as follows:

EXPLAIN ANALYZE 
SELECT * 
FROM 
"MYSQL"."sportsdata"."player_partitioned" 
WHERE first_name = 'Aaron' OR id ='1.0'

Similar to the LIKE statement, the following output shows that query scanned the table instead of pushing down to only the records that matched the WHERE clause. This query outputs only 16 records, but the data scan indicates a complete scan.

Pushdown optimization for the Athena connector for DynamoDB

For our example using the DynamoDB connector, we use the following data:

Table Number of Rows Size in KB
sportseventinfo 657 85.75

Let’s test the filter predicate and project filter operation for our DynamoDB table using the following query. This query tries to get all the events and sports for a given location. We use EXPLAIN ANALYZE for the query as follows:

EXPLAIN ANALYZE 
SELECT EventId,
Sport 
FROM "DDB"."default"."sportseventinfo" 
WHERE Location = 'Chase Field'

The output of EXPLAIN ANALYZE shows that the filter predicate retrieved only 21 records, and the project filter selected only two columns to push down to the source. Therefore, the data scan for this query is less than the table size.

Now let’s see where filter predicate pushdown doesn’t work. In the WHERE clause, if you apply the TRIM() function to the Location column and then filter, predicate pushdown optimization doesn’t apply, but we still see the projection filter optimization, which does apply. See the following code:

EXPLAIN ANALYZE 
SELECT EventId,
Sport 
FROM "DDB"."default"."sportseventinfo" 
WHERE trim(Location) = 'Chase Field'

The output of EXPLAIN ANALYZE for this query shows that the query scans all the rows but is still limited to only two columns, which shows that the filter predicate doesn’t work when the TRIM function is applied.

We’ve seen from the preceding examples that the Athena data source connector for Amazon RDS for MySQL and DynamoDB do support filter predicates and projection predicates for pushdown optimization, but we also saw that operations such as LIKE, OR, and TRIM when used in the filter predicate don’t support pushdowns to the source. Therefore, if you encounter unexplained charges in your federated Athena query, we recommend using EXPLAIN ANALYZE with the query and determine whether your Athena connector supports the pushdown operation or not.

Please note that running EXPLAIN ANALYZE incurs cost because it scans the data.

Conclusion

In this post, we showcased how to use EXPLAIN and EXPLAIN ANALYZE to analyze Athena SQL queries for data sources on AWS S3 and Athena federated SQL query for data source like DynamoDB and Amazon RDS for MySQL. You can use this as an example to optimize queries which would also result in cost savings.


About the Authors

Nishchai JM is an Analytics Specialist Solutions Architect at Amazon Web services. He specializes in building Big-data applications and help customer to modernize their applications on Cloud. He thinks Data is new oil and spends most of his time in deriving insights out of the Data.

Varad Ram is Senior Solutions Architect in Amazon Web Services. He likes to help customers adopt to cloud technologies and is particularly interested in artificial intelligence. He believes deep learning will power future technology growth. In his spare time, he like to be outdoor with his daughter and son.

Considerations for modernizing Microsoft SQL database service with high availability on AWS

Post Syndicated from Lewis Tang original https://aws.amazon.com/blogs/architecture/considerations-for-modernizing-microsoft-sql-database-service-with-high-availability-on-aws/

Many organizations have applications that require Microsoft SQL Server to run relational database workloads: some applications can be proprietary software that the vendor mandates Microsoft SQL Server to run database service; the other applications can be long-standing, home-grown applications that included Microsoft SQL Server when they were initially developed. When organizations migrate applications to AWS, they often start with lift-and-shift approach and run Microsoft SQL database service on Amazon Elastic Compute Cloud (Amazon EC2). The reason could be this is what they are most familiar with.

In this post, I share the architecture options to modernize Microsoft SQL database service and run highly available relational data services on Amazon EC2, Amazon Relational Database Service (Amazon RDS), and Amazon Aurora (Aurora).

Running Microsoft SQL database service on Amazon EC2 with high availability

This option is the least invasive to existing operations models. It gives you a quick start to modernize Microsoft SQL database service by leveraging the AWS Cloud to manage services like physical facilities. The low-level infrastructure operational tasks—such as server rack, stack, and maintenance—are managed by AWS. You have full control of the database and operating-system–level access, so there is a choice of tools to manage the operating system, database software, patches, data replication, backup, and restoration.

You can use any Microsoft SQL Server-supported replication technology with your Microsoft SQL Server database on Amazon EC2 to achieve high availability, data protection, and disaster recovery. Common solutions include log shipping, database mirroring, Always On availability groups, and Always On Failover Cluster Instances.

High availability in a single Region

Figure 1 shows how you can use Microsoft SQL Server on Amazon EC2 across multiple Availability Zones (AZs) within single Region. The interconnects among AZs that are similar to your data center intercommunications are managed by AWS. The primary database is a read-write database, and the secondary database is configured with log shipping, database mirroring, or Always On availability groups for high availability. All the transactional data from the primary database is transferred and can be applied to the secondary database asynchronously for log shipping, and it can either asynchronously or synchronously for Always On availability groups and mirroring.

High availability in a single Region with Microsoft SQL Database Service on Amazon EC2

Figure 1. High availability in a single Region with Microsoft SQL database service on Amazon EC2

High availability across multiple Regions

Figure 2 demonstrates how to configure high availability for Microsoft SQL Server on Amazon EC2 across multiple Regions. A secondary Microsoft SQL Server in a different Region from the primary is configured with log shipping, database mirroring, or Always On availability groups for high availability. The transactional data from primary database is transferred via the fully managed backbone network of AWS across Regions.

High availability across multiple Regions with Microsoft SQL database service on Amazon EC2

Figure 2. High availability across multiple Regions with Microsoft SQL database service on Amazon EC2

Replatforming Microsoft SQL Database Service on Amazon RDS with high availability

Amazon RDS is a managed database service and responsible for most management tasks. It currently supports Multi-AZ deployments for SQL Server using SQL Server Database Mirroring (DBM) or Always On Availability Groups (AGs) as a high-availability, failover solution.

High availability in a single Region

Figure 3 demonstrates the Microsoft SQL database service that is run on Amazon RDS is configured with a multi-AZ deployment model in single region. Multi-AZ deployments provide increased availability, data durability, and fault tolerance for DB instances. In the event of planned database maintenance or unplanned service disruption, Amazon RDS automatically fails-over to the up-to-date secondary DB instance. This functionality lets database operations resume quickly without manual intervention. The primary and standby instances use the same endpoint, whose physical network address transitions to the secondary replica as part of the failover process. You don’t have to reconfigure your application when a failover occurs. Amazon RDS supports multi-AZ deployments for Microsoft SQL Server by using either SQL Server database mirroring or Always On availability groups.

High availability in a single Region with Microsoft SQL database service on Amazon RDS

Figure 3. High availability in a single Region with Microsoft SQL database service on Amazon RDS

High availability across multiple Regions

Figure 4 depicts how you can use AWS Database Migration Service (AWS DMS) to configure continuous replication among Microsoft SQL Database Service on Amazon RDS across multiple Regions. AWS DMS needs Microsoft Change Data Capture to be enabled on the Amazon RDS for the Microsoft SQL Server instance. If problems occur, you can initiate manual failovers and reinstate database services by promoting the Amazon RDS read replica in a different Region.

High availability across multiple Regions with Microsoft SQL database service on Amazon RDS

Figure 4. High availability across multiple Regions with Microsoft SQL database service on Amazon RDS

Refactoring Microsoft SQL database service on Amazon Aurora with high availability

This option helps you to eliminate the cost of SQL database service license. You can run database service on a truly cloud native modern database architecture. You can use AWS Schema Conversion Tool to assist in the assessment and conversion of your database code and storage objects. Any objects that cannot be automatically converted are clearly marked so they can be manually converted to complete the migration.

The Aurora architecture involves separation of storage and compute. Aurora includes some high availability features that apply to the data in your database cluster. The data remains safe even if some or all of the DB instances in the cluster become unavailable. Other high availability features apply to the DB instances. These features help to make sure that one or more DB instances are ready to handle database requests from your application.

High availability in a single Region

Figure 5 demonstrates Aurora stores copies of the data in a database cluster across multiple AZs in single Region. When data is written to the primary DB instance, Aurora synchronously replicates the data across AZs to six storage nodes associated with your cluster volume. Doing so provides data redundancy, eliminates I/O freezes, and minimizes latency spikes during system backups. Running a DB instance with high availability can enhance availability during planned system maintenance, such as database engine updates, and help protect your databases against failure and AZ disruption.

High availability in a single Region with Amazon Aurora

Figure 5. High availability in a single Region with Amazon Aurora

High availability across multiple Regions

Figure 6 depicts how you can set up Aurora global databases for high availability across multiple Regions. An Aurora global database consists of one primary Region where your data is written, and up to five read-only secondary Regions. You issue write operations directly to the primary database cluster in the primary Region. Aurora automatically replicates data to the secondary Regions using dedicated infrastructure, with latency typically under a second.

High availability across multiple Regions with Amazon Aurora global databases

Figure 6. High availability across multiple Regions with Amazon Aurora global databases

Summary

You can choose among the options of Amazon EC2, Amazon RDS, and Amazon Aurora when modernizing SQL database service on AWS. Understanding the features required by business and the scope of service management responsibilities are good starting points. When presented with multiple options that meet with business needs, choose one that will allow more focus on your application, business value-add capabilities, and help you to reduce the services’ “total cost of ownership”.

Running hybrid Active Directory service with AWS Managed Microsoft Active Directory

Post Syndicated from Lewis Tang original https://aws.amazon.com/blogs/architecture/running-hybrid-active-directory-service-with-aws-managed-microsoft-active-directory/

Enterprise customers often need to architect a hybrid Active Directory solution to support running applications in the existing on-premises corporate data centers and AWS cloud. There are many reasons for this, such as maintaining the integration with on-premises legacy applications, keeping the control of infrastructure resources, and meeting with specific industry compliance requirements.

To extend on-premises Active Directory environments to AWS, some customers choose to deploy Active Directory service on self-managed Amazon Elastic Compute Cloud (EC2) instances after setting up connectivity for both environments. This setup works fine, but it also presents management and operations challenges when it comes to EC2 instance operation management, Windows operating system, and Active Directory service patching and backup. This is where AWS Directory Service for Microsoft Active Directory (AWS Managed Microsoft AD) helps.

Benefits of using AWS Managed Microsoft AD

With AWS Managed Microsoft AD, you can launch an AWS-managed directory in the cloud, leveraging the scalability and high availability of an enterprise directory service while adding seamless integration into other AWS services.

In addition, you can still access AWS Managed Microsoft AD using existing administrative tools and techniques, such as delegating administrative permissions to select groups in your organization. The full list of permissions that can be delegated is described in the AWS Directory Service Administration Guide.

Active Directory service design consideration with a single AWS account

Single region

A single AWS account is where the journey begins: a simple use case might be when you need to deploy a new solution in the cloud from scratch (Figure 1).

A single AWS account and single-region model

Figure 1. A single AWS account and single-region model

In a single AWS account and single-region model, the on-premises Active Directory has “company.com” domain configured in the on-premises data center. AWS Managed Microsoft AD is set up across two availability zones in the AWS region for high availability. It has a single domain, “na.company.com”, configured. The on-premises Active Directory is configured to trust the AWS Managed Microsoft AD with network connectivity via AWS Direct Connect or VPN. Applications that are Active-Directory–aware and run on EC2 instances have joined na.company.com domain, as do the selected AWS managed services (for example, Amazon Relational Database Service for SQL server).

Multi-region

As your cloud footprint expands to more AWS regions, you have two options also to expand AWS Managed Microsoft AD, depending on which edition of AWS Managed Microsoft AD is used (Figure 2):

  1. With AWS Managed Microsoft AD Enterprise Edition, you can turn on the multi-region replication feature to configure automatically inter-regional networking connectivity, deploy domain controllers, and replicate all the Active Directory data across multiple regions. This ensures that Active-Directory–aware workloads residing in those regions can connect to and use AWS Managed Microsoft AD with low latency and high performance.
  2. With AWS Managed Microsoft AD Standard Edition, you will need to add a domain by creating independent AWS Managed Microsoft AD directories per-region. In Figure 2, “eu.company.com” domain is added, and AWS Transit Gateway routes traffic among Active-Directory–aware applications within two AWS regions. The on-premises Active Directory is configured to trust the AWS Managed Microsoft AD, either by Direct Connect or VPN.
A single AWS account and multi-region model

Figure 2. A single AWS account and multi-region model

Active Directory Service Design consideration with multiple AWS accounts

Large organizations use multiple AWS accounts for administrative delegation and billing purposes. This is commonly implemented through AWS Control Tower service or AWS Control Tower landing zone solution.

Single region

You can share a single AWS Managed Microsoft AD with multiple AWS accounts within one AWS region. This capability makes it simpler and more cost-effective to manage Active-Directory–aware workloads from a single directory across accounts and Amazon Virtual Private Cloud (VPC). This option also allows you seamlessly join your EC2 instances for Windows to AWS Managed Microsoft AD.

As a best practice, place AWS Managed Microsoft AD in a separate AWS account, with limited administrator access but sharing the service with other AWS accounts. After sharing the service and configuring routing, Active Directory aware applications, such as Microsoft SharePoint, can seamlessly join Active Directory Domain Services and maintain control of all administrative tasks. Find more details on sharing AWS Managed Microsoft AD in the Share your AWS Managed AD directory tutorial.

Multi-region

With multiple AWS Accounts and multiple–AWS-regions model, we recommend using AWS Managed Microsoft AD Enterprise Edition. In Figure 3, AWS Managed Microsoft AD Enterprise Edition supports automating multi-region replication in all AWS regions where AWS Managed Microsoft AD is available. In AWS Managed Microsoft AD multi-region replication, Active-Directory–aware applications use the local directory for high performance but remain multi-region for high resiliency.

Multiple AWS accounts and multi-region model

Figure 3. Multiple AWS accounts and multi-region model

Domain Name System resolution design

To enable Active-Directory–aware applications communicate between your on-premises data centers and the AWS cloud, a reliable solution for Domain Name System (DNS) resolution is needed. You can set the Amazon VPC Dynamic Host Configuration Protocol (DHCP) option sets to either AWS Managed Microsoft AD or on-premises Active Directory; then, assign it to each VPC in which the required Active-Directory–aware applications reside. The full list of options working with DHCP option sets is described in Amazon Virtual Private Cloud User Guide.

The benefit of configuring DHCP option sets is to allow any EC2 instances in that VPC to resolve their domain names by pointing to the specified domain and DNS servers. This prevents the need for manual configuration of DNS on EC2 instances. However, because DHCP option sets cannot be shared across AWS accounts, this requires a DHCP option sets also to be created in additional accounts.

DHCP option sets

Figure 4. DHCP option sets

An alternative option is creating an Amazon Route 53 Resolver. This allows customers to leverage Amazon-provided DNS and Route 53 Resolver endpoints to forward a DNS query to the on-premises Active Directory or AWS Managed Microsoft AD. This is ideal for multi-account setups and customers desiring hub/spoke DNS management.

This alternative solution replaces the need to create and manage EC2 instances running as DNS forwarders with a managed and scalable solution, as Route 53 Resolver forwarding rules can be shared with other AWS accounts. Figure 5 demonstrates a Route 53 resolver forwarding a DNS query to on-premises Active Directory.

Route 53 Resolver

Figure 5. Route 53 Resolver

Conclusion

In this post, we described the benefits of using AWS Managed Microsoft AD to integrate with on-premises Active Directory. We also discussed a range of design considerations to explore when architecting hybrid Active Directory service with AWS Managed Microsoft AD. Different design scenarios were reviewed, from a single AWS account and region, to multiple AWS accounts and multi-regions. We have also discussed choosing between the Amazon VPC DHCP option sets and Route 53 Resolver for DNS resolution.

Further reading

AWS Week in Review – May 9, 2022

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-week-in-review-may-9-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Another week starts, and here’s a collection of the most significant AWS news from the previous seven days. This week is also the one-year anniversary of CloudFront Functions. It’s exciting to see what customers have built during this first year.

Last Week’s Launches
Here are some launches that caught my attention last week:

Amazon RDS supports PostgreSQL 14 with three levels of cascaded read replicas – That’s 5 replicas per instance, supporting a maximum of 155 read replicas per source instance with up to 30X more read capacity. You can now build a more robust disaster recovery architecture with the capability to create Single-AZ or Multi-AZ cascaded read replica DB instances in same or cross Region.

Amazon RDS on AWS Outposts storage auto scalingAWS Outposts extends AWS infrastructure, services, APIs, and tools to virtually any datacenter. With Amazon RDS on AWS Outposts, you can deploy managed DB instances in your on-premises environments. Now, you can turn on storage auto scaling when you create or modify DB instances by selecting a checkbox and specifying the maximum database storage size.

Amazon CodeGuru Reviewer suppression of files and folders in code reviews – With CodeGuru Reviewer, you can use automated reasoning and machine learning to detect potential code defects that are difficult to find and get suggestions for improvements. Now, you can prevent CodeGuru Reviewer from generating unwanted findings on certain files like test files, autogenerated files, or files that have not been recently updated.

Amazon EKS console now supports all standard Kubernetes resources to simplify cluster management – To make it easy to visualize and troubleshoot your applications, you can now use the console to see all standard Kubernetes API resource types (such as service resources, configuration and storage resources, authorization resources, policy resources, and more) running on your Amazon EKS cluster. More info in the blog post Introducing Kubernetes Resource View in Amazon EKS console.

AWS AppConfig feature flag Lambda Extension support for Arm/Graviton2 processors – Using AWS AppConfig, you can create feature flags or other dynamic configuration and safely deploy updates. The AWS AppConfig Lambda Extension allows you to access this feature flag and dynamic configuration data in your Lambda functions. You can now use the AWS AppConfig Lambda Extension from Lambda functions using the Arm/Graviton2 architecture.

AWS Serverless Application Model (SAM) CLI now supports enabling AWS X-Ray tracing – With the AWS SAM CLI you can initialize, build, package, test on local and cloud, and deploy serverless applications. With AWS X-Ray, you have an end-to-end view of requests as they travel through your application, making them easier to monitor and troubleshoot. Now, you can enable tracing by simply adding a flag to the sam init command.

Amazon Kinesis Video Streams image extraction – With Amazon Kinesis Video Streams you can capture, process, and store media streams. Now, you can also request images via API calls or configure automatic image generation based on metadata tags in ingested video. For example, you can use this to generate thumbnails for playback applications or to have more data for your machine learning pipelines.

AWS GameKit supports Android, iOS, and MacOS games developed with Unreal Engine – With AWS GameKit, you can build AWS-powered game features directly from the Unreal Editor with just a few clicks. Now, the AWS GameKit plugin for Unreal Engine supports building games for the Win64, MacOS, Android, and iOS platforms.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates you might have missed:

🎂 One-year anniversary of CloudFront Functions – I can’t believe it’s been one year since we launched CloudFront Functions. Now, we have tens of thousands of developers actively using CloudFront Functions, with trillions of invocations per month. You can use CloudFront Functions for HTTP header manipulation, URL rewrites and redirects, cache key manipulations/normalization, access authorization, and more. See some examples in this repo. Let’s see what customers built with CloudFront Functions:

  • CloudFront Functions enables Formula 1 to authenticate users with more than 500K requests per second. The solution is using CloudFront Functions to evaluate if users have access to view the race livestream by validating a token in the request.
  • Cloudinary is a media management company that helps its customers deliver content such as videos and images to users worldwide. For them, [email protected] remains an excellent solution for applications that require heavy compute operations, but lightweight operations that require high scalability can now be run using CloudFront Functions. With CloudFront Functions, Cloudinary and its customers are seeing significantly increased performance. For example, one of Cloudinary’s customers began using CloudFront Functions, and in about two weeks it was seeing 20–30 percent better response times. The customer also estimates that they will see 75 percent cost savings.
  • Based in Japan, DigitalCube is a web hosting provider for WordPress websites. Previously, DigitalCube spent several hours completing each of its update deployments. Now, they can deploy updates across thousands of distributions quickly. Using CloudFront Functions, they’ve reduced update deployment times from 4 hours to 2 minutes. In addition, faster updates and less maintenance work result in better quality throughout DigitalCube’s offerings. It’s now easier for them to test on AWS because they can run tests that affect thousands of distributions without having to scale internally or introduce downtime.
  • Amazon.com is using CloudFront Functions to change the way it delivers static assets to customers globally. CloudFront Functions allows them to experiment with hyper-personalization at scale and optimal latency performance. They have been working closely with the CloudFront team during product development, and they like how it is easy to create, test, and deploy custom code and implement business logic at the edge.

AWS open-source news and updates – A newsletter curated by my colleague Ricardo to bring you the latest open-source projects, posts, events, and more. Read the latest edition here.

Reduce log-storage costs by automating retention settings in Amazon CloudWatch – By default, CloudWatch Logs stores your log data indefinitely. This blog post shows how you can reduce log-storage costs by establishing a log-retention policy and applying it across all of your log groups.

Observability for AWS App Runner VPC networking – With X-Ray support in App runner, you can quickly deploy web applications and APIs at any scale and take advantage of adding tracing without having to manage sidecars or agents. Here’s an example of how you can instrument your applications with the AWS Distro for OpenTelemetry (ADOT).

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

You can now register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.

That’s all from me for this week. Come back next Monday for another Week in Review!

Danilo

AWS Week in Review – May 2, 2022

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/aws-week-in-review-may-2-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Wow, May already! Here in the Pacific Northwest, spring is in full bloom and nature has emerged completely from her winter slumbers. It feels that way here at AWS, too, with a burst of new releases and updates and our in-person summits and other events now in full flow. Two weeks ago, we had the San Francisco summit; last week, we held the London summit and also our .NET Enterprise Developer Day virtual event in EMEA. This week we have the Madrid summit, with more summits and events to come in the weeks ahead. Be sure to check the events section at the end of this post for a summary and registration links.

Last week’s launches
Here are some of the launches and updates last week that caught my eye:

If you’re looking to reduce or eliminate the operational overhead of managing your Apache Kafka clusters, then the general availability of Amazon Managed Streaming for Apache Kafka (MSK) Serverless will be of interest. Starting with the original release of Amazon MSK in 2019, the work needed to set up, scale, and manage Apache Kafka has been reduced, requiring just minutes to create a cluster. With Amazon MSK Serverless, the provisioning, scaling, and management of the required resources is automated, eliminating the undifferentiated heavy-lift. As my colleague Marcia notes in her blog post, Amazon MSK Serverless is a perfect solution when getting started with a new Apache Kafka workload where you don’t know how much capacity you will need or your applications produce unpredictable or highly variable throughput and you don’t want to pay for idle capacity.

Another week, another set of Amazon Elastic Compute Cloud (Amazon EC2) instances! This time around, it’s new storage-optimized I4i instances based on the latest generation Intel Xeon Scalable (Ice Lake) Processors. These new instances are ideal for workloads that need minimal latency, and fast access to data held on local storage. Examples of these workloads include transactional databases such as MySQL, Oracle DB, and Microsoft SQL Server, as well as NoSQL databases including MongoDB, Couchbase, Aerospike, and Redis. Additionally, workloads that benefit from very high compute performance per TB of storage (for example, data analytics and search engines) are also an ideal target for these instance types, which offer up to 30 TB of AWS Nitro SSD storage.

Deploying AWS compute and storage services within telecommunications providers’ data centers, at the edge of the 5G networks, opens up interesting new possibilities for applications requiring end-to-end low latency (for example, delivery of high-resolution and high-fidelity live video streaming, and improved augmented/virtual reality (AR/VR) experiences). The first AWS Wavelength deployments started in the US in 2020, and have expanded to additional countries since. This week we announced the opening of the first Canadian AWS Wavelength zone, in Toronto.

Other AWS News
Some other launches and news items you may have missed:

Amazon Relational Database Service (RDS) had a busy week. I don’t have room to list them all, so below is just a subset of updates!

  • The addition of IPv6 support enables customers to simplify their networking stack. The increase in address space offered by IPv6 removes the need to manage overlapping address spaces in your Amazon Virtual Private Cloud (VPC)s. IPv6 addressing can be enabled on both new and existing RDS instances.
  • Customers in the Asia Pacific (Sydney) and Asia Pacific (Singapore) Regions now have the option to use Multi-AZ deployments to provide enhanced availability and durability for Amazon RDS DB instances, offering one primary and two readable standby database instances spanning three Availability Zones (AZs). These deployments benefit from up to 2x faster transaction commit latency, and automated fail overs, typically under 35 seconds.
  • Amazon RDS PostgreSQL users can now choose from General-Purpose M6i and Memory-Optimized R6i instance types. Both of these sixth-generation instance types are AWS Nitro System-based, delivering practically all of the compute and memory resources of the host hardware to your instances.
  • Applications using RDS Data API can now elect to receive SQL results as a simplified JSON string, making it easier to deserialize results to an object. Previously, the API returned a JSON string as an array of data type and value pairs, which required developers to write custom code to parse the response and extract the values, so as to translate the JSON string into an object. Applications that use the API to receive the previous JSON format are still supported and will continue to work unchanged.

Applications using Amazon Interactive Video Service (IVS), offering low-latency interactive video experiences, can now add a livestream chat feature, complete with built-in moderation, to help foster community participation in livestreams using Q&A discussions. The new chat support provides chat room resource management and a messaging API for sending, receiving, and moderating chat messages.

Amazon Polly now offers a new Neural Text-to-Speech (TTS) voice, Vitória, for Brazilian Portuguese. The original Vitória voice, dating back to 2016, used standard technology. The new voice offers a more natural-sounding rhythm, intonation, and sound articulation. In addition to Vitória, Polly also offers a second Brazilian Portuguese neural voice, Camila.

Finally, if you’re a .NET developer who’s modernizing .NET Framework applications to run in the cloud, then the announcement that the open-source CoreWCF project has reached its 1.0 release milestone may be of interest. AWS is a major contributor to the project, a port of Windows Communication Foundation (WCF), to run on modern cross-platform .NET versions (.NET Core 3.1, or .NET 5 or higher). This project benefits all .NET developers working on WCF applications, not just those on AWS. You can read more about the project in my blog post from last year, where I spoke with one of the contributing AWS developers. Congratulations to all concerned on reaching the 1.0 milestone!

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Upcoming AWS Events
As I mentioned earlier, the AWS Summits are in full flow, with some some virtual and in-person events in the very near future you may want to check out:

I’m also happy to share that I’ll be joining the AWS on Air crew at AWS Summit Washington, DC. This in-person event is coming up May 23–25. Be sure to tune in to the livestream for all the latest news from the event, and if you’re there in person feel free to come say hi!

Registration is also now open for re:MARS, our conference for topics related to machine learning, automation, robotics, and space. The conference will be in-person in Las Vegas, June 21–24.

That’s all the news I have room for this week — check back next Monday for another week in review!

— Steve

Amazon Aurora Serverless v2 is Generally Available: Instant Scaling for Demanding Workloads

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/amazon-aurora-serverless-v2-is-generally-available-instant-scaling-for-demanding-workloads/

Today we are very excited to announce that Amazon Aurora Serverless v2 is generally available for both Aurora PostgreSQL and MySQL. Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora that allows your database to scale capacity up or down based on your application’s needs.

Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud. It is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administrative tasks, such as hardware provisioning, database setup, patches, and backups.

One of the key features of Amazon Aurora is the separation of compute and storage. As a result, they scale independently. Amazon Aurora storage automatically scales as the amount of data in your database increases. For example, you can store lots of data, and if one day you decide to drop most of the data, the storage provisioned adjusts.

How Amazon Aurora works - compute and storage separation
However, many customers said that they need the same flexibility in the compute layer of Amazon Aurora since most database workloads don’t need a constant amount of compute. Workloads can be spiky, infrequent, or have predictable spikes over a period of time.

To serve these kinds of workloads, you need to provision for the peak capacity you expect your database will need. However, this approach is expensive as database workloads rarely run at peak capacity. To provision the right amount of compute, you need to continuously monitor the database capacity consumption and scale up resources if consumption is high. However, this requires expertise and often incurs downtime.

To solve this problem, in 2018, we launched the first version of Amazon Aurora Serverless. Since its launch, thousands of customers have used Amazon Aurora Serverless as a cost-effective option for infrequent, intermittent, and unpredictable workloads.

Today, we are making the next version of Amazon Aurora Serverless generally available, which enables customers to run even the most demanding workload on serverless with instant and nondisruptive scaling, fine-grained capacity adjustments, and additional functionality, including read replicas, Multi-AZ deployments, and Amazon Aurora Global Database.

Aurora Serverless v2 is launching with the latest major versions available on Amazon Aurora. Versions supported: Aurora PostgreSQL-compatible edition with PostgreSQL 13 and Aurora MySQL-compatible edition with MySQL 8.0.

Main features of Aurora Serverless v2
Aurora Serverless v2 enables you to scale your database to hundreds of thousands of transactions per second and cost-effectively manage the most demanding workloads. It scales database capacity in fine-grained increments to closely match the needs of your workload without disrupting connections or transactions. In addition, you pay only for the exact capacity you consume, and you can save up to 90 percent compared to provisioning for peak load.

If you have an existing Amazon Aurora cluster, you can create an Aurora Serverless v2 instance within the same cluster. This way, you’ll have a mixed configuration cluster where both provisioned and Aurora Serverless v2 instances can coexist within the same cluster.

It supports the full breadth of Amazon Aurora features. For example, you can create up to 15 Amazon Aurora read replicas deployed across multiple Availability Zones. Any number of these read replicas can be Aurora Serverless v2 instances and can be used as failover targets for high availability or for scaling read operations.

Similarly, with Global Database, you can assign any of the instances to be Aurora Serverless v2 and only pay for minimum capacity when idling. These instances in secondary Regions can also scale independently to support varying workloads across different Regions. Check out the Amazon Aurora user guide for a comprehensive list of features.

Aurora Serverless compute and storage scaling

How Aurora Serverless v2 scaling works
Aurora Serverless v2 scales instantly and nondisruptively by growing the capacity of the underlying instance in place by adding more CPU and memory resources. This technique allows for the underlying instance to increase and decrease capacity in place without failing over to a new instance for scaling.

For scaling down, Aurora Serverless v2 takes a more conservative approach. It scales down in steps until it reaches the required capacity needed for the workload. Scaling down too quickly can prematurely evict cached pages and decrease the buffer pool, which may affect the performance.

Aurora Serverless capacity is measured in Aurora capacity units (ACUs). Each ACU is a combination of approximately 2 gibibytes (GiB) of memory, corresponding CPU, and networking. With Aurora Serverless v2, your starting capacity can be as small as 0.5 ACU, and the maximum capacity supported is 128 ACU. In addition, it supports fine-grained increments as small as 0.5 ACU which allows your database capacity to closely match the workload needs.

Aurora Serverless v2 scaling in action
To show Aurora Serverless v2 in action, we are going to simulate a flash sale. Imagine that you run an e-commerce site. You run a marketing campaign where customers can purchase items 50 percent off for a limited amount of time. You are expecting a spike in traffic on your site for the duration of the sale.

When you use a traditional database, if you run those marketing campaigns regularly, you need to provision for the peak load you expect. Or, if you run them now and then, you need to reconfigure your database for the expected peak of traffic during the sale. In both cases, you are limited to your assumption of the capacity you need. What happens if you have more sales than you expected? If your database cannot keep up with the demand, it may cause service degradation. Or when your marketing campaign doesn’t produce the sales you expected? You are unnecessarily paying for capacity you don’t need.

For this demo, we use Aurora Serverless v2 as the transactional database. An AWS Lambda function is used to call the database and process orders during the sale event for the e-commerce site. The Lambda function and the database are in the same Amazon Virtual Private Cloud (VPC), and the function connects directly to the database to perform all the operations.

To simulate the traffic of a flash sale, we will use an open-source load testing framework called Artillery. It will allow us to generate varying load by invoking multiple Lambda functions. For example, we can start with a small load and then increase it rapidly to observe how the database capacity adjusts based on the workload. This Artillery load test runs on an Amazon Elastic Compute Cloud (Amazon EC2) instance inside the same VPC.

Architecture diagram
The following Amazon CloudWatch dashboard shows how the database capacity behaves when the order count increases. The dashboard shows the orders placed in blue and the current database capacity in orange.

At the beginning of the sale, the Aurora Serverless v2 database starts with a capacity of 5 ACUs, which was the minimum database capacity configured. For the first few minutes, the orders increase, but the database capacity doesn’t increase right away. The database can handle the load with the starting provisioned capacity.

However, around the time 15:55, the number of orders spikes to 12,000. As a result, the database increases the capacity to 14 ACUs. The database capacity increases in milliseconds, adjusting exactly to the load.

The number of orders placed stays up for some seconds, and then it goes dramatically down by 15:58. However, the database capacity doesn’t adjust exactly to the drop in traffic. Instead, it decreases in steps until it reaches 5 ACUs. The scaling down is done more conservatively to avoid prematurely evicting cached pages and affecting performance. This is done to prevent any unnecessary latency to spiky workloads, and also so the caches and buffer pools are not aggressively purged.

Cloudwatch dashboard

Get started with Aurora Serverless v2 with an existing Amazon Aurora cluster
If you already have an Amazon Aurora cluster and you want to try Aurora Serverless v2, the fastest way to get started is by using mixed configuration clusters that contain both serverless and provisioned instances. Start by adding a new reader into the existing cluster. Configure the reader instance to be of the type Serverless v2.

Adding a serverless reader

Test the new serverless instance with your workload. Once you have confirmation that it works as expected, you can start a failover to the serverless instance, which will take less than 30 seconds to finish. This option provides a minimal downtime experience to get started with Aurora Serverless v2.

Failover to the serverless instance

How to create a new Aurora Serverless v2 database
To get started with Aurora Serverless v2, create a new database from the RDS console. The first step is to pick the engine type: Amazon Aurora. Then, pick which database engine you want it to be compatible with: MySQL or PostgreSQL. Open the filters under Engine version and select the filter Show versions that support Serverless v2. Then, you see that the Available versions dropdown list only shows options that are supported by Aurora Serverless v2.

Engine options
Next, you need to set up the database. Specify credential settings with a username and password for the administrator of the database.

Database settings
Then, configure the instance for the database. You need to select what kind of instance class you want. This allocates the computational, network, and memory capacity for the database instance. Select Serverless.

Then, you need to define the capacity range. Aurora Serverless v2 capacity scales up and down within the minimum and maximum configuration. Here you can specify the minimum and maximum database capacity for your workload. The minimum capacity you can specify is 0.5 ACUs, and the maximum is 128 ACUs. For more information on Aurora Serverless v2 capacity units, see the Instant autoscaling documentation.

Capacity configuration
Next, configure connectivity by creating a new VPC and security group or use the default. Finally, select Create database.

Connectivity configuration

Creating the database takes a couple of minutes. You know your database is ready when the status switches to Available.

Database list

You will find the connection details for the database on the database page. The endpoint and the port, combined with the user name and password for the administrator, are all you need to connect to your new Aurora Serverless v2 database.

Database details page

Available Now!
Aurora Serverless v2 is available now in US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), and South America (São Paulo).

Visit the Amazon Aurora Serverless v2 page for more information about this launch.

Marcia

New Amazon RDS for MySQL & PostgreSQL Multi-AZ Deployment Option: Improved Write Performance & Faster Failover

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-rds-multi-az-db-cluster/

Today, we are announcing a new Amazon Relational Database Service (RDS) Multi-AZ deployment option with up to 2x faster transaction commit latency, automated failovers typically under 35 seconds, and readable standby instances.

Amazon RDS offers two replication options to enhance availability and performance:

  • Multi-AZ deployments gives high availability and automatic failover. Amazon RDS creates a storage-level replica of the database in a second Availability Zone. It then synchronously replicates data from the primary to the standby DB instance for high availability. The primary DB instance serves application requests, while the standby DB instance remains ready to take over in case of a failure. Amazon RDS manages all aspects of failure detection, failover, and repair actions so the applications using the database can be highly available.
  • Read replicas allow applications to scale their read operations across multiple database instances. The database engine replicates data asynchronously to the read replicas. The application sends the write requests (INSERT, UPDATE, and DELETE) to the primary database, and read requests (SELECT) can be load balanced across read replicas. In case of failure of the primary node, you can manually promote a read replica to become the new primary database.

Multi-AZ deployments and read replicas serve different purposes. Multi-AZ deployments give your application high availability, durability, and automatic failover. Read replicas give your applications read scalability.

But what about applications that require both high availability with automatic failover and read scalability?

Introducing the New Amazon RDS Multi-AZ Deployment Option With Two Readable Standby Instances.
Starting today, we’re adding a new option to deploy RDS databases. This option combines automatic failover and read replicas: Amazon RDS Multi-AZ with two readable standby instances. This deployment option is available for MySQL and PostgreSQL databases. This is a database cluster with one primary and two readable standby instances. It provides up to 2x faster transaction commit latency and automated failovers, typically under 35 seconds.

The following diagram illustrates such a deployment:

Three AZ RDS databases

When the new Multi-AZ DB cluster deployment option is enabled, RDS configures a primary database and two read replicas in three distinct Availability Zones. It then monitors and enables failover in case of failure of the primary node.

Just like with traditional read replicas, the database engine replicates data between the primary node and the read replicas. And just like with the Multi-AZ one standby deployment option, RDS automatically detects and manages failover for high availability.

You do not have to choose between high availability or scalability; Multi-AZ DB cluster with two readable standby enables both.

What Are the Benefits?
This new deployment option offers you four benefits over traditional multi-AZ deployments: improved commit latency, faster failover, readable standby instances, and optimized replications.

First, write operations are faster when using Multi-AZ DB cluster. The new Multi-AZ DB cluster instances leverage M6gd and R6gd instance types. These instances are powered by AWS Graviton2 processors. They are equipped with fast NVMe SSD for local storage, ideal for high speed and low-latency storage. They deliver up to 40 percent better price performance and 50 percent more local storage GB per vCPU over comparable x86-based instances.

Multi-AZ DB instances use Amazon Elastic Block Store (EBS) to store the data and the transaction log. The new Multi-AZ DB cluster instances use local storage provided by the instances to store the transaction log. Local storage is optimized to deliver low-latency, high I/O operations per second (IOPS) to applications. Write operations are first written to the local storage transaction log, then flushed to permanent storage on database storage volumes.

Second, failover operations are typically faster than in the Multi-AZ DB instance scenario. The read replicas created by the new Multi-AZ DB cluster are full-fledged database instances. The system is designed to fail over as quickly as 35 seconds, plus the time to apply any pending transaction log. In case of failover, the system is fully automated to promote a new primary and reconfigure the old primary as a new reader instance.

Third, the two standby instances are hot standbys. Your applications may use the cluster reader endpoint to send their read requests (SELECT) to these standby instances. It allows your application to spread the database read load equally between the instances of the database cluster.

And finally, leveraging local storage for transaction log optimizes replication. The existing Multi-AZ DB instance replicates all changes at storage-level. The new Multi-AZ DB cluster replicates only the transaction log and uses a quorum mechanism to confirm at least one standby acknowledged the change. Database transactions are committed synchronously when one of the secondary instances confirms the transaction log is written on its local disk.

Migrating Existing Databases
For those of you having existing RDS databases and willing to take advantage of this new Multi-AZ DB cluster deployment option, you may take a snapshot of your database to create a storage-level backup of your existing database instance. Once the snapshot is ready, you can create a new database cluster, with Multi-AZ DB cluster deployment option, based on this snapshot. Your new Multi-AZ DB cluster will be a perfect copy of your existing database.

Let’s See It in Action
To get started, I point my browser to the AWS Management Console and navigate to RDS. The Multi-AZ DB cluster deployment option is available for MySQL version 8.0.28 or later and PostgreSQL version 13.4 R1 and 13.5 R1. I select either database engine, and I ensure the version matches the minimum requirements. The rest of the procedure is the same as a standard Amazon RDS database launch.

Under Deployment options, I select PostgreSQL, version 13.4 R1, and under Availability and Durability, I select Multi-AZ DB cluster.

Three AZ RDS launch console

If required, I may choose the set of Availability Zones RDS uses for the cluster. To do so, I create a DB subnet group and assign the cluster to this subnet group.

Once launched, I verify that three DB instances have been created. I also take note of the two endpoints provided by Amazon RDS: the primary endpoint and one load-balanced endpoint for the two readable standby instances.

RDS Three AZ list of instances

To test the new cluster, I create an Amazon Linux 2 EC2 instance in the same VPC, within the same security group as the database, and I make sure I attach an IAM role containing the AmazonSSMManagedInstanceCore managed policy. This allows me to connect to the instance using SSM instead of SSH.

Once the instance is started, I use SSM to connect to the instance. I install PostgreSQL client tools.

sudo amazon-linux-extras enable postgresql13
sudo yum clean metadata
sudo yum install postgresql

I connect to the primary DB. I create a table and INSERT a record.

psql -h awsnewsblog.cluster-c1234567890r.us-east-1.rds.amazonaws.com -U postgres

postgres=> create table awsnewsblogdemo (id int primary key, name varchar);
CREATE TABLE

postgres=> insert into awsnewsblogdemo (id,name) values (1, 'seb');
INSERT 0 1

postgres=> exit

To verify the replication works as expected, I connect to the read-only replica. Notice the -ro- in the endpoint name. I check the table structure and enter a SELECT statement to confirm the data have been replicated.

psql -h awsnewsblog.cluster-ro-c1234567890r.us-east-1.rds.amazonaws.com -U postgres

postgres=> \dt

              List of relations
 Schema |      Name       | Type  |  Owner
--------+-----------------+-------+----------
 public | awsnewsblogdemo | table | postgres
(1 row)

postgres=> select * from awsnewsblogdemo;
 id | name
----+------
  1 | seb
(1 row)

postgres=> exit

In the scenario of a failover, the application will be disconnected from the primary database instance. In that case, it is important that your application-level code try to reestablish network connection. After a short period of time, the DNS name of the endpoint will point to the standby instance, and your application will be able to reconnect.

To learn more about Multi-AZ DB clusters, you can refer to our documentation.

Pricing and Availability
Amazon RDS Multi-AZ deployments with two readable standbys is generally available in the following Regions: US East (N. Virginia), US West (Oregon), and Europe (Ireland). We will add more regions to this list.

You can use it with MySQL version 8.0.28 or later, or PostgreSQL version 13.4 R1 or 13.5 R1.

Pricing depends on the instance type. In US regions, on-demand pricing starts at $0.522 per hour for M6gd instances and $0.722 per hour for R6gd instances. As usual, the Amazon RDS pricing page has the details for MySQL and PostgreSQL.

You can start to use it today.

How ENGIE scales their data ingestion pipelines using Amazon MWAA

Post Syndicated from Anouar Zaaber original https://aws.amazon.com/blogs/big-data/how-engie-scales-their-data-ingestion-pipelines-using-amazon-mwaa/

ENGIE—one of the largest utility providers in France and a global player in the zero-carbon energy transition—produces, transports, and deals electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated lots of data, and it required a smarter, unique approach and solution to align its initiatives and provide data that is ingestible, organizable, governable, sharable, and actionable across its global business units.

In 2018, the company’s business leadership decided to accelerate its digital transformation through data and innovation by becoming a data-driven company. Yves Le Gélard, chief digital officer at ENGIE, explains the company’s purpose: “Sustainability for ENGIE is the alpha and the omega of everything. This is our raison d’être. We help large corporations and the biggest cities on earth in their attempts to transition to zero carbon as quickly as possible because it is actually the number one question for humanity today.”

ENGIE, as with any other big enterprise, is using multiple extract, transform, and load (ETL) tools to ingest data into their data lake on AWS. Nevertheless, they usually have expensive licensing plans. “The company needed a uniform method of collecting and analyzing data to help customers manage their value chains,” says Gregory Wolowiec, the Chief Technology Officer who leads ENGIE’s data program. ENGIE wanted a free-license application, well integrated with multiple technologies and with a continuous integration, continuous delivery (CI/CD) pipeline to more easily scale all their ingestion process.

ENGIE started using Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to solve this issue and started moving various data sources from on-premise applications and ERPs, AWS services like Amazon Redshift, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, external services like Salesforce, and other cloud providers to a centralized data lake on top of Amazon Simple Storage Service (Amazon S3).

Amazon MWAA is used in particular to collect and store harmonized operational and corporate data from different on-premises and software as a service (SaaS) data sources into a centralized data lake. The purpose of this data lake is to create a “group performance cockpit” that enables an efficient, data-driven analysis and thoughtful decision-making by the Engie Management board.

In this post, we share how ENGIE created a CI/CD pipeline for an Amazon MWAA project template using an AWS CodeCommit repository and plugged it into AWS CodePipeline to build, test, and package the code and custom plugins. In this use case, we developed a custom plugin to ingest data from Salesforce based on the Airflow Salesforce open-source plugin.

Solution overview

The following diagrams illustrate the solution architecture defining the implemented Amazon MWAA environment and its associated pipelines. It also describes the customer use case for Salesforce data ingestion into Amazon S3.

The following diagram shows the architecture of the deployed Amazon MWAA environment and the implemented pipelines.

The preceding architecture is fully deployed via infrastructure as code (IaC). The implementation includes the following:

  • Amazon MWAA environment – A customizable Amazon MWAA environment packaged with plugins and requirements and configured in a secure manner.
  • Provisioning pipeline – The admin team can manage the Amazon MWAA environment using the included CI/CD provisioning pipeline. This pipeline includes a CodeCommit repository plugged into CodePipeline to continuously update the environment and its plugins and requirements.
  • Project pipeline – This CI/CD pipeline comes with a CodeCommit repository that triggers CodePipeline to continuously build, test and deploy DAGs developed by users. Once deployed, these DAGs are made available in the Amazon MWAA environment.

The following diagram shows the data ingestion workflow, which includes the following steps:

  1. The DAG is triggered by Amazon MWAA manually or based on a schedule.
  2. Amazon MWAA initiates data collection parameters and calculates batches.
  3. Amazon MWAA distributes processing tasks among its workers.
  4. Data is retrieved from Salesforce in batches.
  5. Amazon MWAA assumes an AWS Identity and Access Management (IAM) role with the necessary permissions to store the collected data into the target S3 bucket.

This AWS Cloud Development Kit (AWS CDK) construct is implemented with the following security best practices:

  • With the principle of least privilege, you grant permissions to only the resources or actions that users need to perform tasks.
  • S3 buckets are deployed with security compliance rules: encryption, versioning, and blocking public access.
  • Authentication and authorization management is handled with AWS Single Sign-On (AWS SSO).
  • Airflow stores connections to external sources in a secure manner either in Airflow’s default secrets backend or an alternative secrets backend such as AWS Secrets Manager or AWS Systems Manager Parameter Store.

For this post, we step through a use case using the data from Salesforce to ingest it into an ENGIE data lake in order to transform it and build business reports.

Prerequisites for deployment

For this walkthrough, the following are prerequisites:

  • Basic knowledge of the Linux operating system
  • Access to an AWS account with administrator or power user (or equivalent) IAM role policies attached
  • Access to a shell environment or optionally with AWS CloudShell

Deploy the solution

To deploy and run the solution, complete the following steps:

  1. Install AWS CDK.
  2. Bootstrap your AWS account.
  3. Define your AWS CDK environment variables.
  4. Deploy the stack.

Install AWS CDK

The described solution is fully deployed with AWS CDK.

AWS CDK is an open-source software development framework to model and provision your cloud application resources using familiar programming languages. If you want to familiarize yourself with AWS CDK, the AWS CDK Workshop is a great place to start.

Install AWS CDK using the following commands:

npm install -g aws-cdk
# To check the installation
cdk --version

Bootstrap your AWS account

First, you need to make sure the environment where you’re planning to deploy the solution to has been bootstrapped. You only need to do this one time per environment where you want to deploy AWS CDK applications. If you’re unsure whether your environment has been bootstrapped already, you can always run the command again:

cdk bootstrap aws://YOUR_ACCOUNT_ID/YOUR_REGION

Define your AWS CDK environment variables

On Linux or MacOS, define your environment variables with the following code:

export CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID
export CDK_DEFAULT_REGION=YOUR_REGION

On Windows, use the following code:

setx CDK_DEFAULT_ACCOUNT YOUR_ACCOUNT_ID
setx CDK_DEFAULT_REGION YOUR_REGION

Deploy the stack

By default, the stack deploys a basic Amazon MWAA environment with the associated pipelines described previously. It creates a new VPC in order to host the Amazon MWAA resources.

The stack can be customized using the parameters listed in the following table.

To pass a parameter to the construct, you can use the AWS CDK runtime context. If you intend to customize your environment with multiple parameters, we recommend using the cdk.json context file with version control to avoid unexpected changes to your deployments. Throughout our example, we pass only one parameter to the construct. Therefore, for the simplicity of the tutorial, we use the the --context or -c option to the cdk command, as in the following example:

cdk deploy -c paramName=paramValue -c paramName=paramValue ...
Parameter Description Default Valid values
vpcId VPC ID where the cluster is deployed. If none, creates a new one and needs the parameter cidr in that case. None VPC ID
cidr The CIDR for the VPC that is created to host Amazon MWAA resources. Used only if the vpcId is not defined. 172.31.0.0/16 IP CIDR
subnetIds Comma-separated list of subnets IDs where the cluster is deployed. If none, looks for private subnets in the same Availability Zone. None Subnet IDs list (coma separated)
envName Amazon MWAA environment name MwaaEnvironment String
envTags Amazon MWAA environment tags None See the following JSON example: '{"Environment":"MyEnv", "Application":"MyApp", "Reason":"Airflow"}'
environmentClass Amazon MWAA environment class mw1.small mw1.small, mw1.medium, mw1.large
maxWorkers Amazon MWAA maximum workers 1 int
webserverAccessMode Amazon MWAA environment access mode (private or public) PUBLIC_ONLY PUBLIC_ONLY, PRIVATE_ONLY
secretsBackend Amazon MWAA environment secrets backend Airflow SecretsManager

Clone the GitHub repository:

git clone https://github.com/aws-samples/cdk-amazon-mwaa-cicd

Deploy the stack using the following command:

cd mwaairflow && \
pip install . && \
cdk synth && \
cdk deploy -c vpcId=YOUR_VPC_ID

The following screenshot shows the stack deployment:

The following screenshot shows the deployed stack:

Create solution resources

For this walkthrough, you should have the following prerequisites:

If you don’t have a Salesforce account, you can create a SalesForce developer account:

  1. Sign up for a developer account.
  2. Copy the host from the email that you receive.
  3. Log in into your new Salesforce account
  4. Choose the profile icon, then Settings.
  5. Choose Reset my Security Token.
  6. Check your email and copy the security token that you receive.

After you complete these prerequisites, you’re ready to create the following resources:

  • An S3 bucket for Salesforce output data
  • An IAM role and IAM policy to write the Salesforce output data on Amazon S3
  • A Salesforce connection on the Airflow UI to be able to read from Salesforce
  • An AWS connection on the Airflow UI to be able to write on Amazon S3
  • An Airflow variable on the Airflow UI to store the name of the target S3 bucket

Create an S3 bucket for Salesforce output data

To create an output S3 bucket, complete the following steps:

  1. On the Amazon S3 console, choose Create bucket.

The Create bucket wizard opens.

  1. For Bucket name, enter a DNS-compliant name for your bucket, such as airflow-blog-post.
  2. For Region, choose the Region where you deployed your Amazon MWAA environment, for example, US East (N. Virginia) us-east-1.
  3. Choose Create bucket.

For more information, see Creating a bucket.

Create an IAM role and IAM policy to write the Salesforce output data on Amazon S3

In this step, we create an IAM policy that allows Amazon MWAA to write on your S3 bucket.

  1. On the IAM console, in the navigation pane, choose Policies.
  2. Choose Create policy.
  3. Choose the JSON tab.
  4. Enter the following JSON policy document, and replace airflow-blog-post with your bucket name:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket"],
          "Resource": ["arn:aws:s3:::airflow-blog-post"]
        },
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:DeleteObject"
          ],
          "Resource": ["arn:aws:s3:::airflow-blog-post/*"]
        }
      ]
    }

  5. Choose Next: Tags.
  6. Choose Next: Review.
  7. For Name, choose a name for your policy (for example, airflow_data_output_policy).
  8. Choose Create policy.

Let’s attach the IAM policy to a new IAM role that we use in our Airflow connections.

  1. On the IAM console, choose Roles in the navigation pane and then choose Create role.
  2. In the Or select a service to view its use cases section, choose S3.
  3. For Select your use case, choose S3.
  4. Search for the name of the IAM policy that we created in the previous step (airflow_data_output_role) and select the policy.
  5. Choose Next: Tags.
  6. Choose Next: Review.
  7. For Role name, choose a name for your role (airflow_data_output_role).
  8. Review the role and then choose Create role.

You’re redirected to the Roles section.

  1. In the search box, enter the name of the role that you created and choose it.
  2. Copy the role ARN to use later to create the AWS connection on Airflow.

Create a Salesforce connection on the Airflow UI to be able to read from Salesforce

To read data from Salesforce, we need to create a connection using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Connections, and then the plus sign to create a new connection.
  3. Fill in the fields with the required information.

The following table provides more information about each value.

Field Mandatory Description Values
Conn Id Yes Connection ID to define and to be used later in the DAG For example, salesforce_connection
Conn Type Yes Connection type HTTP
Host Yes Salesforce host name host-dev-ed.my.salesforce.com or host.lightning.force.com. Replace the host with your Salesforce host and don’t add the http:// as prefix.
Login Yes The Salesforce user name. The user must have read access to the salesforce objects. [email protected]
Password Yes The corresponding password for the defined user. MyPassword123
Port No Salesforce instance port. By default, 443. 443
Extra Yes Specify the extra parameters (as a JSON dictionary) that can be used in the Salesforce connection. security_token is the Salesforce security token for authentication. To get the Salesforce security token in your email, you must reset your security token. {"security_token":"AbCdE..."}

Create an AWS connection in the Airflow UI to be able to write on Amazon S3

An AWS connection is required to upload data into Amazon S3, so we need to create a connection using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Connections, and then choose the plus sign to create a new connection.
  3. Fill in the fields with the required information.

The following table provides more information about the fields.

Field Mandatory Description Value
Conn Id Yes Connection ID to define and to be used later in the DAG For example, aws_connection
Conn Type Yes Connection type Amazon Web Services
Extra Yes It is required to specify the Region. You also need to provide the role ARN that we created earlier.
{
"region":"eu-west-1",
"role_arn":"arn:aws:iam::123456789101:role/airflow_data_output_role "
}

Create an Airflow variable on the Airflow UI to store the name of the target S3 bucket

We create a variable to set the name of the target S3 bucket. This variable is used by the DAG. So, we need to create a variable using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Variables, then choose the plus sign to create a new variable.
  3. For Key, enter bucket_name.
  4. For Val, enter the name of the S3 bucket that you created in a previous step (airflow-blog-post).

Create and deploy a DAG in Amazon MWAA

To be able to ingest data from Salesforce into Amazon S3, we need to create a DAG (Directed Acyclic Graph). To create and deploy the DAG, complete the following steps:

  1. Create a local Python DAG.
  2. Deploy your DAG using the project CI/CD pipeline.
  3. Run your DAG on the Airflow UI.
  4. Display your data in Amazon S3 (with S3 Select).

Create a local Python DAG

The provided SalesForceToS3Operator allows you to ingest data from Salesforce objects to an S3 bucket. Refer to standard Salesforce objects for the full list of objects you can ingest data from with this Airflow operator.

In this use case, we ingest data from the Opportunity Salesforce object. We retrieve the last 6 months’ data in monthly batches and we filter on a specific list of fields.

The DAG provided in the sample in GitHub repository imports the last 6 months of the Opportunity object (one file by month) by filtering the list of retrieved fields.

This operator takes two connections as parameters:

  • An AWS connection that is used to upload ingested data into Amazon S3.
  • A Salesforce connection to read data from Salesforce.

The following table provides more information about the parameters.

Parameter Type Mandatory Description
sf_conn_id string Yes Name of the Airflow connection that has the following information:

  • user name
  • password
  • security token
sf_obj string Yes Name of the relevant Salesforce object (Account, Lead, Opportunity)
s3_conn_id string Yes The destination S3 connection ID
s3_bucket string Yes The destination S3 bucket
s3_key string Yes The destination S3 key
sf_fields string No The (optional) list of fields that you want to get from the object (Id, Name, and so on).
If none (the default), then this gets all fields for the object.
fmt string No The (optional) format that the S3 key of the data should be in.
Possible values include CSV (default), JSON, and NDJSON.
from_date date format No A specific date-time (optional) formatted input to run queries from for incremental ingestion.
Evaluated against the SystemModStamp attribute.
Not compatible with the query parameter and should be in date-time format (for example, 2021-01-01T00:00:00Z).
Default: None
to_date date format No A specific date-time (optional) formatted input to run queries to for incremental ingestion.
Evaluated against the SystemModStamp attribute.
Not compatible with the query parameter and should be in date-time format (for example, 2021-01-01T00:00:00Z).
Default: None
query string No A specific query (optional) to run for the given object.
This overrides default query creation.
Default: None
relationship_object string No Some queries require relationship objects to work, and these are not the same names as the Salesforce object.
Specify that relationship object here (optional).
Default: None
record_time_added boolean No Set this optional value to true if you want to add a Unix timestamp field to the resulting data that marks when the data was fetched from Salesforce.
Default: False
coerce_to_timestamp boolean No Set this optional value to true if you want to convert all fields with dates and datetimes into Unix timestamp (UTC).
Default: False

The first step is to import the operator in your DAG:

from operators.salesforce_to_s3_operator import SalesforceToS3Operator

Then define your DAG default ARGs, which you can use for your common task parameters:

# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
    'owner': '[email protected]',
    'depends_on_past': False,
    'start_date': days_ago(2),
    'retries': 0,
    'retry_delay': timedelta(minutes=1),
    'sf_conn_id': 'salesforce_connection',
    's3_conn_id': 'aws_connection',
    's3_bucket': 'salesforce-to-s3',
}
...

Finally, you define the tasks to use the operator.

The following examples illustrate some use cases.

Salesforce object full ingestion

This task ingests all the content of the Salesforce object defined in sf_obj. This selects all the object’s available fields and writes them into the defined format in fmt. See the following code:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...

Salesforce object partial ingestion based on fields

This task ingests specific fields of the Salesforce object defined in sf_obj. The selected fields are defined in the optional sf_fields parameter. See the following code:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    sf_fields=["Id","Name","Amount"],
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...

Salesforce object partial ingestion based on time period

This task ingests all the fields of the Salesforce object defined in sf_obj. The time period can be relative using from_date or to_date parameters or absolute by using both parameters.

The following example illustrates relative ingestion from the defined date:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    from_date="YESTERDAY",
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...

The from_date and to_date parameters support Salesforce date-time format. It can be either a specific date or literal (for example TODAY, LAST_WEEK, LAST_N_DAYS:5). For more information about date formats, see Date Formats and Date Literals.

For the full DAG, refer to the sample in GitHub repository.

This code dynamically generates tasks that run queries to retrieve the data of the Opportunity object in the form of 1-month batches.

The sf_fields parameter allows us to extract only the selected fields from the object.

Save the DAG locally as salesforce_to_s3.py.

Deploy your DAG using the project CI/CD pipeline

As part of the CDK deployment, a CodeCommit repository and CodePipeline pipeline were created in order to continuously build, test, and deploy DAGs into your Amazon MWAA environment.

To deploy the new DAG, the source code should be committed to the CodeCommit repository. This triggers a CodePipeline run that builds, tests, and deploys your new DAG and makes it available in your Amazon MWAA environment.

  1. Sign in to the CodeCommit console in your deployment Region.
  2. Under Source, choose Repositories.

You should see a new repository mwaaproject.

  1. Push your new DAG in the mwaaproject repository under dags. You can either use the CodeCommit console or the Git command line to do so:
    1. CodeCommit console:
      1. Choose the project CodeCommit repository name mwaaproject and navigate under dags.
      2. Choose Add file and then Upload file and upload your new DAG.
    2. Git command line:
      1. To be able to clone and access your CodeCommit project with the Git command line, make sure Git client is properly configured. Refer to Setting up for AWS CodeCommit.
      2. Clone the repository with the following command after replacing <region> with your project Region:
        git clone https://git-codecommit.<region>.amazonaws.com/v1/repos/mwaaproject

      3. Copy the DAG file under dags and add it with the command:
        git add dags/salesforce_to_s3.py

      4. Commit your new file with a message:
        git commit -m "add salesforce DAG"

      5. Push the local file to the CodeCommit repository:
        git push

The new commit triggers a new pipeline that builds, tests, and deploys the new DAG. You can monitor the pipeline on the CodePipeline console.

  1. On the CodePipeline console, choose Pipeline in the navigation pane.
  2. On the Pipelines page, you should see mwaaproject-pipeline.
  3. Choose the pipeline to display its details.

After checking that the pipeline run is successful, you can verify that the DAG is deployed to the S3 bucket and therefore available on the Amazon MWAA console.

  1. On the Amazon S3 console, look for a bucket starting with mwaairflowstack-mwaaenvstackne and go under dags.

You should see the new DAG.

  1. On the Amazon MWAA console, choose DAGs.

You should be able to see the new DAG.

Run your DAG on the Airflow UI

Go to the Airflow UI and toggle on the DAG.

This triggers your DAG automatically.

Later, you can continue manually triggering it by choosing the run icon.

Choose the DAG and Graph View to see the run of your DAG.

If you have any issue, you can check the logs of the failed tasks from the task instance context menu.

Display your data in Amazon S3 (with S3 Select)

To display your data, complete the following steps:

  1. On the Amazon S3 console, in the Buckets list, choose the name of the bucket that contains the output of the Salesforce data (airflow-blog-post).
  2. In the Objects list, choose the name of the folder that has the object that you copied from Salesforce (opportunity).
  3. Choose the raw folder and the dt folder with the latest timestamp.
  4. Select any file.
  5. On the Actions menu, choose Query with S3 Select.
  6. Choose Run SQL query to preview the data.

Clean up

To avoid incurring future charges, delete the AWS CloudFormation stack and the resources that you deployed as part of this post.

  1. On the AWS CloudFormation console, delete the stack MWAAirflowStack.

To clean up the deployed resources using the AWS Command Line Interface (AWS CLI), you can simply run the following command:

cdk destroy MWAAirflowStack

Make sure you are in the root path of the project when you run the command.

After confirming that you want to destroy the CloudFormation stack, the solution’s resources are deleted from your AWS account.

The following screenshot shows the process of deploying the stack:

The following screenshot confirms the stack is undeployed.

  1. Navigate to the Amazon S3 console and locate the two buckets containing mwaairflowstack-mwaaenvstack and mwaairflowstack-mwaaproj that were created during the deployment.
  2. Select each bucket delete its contents, then delete the bucket.
  3. Delete the IAM role created to write on the S3 buckets.

Conclusion

ENGIE discovered significant value by using Amazon MWAA, enabling its global business units to ingest data in more productive ways. This post presented how ENGIE scaled their data ingestion pipelines using Amazon MWAA. The first part of the post described the architecture components and how to successfully deploy a CI/CD pipeline for an Amazon MWAA project template using a CodeCommit repository and plug it into CodePipeline to build, test, and package the code and custom plugins. The second part walked you through the steps to automate the ingestion process from Salesforce using Airflow with an example. For the Airflow configuration, you used Airflow variables, but you can also use Secrets Manager with Amazon MWAA using the secretsBackend parameter when deploying the stack.

The use case discussed in this post is just one example of how you can use Amazon MWAA to make it easier to set up and operate end-to-end data pipelines in the cloud at scale. For more information about Amazon MWAA, check out the User Guide.


About the Authors

Anouar Zaaber is a Senior Engagement Manager in AWS Professional Services. He leads internal AWS, external partner, and customer teams to deliver AWS cloud services that enable the customers to realize their business outcomes.

Amine El Mallem is a Data/ML Ops Engineer in AWS Professional Services. He works with customers to design, automate, and build solutions on AWS for their business needs.

Armando Segnini is a Data Architect with AWS Professional Services. He spends his time building scalable big data and analytics solutions for AWS Enterprise and Strategic customers. Armando also loves to travel with his family all around the world and take pictures of the places he visits.

Mohamed-Ali Elouaer is a DevOps Consultant with AWS Professional Services. He is part of the AWS ProServe team, helping enterprise customers solve complex problems related to automation, security, and monitoring using AWS services. In his free time, he likes to travel and watch movies.

Julien Grinsztajn is an Architect at ENGIE. He is part of the Digital & IT Consulting ENGIE IT team working on the definition of the architecture for complex projects related to data integration and network security. In his free time, he likes to travel the oceans to meet sharks and other marine creatures.

Creating a Multi-Region Application with AWS Services – Part 2, Data and Replication

Post Syndicated from Joe Chapman original https://aws.amazon.com/blogs/architecture/creating-a-multi-region-application-with-aws-services-part-2-data-and-replication/

In Part 1 of this blog series, we looked at how to use AWS compute, networking, and security services to create a foundation for a multi-Region application.

Data is at the center of many applications. In this post, Part 2, we will look at AWS data services that offer native features to help get your data where it needs to be.

In Part 3, we’ll look at AWS application management and monitoring services to help you build, monitor, and maintain a multi-Region application.

Considerations with replicating data

Data replication across the AWS network can happen quickly, but we are still limited by the speed of light. For this reason, data consistency must be considered when building a multi-Region application. Generally speaking, the longer a physical distance is, the longer it will take the data to get there.

When building a distributed system, consider the consistency, availability, partition tolerance (CAP) theorem. This theorem states that an application can only pick 2 out of the 3, and tradeoffs should be considered.

  • Consistency – all clients always have the same view of data
  • Availability – all clients can always read and write data
  • Partition Tolerance – the system will continue to work despite physical partitions

CAP diagram

Achieving consistency and availability is common for single-Region applications. For example, when an application connects to a single in-Region database. However, this becomes more difficult with multi-Region applications due to the latency added by transferring data over long distances. For this reason, highly distributed systems will typically follow an eventual consistency approach, favoring availability and partition tolerance.

Replicating objects and files

To ensure objects are in multiple Regions, Amazon Simple Storage Service (Amazon S3) can be set up to replicate objects across AWS Regions automatically with one-way or two-way replication. A subset of objects in an S3 bucket can be replicated with S3 replication rules. If low replication lag is critical, S3 Replication Time Control can help meet requirements by replicating 99.99% of objects within 15 minutes, and most within seconds. To monitor the replication status of objects, Amazon S3 events and metrics will track replication and can send an alert if there’s an issue.

Traditionally, each S3 bucket has its own single, Regional endpoint. To simplify connecting to and managing multiple endpoints, S3 Multi-Region Access Points create a single global endpoint spanning multiple S3 buckets in different Regions. When applications connect to this endpoint, it will route over the AWS network using AWS Global Accelerator to the bucket with the lowest latency. Failover routing is also automatically handled if the connectivity or availability to a bucket changes.

For files stored outside of Amazon S3, AWS DataSync simplifies, automates, and accelerates moving file data across Regions and accounts. It supports homogeneous and heterogeneous file migrations across Elastic File System (Amazon EFS), Amazon FSx, AWS Snowcone, and Amazon S3. It can even be used to sync on-premises files stored on NFS, SMB, HDFS, and self-managed object storage to AWS for hybrid architectures.

File and object replication should be expected to be eventually consistent. The rate at which a given dataset can transfer is a function of the amount of data, I/O bandwidth, network bandwidth, and network conditions.

Copying backups

Scheduled backups can be set up with AWS Backup, which automates backups of your data to meet business requirements. Backup plans can automate copying backups to one or more AWS Regions or accounts. A growing number of services are supported, and this can be especially useful for services that don’t offer real-time replication to another Region such as Amazon Elastic Block Store (Amazon EBS) and Amazon Neptune.

Figure 1 shows how these data transfer services can be combined for each resource.

Storage replication services

Figure 1. Storage replication services

Spanning non-relational databases across Regions

Amazon DynamoDB global tables provide multi-Region and multi-writer features to help you build global applications at scale. A DynamoDB global table is the only AWS managed offering that allows for multiple active writers in a multi-Region topology (active-active and multi-Region). This allows for applications to read and write in the Region closest to them, with changes automatically replicated to other Regions.

Global reads and fast recovery for Amazon DocumentDB (with MongoDB compatibility) can be achieved with global clusters. These clusters have a primary Region that handles write operations. Dedicated storage-based replication infrastructure enables low-latency global reads with a lag of typically less than one second.

Keeping in-memory caches warm with the same data across Regions can be critical to maintain application performance. Amazon ElastiCache for Redis offers global datastore to create a fully managed, fast, reliable, and secure cross-Region replica for Redis caches and databases. With global datastore, writes occurring in one Region can be read from up to two other cross-Region replica clusters – eliminating the need to write to multiple caches to keep them warm.

Spanning relational databases across Regions

For applications that require a relational data model, Amazon Aurora global database provides for scaling of database reads across Regions in Aurora PostgreSQL-compatible and MySQL-compatible editions. Dedicated replication infrastructure utilizes physical replication to achieve consistently low replication lag that outperforms the built-in logical replication database engines offer, as shown in Figure 2.

SysBench OLTP (write-only) stepped every 600 seconds on R4.16xlarge

Figure 2. SysBench OLTP (write-only) stepped every 600 seconds on R4.16xlarge

With Aurora global database, one primary Region is designated as the writer, and secondary Regions are dedicated to reads. Aurora MySQL supports write forwarding, which forwards write requests from a secondary Region to the primary Region to simplify logic in application code. Failover testing can happen by utilizing managed planned failover, which will change the active write cluster to another Region while keeping the replication topology intact. All databases discussed in this post employ eventual consistency when used across Regions, but Aurora PostgreSQL has an option to set the maximum a replica lag allowed with managed recovery point objective (managed RPO).

Logical replication, which utilizes a database engine’s built-in replication technology, can be set up for Amazon Relational Database Service (Amazon RDS) for MariaDB, MySQL, Oracle, PostgreSQL, and Aurora databases. A cross-Region read replica will receive these changes from the writer in the primary Region. For applications built on RDS for Microsoft SQL Server, cross-Region replication can be achieved by utilizing the AWS Database Migration Service. Cross-Region replicas allow for quicker local reads and can reduce data loss and recovery times in the case of a disaster by being promoted to a standalone instance.

For situations where a longer RPO and recovery time objective (RTO) are acceptable, backups can be copied across Regions. This is true for all of the relational and non-relational databases mentioned in this post, except for ElastiCache for Redis. Amazon Redshift can also automatically do this for your data warehouse. Backup copy times will vary depending on size and change rates.

A purpose-built database strategy offers many benefits, Figure 3 forms a purpose-built global database architecture.

Purpose-built global database architecture

Figure 3. Purpose-built global database architecture

Summary

Data is at the center of almost every application. In this post, we reviewed AWS services that offer cross-Region data replication to get your data where it needs to be quickly. Whether you need faster local reads, an active-active database, or simply need your data durably stored in a second Region, we have a solution for you. In the 3rd and final post of this series, we’ll cover application management and monitoring features.

Ready to get started? We’ve chosen some AWS Solutions, AWS Blogs, and Well-Architected labs to help you!

Related posts

Using Amazon Aurora Global Database for Low Latency without Application Changes

Post Syndicated from Roneel Kumar original https://aws.amazon.com/blogs/architecture/using-amazon-aurora-global-database-for-low-latency-without-application-changes/

Deploying global applications has many challenges, especially when accessing a database to build custom pages for end users. One example is an application using AWS [email protected]. Two main challenges include performance and availability.

This blog explains how you can optimally deploy a global application with fast response times and without application changes.

The Amazon Aurora Global Database enables a single database cluster to span multiple AWS Regions by asynchronously replicating your data within subsecond timing. This provides fast, low-latency local reads in each Region. It also enables disaster recovery from Region-wide outages using multi-Region writer failover. These capabilities minimize the recovery time objective (RTO) of cluster failure, thus reducing data loss during failure. You will then be able to achieve your recovery point objective (RPO).

However, there are some implementation challenges. Most applications are designed to connect to a single hostname with atomic, consistent, isolated, and durable (ACID) consistency. But Global Aurora clusters provide reader hostname endpoints in each Region. In the primary Region, there are two endpoints, one for writes, and one for reads. To achieve strong  data consistency, a global application requires the ability to:

  • Choose the optimal reader endpoints
  • Change writer endpoints on a database failover
  • Intelligently select the reader with the most up-to-date, freshest data

These capabilities typically require additional development.

The Heimdall Proxy coupled with Amazon Route 53 allows edge-based applications to access the Aurora Global Database seamlessly, without  application changes. Features include automated Read/Write split with ACID compliance and edge results caching.

Figure 1. Heimdall Proxy architecture

Figure 1. Heimdall Proxy architecture

The architecture in Figure 1 shows Aurora Global Databases primary Region in AP-SOUTHEAST-2, and secondary Regions in AP-SOUTH-1 and US-WEST-2. The Heimdall Proxy uses latency-based routing to determine the closest Reader Instance for read traffic, and redirects all write traffic to the Writer Instance. The Heimdall Configuration stores the Amazon Resource Name (ARN) of the global cluster. It automatically detects failover and cross-Region on the cluster, and directs traffic accordingly.

With an Aurora Global Database, there are two approaches to failover:

  • Managed planned failover. To relocate your primary database cluster to one of the secondary Regions in your Aurora global database, see Managed planned failovers with Amazon Aurora Global Database. With this feature, RPO is 0 (no data loss) and it synchronizes secondary DB clusters with the primary before making any other changes. RTO for this automated process is typically less than that of the manual failover.
  • Manual unplanned failover. To recover from an unplanned outage, you can manually perform a cross-Region failover to one of the secondaries in your Aurora Global Database. The RTO for this manual process depends on how quickly you can manually recover an Aurora global database from an unplanned outage. The RPO is typically measured in seconds, but this is dependent on the Aurora storage replication lag across the network at the time of the failure.

The Heimdall Proxy automatically detects Amazon Relational Database Service (RDS) / Amazon Aurora configuration changes based on the ARN of the Aurora Global cluster. Therefore, both managed planned and manual unplanned failovers are supported.

Solution benefits for global applications

Implementing the Heimdall Proxy has many benefits for global applications:

  1. An Aurora Global Database has a primary DB cluster in one Region and up to five secondary DB clusters in different Regions. But the Heimdall Proxy deployment does not have this limitation. This allows for a larger number of endpoints to be globally deployed. Combined with Amazon Route 53 latency-based routing, new connections have a shorter establishment time. They can use connection pooling to connect to the database, which reduces overall connection latency.
  2. SQL results are cached to the application for faster response times.
  3. The proxy intelligently routes non-cached queries. When safe to do so, the closest (lowest latency) reader will be used. When not safe to access the reader, the query will be routed to the global writer. Proxy nodes globally synchronize their state to ensure that volatile tables are locked to provide ACID compliance.

For more information on configuring the Heimdall Proxy and Amazon Route 53 for a global database, read the Heimdall Proxy for Aurora Global Database Solution Guide.

Download a free trial from the AWS Marketplace.

Resources:

Heimdall Data, based in the San Francisco Bay Area, is an AWS Advanced ISV partner. They have AWS Service Ready designations for Amazon RDS and Amazon Redshift. Heimdall Data offers a database proxy that offloads SQL improving database scale. Deployment does not require code changes.

How Meshify Built an Insurance-focused IoT Solution on AWS

Post Syndicated from Grant Fisher original https://aws.amazon.com/blogs/architecture/how-meshify-built-an-insurance-focused-iot-solution-on-aws/

The ability to analyze your Internet of Things (IoT) data can help you prevent loss, improve safety, boost productivity, and even develop an entirely new business model. This data is even more valuable, with the ever-increasing number of connected devices. Companies use Amazon Web Services (AWS) IoT services to build innovative solutions, including secure edge device connectivity, ingestion, storage, and IoT data analytics.

This post describes Meshify’s IoT sensor solution, built on AWS, that helps businesses and organizations prevent property damage and avoid loss for the property-casualty insurance industry. The solution uses real-time data insights, which result in fewer claims, better customer experience, and innovative new insurance products.

Through low-power, long-range IoT sensors, and dedicated applications, Meshify can notify customers of potential problems like rapid temperature decreases that could result in freeze damage, or rising humidity levels that could lead to mold. These risks can then be averted, instead of leading to costly damage that can impact small businesses and the insurer’s bottom line.

Architecture building blocks

The three building blocks of this technical architecture are the edge portfolio, data ingestion, and data processing and analytics, shown in Figure 1.

Figure 1. Building blocks of Meshify’s technical architecture

Figure 1. Building blocks of Meshify’s technical architecture

I. Edge portfolio (EP)

Starting with the edge sensors, the Meshify edge portfolio covers two types of sensors:

  • LoRaWAN (Low power, long range WAN) sensor suite. This sensor provides the long connectivity range (> 1000 feet) and extended battery life (~ 5 years) needed for enterprise environments.
  • Cellular-based sensors. This sensor is a narrow band/LTE-M device that operates at LTE-M band 2/4/12 radio frequency and uses edge intelligence to conserve battery life.

II. Data ingestion (DI)

For the LoRaWAN solution, aggregated sensor data at the Meshify gateway is sent to AWS using AWS IoT Core and Meshify’s REST service endpoints. AWS IoT Core is a managed cloud platform that lets IoT devices easily and securely connect using multiple protocols like HTTP, MQTT, and WebSockets. It expands its protocol coverage through a new fully managed feature called AWS IoT Core for LoRaWAN. This gives Meshify the ability to connect LoRaWAN wireless devices with the AWS Cloud. AWS IoT Core for LoRaWAN delivers a LoRaWAN network server (LNS) that provides gateway management using the Configuration and Update Server (CUPS) and Firmware Updates Over-The-Air (FUOTA) capabilities.

III. Data processing and analytics (DPA)

Initial processing of the data is done at the ingestion layer, using Meshify REST API endpoints and the Rules Engine of AWS IoT Core. Meshify applies filtering logic to route relevant events to Amazon Managed Streaming for Apache Kafka (Amazon MSK). Amazon MSK is an AWS streaming data service that manages Apache Kafka infrastructure and operations, streamlining the process of running Apache Kafka applications on AWS.

Meshify’s applications then consume the events from Amazon MSK per the configured topic subscription. They enrich and correlate the events with the records with a managed service, Amazon Relational Database Service (RDS). These applications run as scalable containers on another managed service, Amazon Elastic Kubernetes Service (EKS), which runs container applications.

Bringing it all together – technical workflow

In Figure 2, we illustrate the technical workflow from the ingestion of field events to their processing, enrichment, and persistence. Finally, we use these events to power risk avoidance decision-making.

Figure 2. Technical workflow for Meshify IoT architecture

Figure 2. Technical workflow for Meshify IoT architecture

  1. After installation, Meshify-designed LoRa sensors transmit information to the cloud through Meshify’s gateways. LoRaWAN capabilities create connectivity between the sensors and the gateways. They establish a low power, wide area network protocol that securely transmits data over a long distance, through walls and floors of even the largest buildings.
  2. The Meshify Gateway is a redundant edge system, capable of sending sensor data from various sensors to the Meshify cloud environment. Once the LoRa sensor information is received by the Meshify Gateway, it converts the incoming radio frequency (RF) signals, which support faster transfer rate to Meshify’s cloud environment.
  3. Data from the Meshify Gateway and sensors is initially processed at Meshify’s AWS IoT Core and REST service endpoints. These destinations for IoT streaming data help with the initial intake and introduce field data to the Meshify cloud environment. The initial ingestion points can scale automatically based upon the volume of sensor data received. This enables rapid scaling and ease of implementation.
  4. After the data has entered the Meshify cloud environment, Meshify uses Amazon EKS and Amazon MSK to process the incoming data stream. Amazon MSK producer and consumer applications within the EKS systems enrich the data streams for the end users and systems to consume.
  5. Producer applications running on EKS send processed events to the Amazon MSK service. These events include storing and retrieval of raw data, enriched data, and system-level data.
  6. Consumer applications hosted on the EKS pods receive events per the subscribed Amazon MSK topic. Web, mobile, and analytic applications enrich and use these data streams to display data to end users, business teams, and systems operations.
  7. Processed events are persisted in Amazon RDS. The databases are used for reporting, machine learning, and other analytics and processing services.

Building a scalable IoT solution

Meshify first began work on the Meshify sensors and hosted platform in 2012. In the ensuing decade, Meshify has successfully created a platform to auto-scale upon demand with steady, predictable performance. This gave Meshify both the ability to use only the resources needed, and still have the capacity to handle unexpected voluminous data.

As the platform scaled, so did the volume of sensor data, operations and diagnostics data, and metadata from installations and deployments. Building an end-to-end data pipeline that integrates these different data sources and delivers co-related insights at low latency was time well spent.

Conclusion

In this post, we’ve shown how Meshify is using AWS services to power their suite of IoT sensors, software, and data platforms. Meshify’s most important architectural enhancements have involved the introduction of managed services, notably AWS IoT Core for LoRaWAN and Amazon MSK. These improvements have primarily focused on the data ingestion, data processing, and analytics stages.

Meshify continues to power the data revolution at the intersection of IoT and insurance at the edge, using AWS. Looking ahead, Meshify and HSB are excited at the prospect of scaling the relationship with AWS from cloud computing to the world of edge devices.

Learn more about how emerging startups and large enterprises are using AWS IoT services to build differentiated products.

Meshify is an IoT technology company and subsidiary of HSB, based in Austin, TX. Meshify builds pioneering sensor hardware, software, and data analytics solutions that protect businesses from property and equipment damage.