Tag Archives: database

Disaster Recovery for Oracle Database on Amazon EC2 with Fast-Start Failover

Post Syndicated from Harshad Gohil original https://aws.amazon.com/blogs/architecture/disaster-recovery-for-oracle-database-on-amazon-ec2-with-fast-start-failover/

High availability is non-negotiable for organizations today to prevent business-critical application disruptions. Enterprises must prioritize database scalability and availability to avoid downtime in their databases, network, servers, or storage environments.

For organizations that want to avoid application changes, Oracle Real Application Clusters (RAC) is an option for providing high availability and scalability to the Oracle database. While the RAC feature is not supported by Oracle databases on Amazon Elastic Compute Cloud (Amazon EC2), Oracle Active Data Guard helps achieve high availability in the AWS Cloud.

The Oracle Data Guard feature helps customers survive disasters and data corruption by creating, maintaining, and managing one or more synchronized standby databases. Configuring Oracle Data Guard Fast-Start Failover (FSFO) further helps achieve high availability.

In this blog post, we provide an architectural solution to achieve database high availability when running Oracle Database on Amazon EC2 with Oracle Data Guard along with Fast-Start Failover to address Availability Zones (AZs) or Amazon EC2 instance failures. We also introduce the steps you can take to make database failover happen without manual intervention, and offer recommendations for cross-Region disaster recovery.

Solution overview

Let’s explore this solution by discussing the architecture and two alternate options for securing high availability using Oracle Data Guard, along with the advantages and limitations of each. We will then offer a walkthrough of steps to make database failover happen without manual intervention.

Oracle high availability using Oracle Data Guard with multi-AZ and multi-Region with multi-AZ setup

This architecture is recommended to maintain high availability for Oracle databases on Amazon EC2 with protection against Amazon EC2 service outages in an AWS Region. If such an outage occurs, it provides a disaster recovery environment and retains resiliency thanks to the multi-AZ setup in the secondary Region.

In this architecture, Oracle Data Guard replication exists between the primary database in AZ 1 of Region A and standby databases in AZ 2 of Region A (Fast Sync), AZ 1 of Region B (ASYNC), and AZ 2 of Region B (ASYNC). There is an asynchronous cascading replication setup between the standby databases to avoid network latency issues across Regions.

Should Region A experience an Amazon EC2 service outage, the Oracle observer, client software that monitors Oracle Data Guard, initiates failover to the standby database in Region B. Applications can continue to connect to the database, resulting in high availability with limited or minimal data loss depending on the data change rate, as in Figure 1.


Figure 1. Oracle with cascading standby databases across regions

The default behavior of Data Guard can be controlled using the Oracle RedoRoutes property, which can be set during setup as in the following example.

Oracle RedoRoutes setup example:

dgmgrl> edit database DB_1A set property RedoRoutes='(LOCAL: (DB_1B FASTSYNC PRIORITY=1, DB_2A ASYNC PRIORITY=2, DB_2B ASYNC PRIORITY=3))(DB_1B: (DB_2A ASYNC PRIORITY=1, DB_2B ASYNC PRIORITY=2))(DB_2A: DB_1B ASYNC)(DB_2B: DB_1B ASYNC)'

dgmgrl> edit database DB_1B set property RedoRoutes='(LOCAL: (DB_1A FASTSYNC PRIORITY=1, DB_2A ASYNC PRIORITY=2, DB_2B ASYNC PRIORITY=3))(DB_1A: (DB_2A ASYNC PRIORITY=1, DB_2B ASYNC PRIORITY=2))'

dgmgrl> edit database DB_2A set property RedoRoutes='(LOCAL: (DB_2B FASTSYNC PRIORITY=1, DB_1A ASYNC PRIORITY=2, DB_1B ASYNC PRIORITY=3))(DB_2B: (DB_1A ASYNC PRIORITY=1, DB_1B ASYNC PRIORITY=2))(DB_1A: DB_2B ASYNC)(DB_1B: DB_2B ASYNC)'

dgmgrl> edit database DB_2B set property RedoRoutes='(LOCAL: (DB_2A FASTSYNC PRIORITY=1, DB_1A ASYNC PRIORITY=2, DB_1B ASYNC PRIORITY=3))(DB_2A: (DB_1A ASYNC PRIORITY=1, DB_1B ASYNC PRIORITY=2))'

For more information on Oracle RedoRoutes setup for Oracle Cascading Standby, refer to this step-by-step configuration documentation.

Database failover with Amazon Route 53 and Oracle Data Guard

The following walkthrough defines the steps you can take to make database failover happen without manual intervention using Amazon Route 53 and Oracle Data Guard.

Prerequisites

Before getting started, review the following prerequisites for this solution:

Walkthrough

Step 1. Create Oracle Database Service

For applications to connect without manual intervention in the event of a failure, we recommend creating an Oracle database service using the Oracle-supplied package DBMS_SERVICE.

exec dbms_service.CREATE_SERVICE(SERVICE_NAME=>'DB_SERVICE_FOR_APP', NETWORK_NAME=>'DB_SERVICE_FOR_APP');

exec dbms_service.START_SERVICE('DB_SERVICE_FOR_APP');

Step 2. Network configuration

Applications can connect to the database seamlessly, without manual intervention in the event of a failover from the primary database to the standby, using the Oracle Transparent Application Failover (TAF) approach, though TAF requires updating application connection strings if a host IP changes.

The following approach using Amazon Route 53 is recommended for added flexibility and scalability. Route 53 has DNS A records that map to the database instance IPs and CNAME records that can redirect DNS queries to A records. The following depicts the DNS mapping. The CNAME, along with the database service name, can be used by the application in its network configuration.

Database_Name =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = <db_cname>)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = <db_service_name>)
    )
  )
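For illustration, an application using the node-oracledb driver (shown here in TypeScript; any Oracle client behaves the same way) would reference only the CNAME and the database service name, so it never needs to know which host is currently the primary. The host name, credentials, and service name below are placeholders.

import oracledb from "oracledb";

async function queryPrimary(): Promise<void> {
  // The connect string references the Route 53 CNAME and the database service,
  // so connections always land on whichever instance currently holds the PRIMARY role.
  const connection = await oracledb.getConnection({
    user: "app_user",
    password: process.env.DB_PASSWORD ?? "",
    connectString: "db-primary.example.internal:1521/DB_SERVICE_FOR_APP",
  });
  try {
    const result = await connection.execute("select sysdate from dual");
    console.log(result.rows);
  } finally {
    await connection.close();
  }
}

queryPrimary().catch(console.error);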

To update the CNAME in Route 53 to map to the Primary host automatically in the event of failure, follow these steps.

Step 3. Route 53 setup

Create a script named route53update.sh and place it on the database hosts using the following code.

#!/bin/bash

# Oracle environment for sqlplus and tnsping
export ORACLE_HOME="<<change>>"
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH=$ORACLE_HOME/bin:$PATH:/usr/local/bin:/usr/bin

LOG_FILE="/tmp/switch_dns_$$.log"
DNS_DOMAIN="<<change>>"
ACTIVE_DB_CNAME="<<change>>"
HOSTED_ZONE_ID="<<change>>"
TTL="<<change>>"

# UPSERT the CNAME record ($1) in Route 53 so that it points to the host passed as $2
update_dns () {
  TMPFILE="/tmp/route53_dns_$$.log"
  cat > ${TMPFILE} << EOF
{
  "Comment":"Updating DNS of record ${1}.${DNS_DOMAIN}",
  "Changes":[
    {
      "Action":"UPSERT",
      "ResourceRecordSet":{
        "ResourceRecords":[
          {
            "Value":"$2"
          }
        ],
        "Name":"${1}.${DNS_DOMAIN}.",
        "Type":"CNAME",
        "TTL":$TTL
      }
    }
  ]
}
EOF
  /usr/local/bin/aws route53 change-resource-record-sets \
        --hosted-zone-id $HOSTED_ZONE_ID \
        --change-batch file://"$TMPFILE" >> "$LOG_FILE"
}

# Identify the current primary database from the Data Guard configuration
prim_uniq_sid=`$ORACLE_HOME/bin/sqlplus -s / as sysdba <<EOF
set feedback off echo off lines 2000 head off
select upper(db_unique_name) from v\\$dataguard_config where DEST_ROLE='PRIMARY DATABASE';
EOF`
prim_uniq_sid=`echo $prim_uniq_sid | sed 's/^[ \t]*//;s/[ \t]*$//'`

# Resolve the host of the current primary from its TNS entry
host_current=`$ORACLE_HOME/bin/tnsping ${prim_uniq_sid} | sed -n 's/\(.*Host\)\([^)]*\)\(.*\)/\2/pi' | sed 's/=//g' | sed 's/^[ \t]*//;s/[ \t]*$//'`

# Host the Route 53 CNAME currently points to
dns_current_host=`/usr/local/bin/aws route53 list-resource-record-sets --hosted-zone-id $HOSTED_ZONE_ID --query "ResourceRecordSets[?Name == '${ACTIVE_DB_CNAME}.${DNS_DOMAIN}.'].ResourceRecords" --output text`

# Update the CNAME only when it no longer points to the primary host
if [ "$host_current" != "$dns_current_host" ]; then
        update_dns ${ACTIVE_DB_CNAME} $host_current
fi
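If you prefer not to shell out to the AWS CLI from the database host, the same UPSERT can be issued from any process that holds IAM permissions for Route 53. The following TypeScript sketch uses the AWS SDK for JavaScript v3; the hosted zone ID, record name, and target host are hypothetical.

import { Route53Client, ChangeResourceRecordSetsCommand } from "@aws-sdk/client-route-53";

const route53 = new Route53Client({});

// UPSERT a CNAME record so that it points at the host of the current primary database.
async function upsertCname(hostedZoneId: string, recordName: string, target: string, ttl = 60): Promise<void> {
  await route53.send(new ChangeResourceRecordSetsCommand({
    HostedZoneId: hostedZoneId,
    ChangeBatch: {
      Comment: `Point ${recordName} at ${target}`,
      Changes: [{
        Action: "UPSERT",
        ResourceRecordSet: {
          Name: recordName,
          Type: "CNAME",
          TTL: ttl,
          ResourceRecords: [{ Value: target }],
        },
      }],
    },
  }));
}

// Example with hypothetical values:
// await upsertCname("Z0123456789ABC", "db-primary.example.internal.", "ip-10-0-1-23.ec2.internal");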

Step 4. Database job setup

Create a job in the Oracle primary database to execute the shell script just introduced, so that it can be initiated in the event of a failover, using the following code.

begin
  dbms_scheduler.create_job
  (
    job_name             => 'route53update',
    job_type             => 'executable',
    number_of_arguments  => 0,
    job_action           => '/<<location of script>>/route53update.sh',
    auto_drop            => false
  );

  dbms_scheduler.enable('route53update');
end;
/

Step 5. Database trigger setup

In the event of a failure, the primary fails over and the standby starts up as the new primary. A trigger needs to be created on the primary database to execute the job after any failover and update the Route 53 CNAME, using the following code.

create or replace trigger SYS.Update_Route53_Record
AFTER STARTUP ON DATABASE
DECLARE
  db_role varchar2(16);
  db_mode varchar2(20);
BEGIN
  select database_role, open_mode into db_role, db_mode from v$database;
  if db_role = 'PRIMARY' then
    dbms_scheduler.run_job('route53update');
  end if;
END;
/

Alternate Option 1: Single Region with multi-AZ

This option is a minimum recommended configuration to maintain high availability for Oracle databases on Amazon EC2 for customers who do not have a multi-region setup.

  • Advantage: Protects against Amazon EC2 service outage in a single AZ.
  • Limitation: Does not protect against Amazon EC2 service outages in a single Region.

In this architecture, Oracle Data Guard Fast Sync replication exists between the Oracle database instance in a multi-AZ setup with the Primary database (Read Write) in AZ 1 and the Standby database (Read Only) in AZ 2.

If the primary database is unreachable due to any failure, the observer will fail over to the standby database in a different AZ. Applications can continue to connect to the database with zero data loss due to synchronous replication between AZs using the Maximum Availability/Maximum Protection mode setup in Oracle Data Guard. If the primary database is in us-east-1a and the standby in us-east-1b, the RedoRoutes property can be defined as follows.

Oracle RedoRoutes setup example:

dgmgrl> edit database DB_1A set property RedoRoutes='(LOCAL: (DB_1B FASTSYNC))'

dgmgrl> edit database DB_1B set property RedoRoutes='(LOCAL: (DB_1A FASTSYNC))'

For more information on how disaster recovery works in the AWS Cloud, visit the Disaster recovery is different in the cloud section of the AWS Well-Architected Framework. For more on Oracle RedoRoutes setup, refer to the Oracle Redo Routing Rules documentation.

Alternate Option 2: Multi-AZ with multi-Region with single AZ

This option is recommended to maintain high availability for an Oracle database on Amazon EC2 for customers who need multi-Region availability. It protects against the rare unavailability of Amazon EC2 instances in the primary Region, in which case a disaster recovery environment is available.

  • Advantage: Protects against Amazon EC2 service outages affecting two AZs or an entire AWS Region.
  • Limitation: Decreased resiliency, without high availability, after an Amazon EC2 service outage affecting an entire Region.

In this architecture, Oracle Data Guard Fast Sync replication exists between the Oracle database instances in a multi-AZ setup within a single Region, with the primary database in AZ 1 of Region A and a standby database in AZ 2 of Region A. There is an asynchronous cascading replication setup from that standby to a cross-Region standby.

Asynchronous replication is recommended for cross-Region replication to avoid network latency issues. A cascading standby setup ensures there is no additional performance impact on the primary database from sending data to multiple standbys.

If the primary database is unreachable, failover happens between AZs in Region A. In the event of an Amazon EC2 service outage in the Region, failover occurs to Region B, resulting in high availability with minimal data loss that depends on the data change rate. If the primary database is in us-east-1a and the standbys are in us-east-1b (Fast Sync) and us-east-2a (Async), the RedoRoutes property can be defined as follows.

Oracle RedoRoutes setup example:

dgmgrl > edit database DB_1A set property RedoRoutes= '(LOCAL: (DB_1B FASTSYNC PRIORITY=1, DB_2A ASYNC PRIORITY=2))(DB_1B: DB_2A ASYNC)(DB_2A: DB_1B ASYNC)'

dgmgrl > edit database DB_1B set property RedoRoutes= '(LOCAL: (DB_1A FASTSYNC PRIORITY=1, DB_2A ASYNC  PRIORITY=2)) (DB_1A: DB_2A ASYNC)'

dgmgrl> edit database DB_2A set property RedoRoutes='(LOCAL: (DB_1A FASTSYNC PRIORITY=1, DB_1B ASYNC PRIORITY=2))'

Cleaning up

The services involved in this solution incur costs. When you’re done using this solution, clean up the following resources:

  • Amazon EC2 instances – Stop or delete (terminate) the Amazon EC2 instances that you provisioned.
  • Route 53 – Delete the hosted zone and the A/CNAME records created (a scripted cleanup sketch follows this list).
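If you scripted the provisioning, the teardown can be scripted as well. The following TypeScript sketch uses the AWS SDK for JavaScript v3; the instance IDs and hosted zone ID are hypothetical, and Route 53 requires all non-default records to be deleted before the hosted zone itself can be removed.

import { EC2Client, TerminateInstancesCommand } from "@aws-sdk/client-ec2";
import { Route53Client, DeleteHostedZoneCommand } from "@aws-sdk/client-route-53";

const ec2 = new EC2Client({});
const route53 = new Route53Client({});

async function cleanUp(instanceIds: string[], hostedZoneId: string): Promise<void> {
  // Terminate the EC2 instances provisioned for the primary and standby databases.
  await ec2.send(new TerminateInstancesCommand({ InstanceIds: instanceIds }));

  // Delete the hosted zone once its A/CNAME records have been removed.
  await route53.send(new DeleteHostedZoneCommand({ Id: hostedZoneId }));
}

// Example with hypothetical identifiers:
// await cleanUp(["i-0123456789abcdef0"], "Z0123456789ABC");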

Conclusion

This blog post demonstrates how high availability and disaster recovery can be achieved for an Oracle database on an Amazon EC2 instance using Oracle Data Guard. Using the architectures in this post, you can achieve zero data loss with the Oracle Fast-Start Failover option within the same Region or cross-Region on Amazon EC2.

You can also use this architecture to replicate data from an Oracle database on Amazon EC2 to an Oracle database hosted outside of the AWS Cloud. With Oracle Cascading Standby and Oracle RedoRoutes, you can reduce the dependency on the primary database and improve overall performance.

Federate Amazon QuickSight access with open-source identity provider Keycloak

Post Syndicated from Ayah Chamseddin original https://aws.amazon.com/blogs/big-data/federate-amazon-quicksight-access-with-open-source-identity-provider-keycloak/

Amazon QuickSight is a scalable, serverless, embeddable, machine learning (ML) powered business intelligence (BI) service built for the cloud that supports identity federation in both Standard and Enterprise editions. Organizations are working toward centralizing their identity and access strategy across all their applications, including on-premises and third-party. Many organizations use Keycloak as their identity provider (IdP) to control and manage user authentication and authorization centrally. You can enable role-based access control to make sure users get appropriate role permissions in QuickSight based on their entitlement stored in Keycloak attributes.

In this post, we walk through the steps you need to configure federated single sign-on (SSO) between QuickSight and the open-source IdP Keycloak. We also demonstrate ways to assign QuickSight roles based on Keycloak membership. Administrators can publish QuickSight applications on the Keycloak Admin console, which enables you to use SSO to access QuickSight with your Keycloak credentials.

Prerequisites

To complete the walkthrough, you need the following prerequisites:

Solution overview

The walkthrough includes the following steps:

  1. Register a client application in Keycloak.
  2. Configure the application in Keycloak.
  3. Add Keycloak as your SAML IdP in AWS.
  4. Configure IAM policies.
  5. Configure IAM roles.
  6. Assign the newly created roles in IAM to users and groups in Keycloak.

Register a client application in Keycloak

To configure the integration of an SSO application in Keycloak, you need to create a Keycloak client application.

  1. Sign in to your Keycloak admin dashboard.
    For instructions on installing Keycloak, refer to Keycloak Downloads. For the Keycloak admin dashboard, use http://localhost:8080/.
  2. Create a new realm by choosing Create realm on the default realm master page.
  3. Assign a name for this new realm. For this example, we assign the name aws-realm.
  4. When the new realm has been created, choose Clients.
  5. Choose Create client to create a new Keycloak application for SSO Federation to QuickSight.

Configure the application in Keycloak

Follow the steps to configure the application in Keycloak.

  1. Download the SAML metadata file.
  2. Save full code from saml-metadata.xml to your local machine.
  3. In the navigation pane under Clients, import the SAML metadata file.
  4. Choose Import client.
  5. Choose Browse.
  6. Leave the rest of the fields blank. The metadata.xml file that you import later automatically populates them.
  7. When imported, press Save.
  8. On the Clients Application Setting page, choose the recently added client.
  9. Update the properties of the client ID:
    1. Change Home URL to /realms/aws-realm/protocol/saml/clients/amazon-qs.
    2. Change the IdP Initiated SSO URL to amazon-qs.
    3. Change the IdP initiated SSO Relay State to https://quicksight.aws.amazon.com.
  10. On the Client scopes tab, choose the client ID.
  11. On the Scope tab, make sure the Full scope allowed toggle is set to off.
  12. Insert your specific host domain name where the Keycloak application resides in the following URL: https://<your_host_domain>/realms/aws-realm/protocol/saml/descriptor.
    1. Download the Keycloak IdP SAML metadata file from that URL location.

You now have Keycloak installed in your local machine, a new client added, AWS federation properties updated, and the Keycloak SAML metadata downloaded for AWS use in the following section.

Add Keycloak as your SAML IdP in AWS

To configure Keycloak as your SAML IdP, complete the following steps:

  1. Open a new tab in your browser.
  2. Sign in to the IAM console in your AWS account with admin permissions.
  3. On the IAM console, under Access Management in the navigation pane, choose Identity providers.
  4. Choose Add provider.
  5. For Provider type, select SAML.
  6. For Provider name, enter keycloak.
  7. For Metadata document, upload the Keycloak IdP SAML metadata XML file you downloaded and saved to your local machine earlier.
  8. Choose Add provider.
  9. Verify Keycloak has been added as an IAM IdP and copy the ARN assigned.

The ARN is used in a later step for federated users and IdP Keycloak advanced configuration.
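If you prefer to script this step instead of using the console, the identity provider can also be created programmatically. The following TypeScript sketch uses the AWS SDK for JavaScript v3 and assumes the Keycloak SAML metadata has been saved locally as keycloak-metadata.xml (a hypothetical file name).

import { readFile } from "node:fs/promises";
import { IAMClient, CreateSAMLProviderCommand } from "@aws-sdk/client-iam";

const iam = new IAMClient({});

async function createKeycloakIdp(): Promise<string | undefined> {
  // Read the SAML metadata document downloaded from Keycloak earlier.
  const metadata = await readFile("keycloak-metadata.xml", "utf8");
  const result = await iam.send(new CreateSAMLProviderCommand({
    Name: "keycloak",
    SAMLMetadataDocument: metadata,
  }));
  // The returned ARN is used later in role trust policies and Keycloak role mappings.
  return result.SAMLProviderArn;
}

createKeycloakIdp().then((arn) => console.log(`IdP ARN: ${arn}`));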

Configure IAM policies

Create three IAM policies for mapping to three different roles with permissions in QuickSight (admin, author, and reader):

  • Admin – Uses QuickSight for authoring and for performing administrative tasks such as managing users or purchasing SPICE capacity
  • Author – Authors analyses and dashboards in QuickSight but doesn’t perform any administrative tasks
  • Reader – Interacts with shared dashboards, but doesn’t author analyses or dashboards or perform any administrative tasks

Use the following steps to set up the QuickSight-Admin policy. This policy grants admin privileges in QuickSight to the federated user.

  1. On the IAM console, choose Policies.
  2. Choose Create policy.
  3. Choose JSON and replace the existing text with the corresponding policy definition below, starting with QuickSight-Admin.

    Policy Name: QuickSight-Admin
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "quicksight:CreateAdmin",
          "Resource": "*"
        }
      ]
    }

    Policy Name: QuickSight-Author
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "quicksight:CreateUser",
          "Resource": "*"
        }
      ]
    }

    Policy Name: QuickSight-Reader
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "quicksight:CreateReader",
          "Resource": "*"
        }
      ]
    }

  4. Choose Review policy.
  5. For Name, enter QuickSight-Admin.
  6. Choose Create policy.
  7. Repeat the steps for QuickSight-Reader and QuickSight-Author.


Configure IAM roles

Create the roles that your Keycloak users assume when federating into QuickSight. Use the following steps to set up the admin role:

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose Create role.
  3. For Select type of trusted entity, choose SAML 2.0 federation.
  4. For SAML provider, choose the IdP you created earlier (keycloak).
  5. Select Allow programmatic and AWS Management Console access.
  6. Choose Next: Permissions.
  7. Choose the QuickSight-Admin IAM policy you created in the previous step.
  8. Choose Next: Name, review, and create.
  9. For Role name, enter QuickSight-Admin-Role.
  10. For Role description, enter a description.
  11. Choose Create role.
  12. Repeat these steps to create your author and reader roles and attach the appropriate policies:
    1. For QuickSight-Author-Role, use the policy QuickSight-Author
    2. For QuickSight-Reader-Role, use the policy QuickSight-Reader

With the completion of these steps, you have created an IdP in AWS, created policies, and created roles for the Keycloak IdP.
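As a rough illustration of what these roles look like when created programmatically, the following TypeScript sketch builds the SAML trust policy and attaches one of the policies created earlier. The account ID is a placeholder, and the sketch assumes the keycloak identity provider and the QuickSight-Admin policy already exist.

import { IAMClient, CreateRoleCommand, AttachRolePolicyCommand } from "@aws-sdk/client-iam";

const iam = new IAMClient({});
const account = "111122223333"; // placeholder account ID
const samlProviderArn = `arn:aws:iam::${account}:saml-provider/keycloak`;

// Trust policy allowing users federated through the Keycloak IdP to assume the role via SAML.
const trustPolicy = {
  Version: "2012-10-17",
  Statement: [{
    Effect: "Allow",
    Principal: { Federated: samlProviderArn },
    Action: "sts:AssumeRoleWithSAML",
    Condition: { StringEquals: { "SAML:aud": "https://signin.aws.amazon.com/saml" } },
  }],
};

async function createQuickSightRole(roleName: string, policyArn: string): Promise<void> {
  await iam.send(new CreateRoleCommand({
    RoleName: roleName,
    AssumeRolePolicyDocument: JSON.stringify(trustPolicy),
  }));
  await iam.send(new AttachRolePolicyCommand({ RoleName: roleName, PolicyArn: policyArn }));
}

// Example: create the admin role and attach the QuickSight-Admin policy.
// await createQuickSightRole("QuickSight-Admin-Role", `arn:aws:iam::${account}:policy/QuickSight-Admin`);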

Assign the newly created roles in IAM to users and groups in Keycloak

To create a role for the client, complete the following steps:

  1. Log back in to the Keycloak admin console.
  2. Select aws-realm and client amazon:webservices.
  3. Choose Create Role.
    1. Provide a comma-separated string using the ARN for the IAM role and the ARN for the Keycloak IdP, as in the following example:
      arn:aws:iam::<AWS account>:role/QuickSight-Admin-Role,arn:aws:iam::<AWS account>:saml-provider/keycloak
  4. When the role has been added successfully, choose Save.
  5. Repeat the steps to add QuickSight-Author-Role and QuickSight-Reader-Role.

Create mappers

To create a mapper for the client, complete the following steps:

  1. On the Client scopes tab, select the client amazon:webservices for aws-realm.
  2. On the Mappers tab, choose Add mapper.
  3. Choose By configuration to generate mappers for Session Role, Session Duration, and Session Name.
  4. Add the values needed for the Session Role mapper:
    1. Name: Session Role
    2. Mapper type: Role list
    3. Friendly Name: Session Role
    4. Role attribute name: https://aws.amazon.com/SAML/Attributes/Role
    5. SAML Attribute NameFormat: Basic
  5. Add the values needed for the Session Duration mapper:
    1. Name: Session Duration
    2. Mapper Type: Hardcoded attribute
    3. Friendly Name: Session Duration
    4. SAML Attribute Name: https://aws.amazon.com/SAML/Attributes/SessionDuration
    5. SAML Attribute NameFormat: Basic
    6. Attribute Value: 28800

      You can automatically sync user email mapping. To perform these steps, refer to Configure an automated email sync for federated SSO users to access Amazon QuickSight.

To manually add the values needed for the Session Name mapper, provide the following information:

  1. Name: Session Name
  2. Mapper Type: User Property
  3. Property: username
  4. Friendly Name: Session Name
  5. SAML Attribute Name: https://aws.amazon.com/SAML/Attributes/SessionName
  6. SAML Attribute NameFormat: Basic

Create a sample group for Keycloak users

To create groups and users for the Keycloak IdP, complete the following steps:

  1. Choose Group in the navigation pane.
  2. Create a new group named READ_ONLY_AWS_USERS.
  3. Choose the Role mapping tab and Assign role.
  4. Add the role created for the client.
  5. Choose Assign.

Create a sample user

Complete these steps to create a sample user with credentials:

  1. Choose Users in the navigation pane.
  2. Choose Create new user.
  3. Create a sample user, such as John.
  4. Set the credentials for the created user.
  5. Add the sample user created earlier to the group READ_ONLY_AWS_USERS.

You now have a Keycloak role for the realm and client, and Keycloak mappers, groups, and users in your groups.

Test the application

Let’s invoke the application you have created to seamlessly sign in to QuickSight using the following URL. Make sure you enter your domain for Keycloak.

http://<your domain>/realms/aws-realm/protocol/saml/clients/amazon-qs

When prompted for your user ID and password, enter the credentials that you created earlier.

Keycloak successfully validates the credentials and federates access to the QuickSight console by assuming the role.

Conclusion

In this post, we provided step-by-step instructions to configure federated SSO between Keycloak IdP and QuickSight. We also discussed how to create new roles and map users and groups in Keycloak to IAM for secure access to QuickSight.

If you have any questions or feedback, please leave a comment.


About the Authors

Ayah Chamseddin is a Sr. Engagement Manager at AWS. She has a deep understanding of cloud technologies and has successfully overseen and led strategic projects, partnering with clients to define business objectives, develop implementation strategies, and drive the successful delivery of solutions.


Vamsi Bhadriraju is a Data Architect at AWS. He works closely with enterprise customers to build data lakes and analytical applications on the AWS Cloud.


Srikanth Baheti is a Specialized World Wide Principal Solutions Architect for Amazon QuickSight. He started his career as a consultant and worked for multiple private and government organizations. Later he worked for PerkinElmer Health and Sciences & eResearch Technology Inc, where he was responsible for designing and developing high-traffic web applications and highly scalable and maintainable data pipelines for reporting platforms using AWS services and serverless computing.


Raji Sivasubramaniam is a Sr. Solutions Architect at AWS, focusing on Analytics. Raji specializes in architecting end-to-end enterprise data management, business intelligence, and analytics solutions for Fortune 500 and Fortune 100 companies across the globe. She has in-depth experience in integrated healthcare data and analytics with a wide variety of healthcare datasets, including managed market, physician targeting, and patient analytics.

A Guide to Maintaining a Healthy Email Database

Post Syndicated from nnatri original https://aws.amazon.com/blogs/messaging-and-targeting/guide-to-maintaining-healthy-email-database/

Introduction

In the digital age, email remains a powerful tool for businesses to communicate with their customers. Whether it’s for marketing campaigns, customer service updates, or important announcements, a well-maintained email database is crucial for ensuring that your messages reach their intended recipients. However, managing an email database is not just about storing email addresses. It involves keeping the database healthy, which means it’s up-to-date, accurate, and filled with engaged subscribers.

Amazon Simple Email Service (SES) offers robust features that help businesses manage their email environments effectively. Trusted by customers such as Amazon.com, Netflix, Duolingo and Reddit, SES helps customers deliver high-volume email campaigns of hundreds of billions of emails per year. Introduced in 2020, the list and subscription management feature of Amazon SES has added a new dimension to email database management, thereby reducing effort and time-to-value of managing a subscription list by allowing you to manage your list of contacts via its REST API, SDK or AWS CLI.

In this blog post, we will delve into the world of email database management in Amazon SES. You will explore two ways to manage your email database: building out your own email database functionality and using the built-in list and subscription management service. You will also learn the pros and cons of each approach, with examples of customer use cases that would benefit from each. Regardless of the approach you ultimately decide to take, the blog will also share strategies for email database management to help improve deliverability and customer engagement.

This guide is designed to help you navigate the complexities of email database management and make informed decisions that best suit your business needs. So, whether you’re new to Amazon SES or looking to optimize your existing email database management practices, this guide is for you. Let’s get started!

Email Database Management in Amazon SES

Amazon Simple Email Service (SES) offers two primary ways to manage your email database: building out your own email database functionality and using the built-in list and subscription management service. Each approach has its own set of advantages and potential drawbacks, and the best choice depends on your specific use case and business needs.

Building Out Your Email Database Functionality

When you choose to build out your own email database functionality, you have the flexibility to customize the database to suit your specific needs and leverage SES’ scalability as an email channel to send email at high volumes to your customer. Depending on the business requirement, the customizations could involve creating custom fields for subscriber data, implementing complex logic for categorizing and segmenting users, or integrating with other systems in your tech stack.

Using the Built-in List and Subscription Management Service

Alternatively, you can look at Amazon SES’s built-in list and subscription management service, which offers a ready-made solution for managing your email database. It handles tasks such as managing subscriptions to different topics and maintaining your customer email database through contact lists. Additionally, you can insert up to two links per email to the subscription preference page, allowing users to manage their topic preferences within Amazon SES.


The subscription page, which cannot be customized, automatically populates the customer’s currently subscribed topics and allows setting granular topic preferences. More information on how to set this up can be found here.
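To give a concrete sense of the built-in API, the following TypeScript sketch uses the AWS SDK for JavaScript v3 to create a contact list with one topic and add a contact to it. The list name, topic name, and email address are illustrative assumptions.

import {
  SESv2Client,
  CreateContactListCommand,
  CreateContactCommand,
} from "@aws-sdk/client-sesv2";

const ses = new SESv2Client({});

async function setUpList(): Promise<void> {
  // Create a contact list with a single topic (names are illustrative).
  await ses.send(new CreateContactListCommand({
    ContactListName: "MainNewsletter",
    Topics: [{
      TopicName: "weekly-digest",
      DisplayName: "Weekly digest",
      DefaultSubscriptionStatus: "OPT_IN",
    }],
  }));

  // Add a contact and opt it in to the topic.
  await ses.send(new CreateContactCommand({
    ContactListName: "MainNewsletter",
    EmailAddress: "subscriber@example.com",
    TopicPreferences: [{ TopicName: "weekly-digest", SubscriptionStatus: "OPT_IN" }],
  }));
}

setUpList().catch(console.error);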

The following comparison should serve as a guideline to help you decide your approach for email database management.

Building Your Own Email Database Functionality

Pros:

  • Customization: Full control over the database structure and functionality, allowing for tailoring to specific needs. This includes creating custom fields for subscriber data, implementing your own algorithms for handling bounces and complaints, and integrating with other systems in the tech stack.
  • Integration: Flexible flow of data across the business due to the ability to integrate the email database with other systems in the tech stack. If you’ve already built your own email database, or have one in mind that supports querying, building that database external to Amazon SES makes for a more customizable implementation.
  • Data Ownership: When you manage your own database, you have full ownership and control over your data. This can be important for businesses with strict data governance or regulatory requirements.

Cons:

  • Time and Resources: Building your own email database functionality requires a significant investment of time and resources. This includes the initial setup of the database, designing the schema, setting up the servers, and configuring the database software. Additionally, you’ll need to develop the functionality for managing subscriptions and for database cleanup upon receiving bounces and complaints. Databases also require ongoing maintenance to remain operational and efficient, including updating the database software, managing backups, optimizing queries, and scaling the database as your subscriber base grows.
  • Complexity: As your subscriber base grows, managing your own email database can become increasingly complex. You’ll need to handle more data, which can slow down queries and make the database more difficult to manage. You’ll also need to deal with more complex issues like data integrity, redundancy, and normalization. Additionally, as you add more features to your email database functionality, the codebase can become more complex, making it harder to maintain and debug.
  • Security: When you manage your own email database, you’re responsible for its security. This includes protecting the data from unauthorized access, ensuring the confidentiality of your subscribers’ information, and complying with data protection regulations. You’ll need to implement security measures like encryption, access controls, and regular security audits. If your database is compromised, it could lead to data loss or a breach of your subscribers’ privacy, which could damage your reputation and potentially lead to legal consequences.

Customer use cases: Best suited for businesses with specific needs that aren’t met by standard list management services, or those who wish to integrate their email database with other systems. For example, a large e-commerce company might choose to build out their own email database functionality to integrate with their customer relationship management (CRM) and inventory systems.

Using the Built-in List and Subscription Management Service

Pros:

  • Ease of Use: The built-in service provides a readily available API to create, update, and delete contacts. These operations are also available via the REST API, AWS CLI, and SDK. Once you’ve set up the subscription topics and contact lists, you can leverage the preference center to allow your customers to easily subscribe to or unsubscribe from different topics.
  • Cost-Effective: More cost-effective than building your own functionality, as it requires less time and fewer resources. The built-in service is also available free of charge, unlike building out your own infrastructure, which incurs ongoing infrastructure costs.

Cons:

  • Limited Customization: The built-in service may not offer the same level of customization as building your own functionality and may not meet all needs if there are specific requirements. For example, the preference center management page cannot be customized.
  • Dependence: Using the built-in service means you’re reliant on Amazon SES for your email database management. If the service experiences downtime or issues, it could impact your ability to manage your email database, potentially disrupting your email campaigns and affecting your relationship with your subscribers. Furthermore, if you decide to switch to a different email service provider in the future, migrating your email database from the built-in service could be a complex and time-consuming process. Additionally, if your email database needs to be accessed or manipulated by other systems in your tech stack, this dependency on Amazon SES could complicate the integration process and limit your flexibility.

Customer use cases: Ideal for small to medium-sized businesses that need a straightforward, cost-effective solution for managing their email database. It’s also a good fit for businesses without the resources or technical expertise to build their own email database functionality.

Strategies for Email Database Management with Amazon Simple Email Service

Once you’ve made the decision on whether to manage your email database within Amazon SES or build your own, that’s only half of the equation. It’s important to recognize that your email database will only serve the business well when you have processes in place to maintain it. In this section, let’s go through some of the best practices on how to do so.

  • Maintaining email list hygiene:
    • Both Amazon SES and a custom-built email database require maintaining a healthy email list. This involves regularly cleaning your list to remove invalid email addresses, hard bounces, and unengaged subscribers. With Amazon SES, the process to handle hard bounces and complaints is automated.
    • With a custom-built email database, you have more control over how and when this cleaning occurs. Rather than focusing only on email addresses that hard bounce or complain, you can remove unengaged users. Every business will have its own definition of an unengaged user based on business needs. Regardless, you will need to store an engagement attribute (for example, days since last interaction). This will be simpler to architect in an external database that supports querying and bulk modification (see the sketch after this list for an automated, bounce-driven cleanup).
  • Managing Subscriptions:
    • With Amazon SES, you can easily manage subscriptions using the built-in functionality. This includes adding new subscribers, removing unsubscribed users, and updating user topic preferences. However, you will not be able to customize the look-and-feel of your subscription preference pages.
    • If you build your own email database, you’ll need to create your own system for managing subscriptions, which could require significant time and resources. The trade-off is that you can fully customize your subscription management system to showcase your branding on the subscription preference page and also handle custom logic for subscription/unsubscription.
  • Encouraging Engagement: Low engagement rates can indicate that your recipients are not interested in your content. To stimulate action, you can include a survey in the email, ask for feedback, or run a giveaway. You can then filter out inactive subscribers who still aren’t interacting with your emails. For engaged subscribers, you can segment these audiences into sub-groups by preference and send tailored email marketing campaigns. Before removing less active subscribers, consider what other kinds of content you could provide that might be more appealing. Unengaged subscribers can sometimes be re-engaged with the right offer, such as a free gift, a special perk, or exclusive content.
  • Renewing Opt-In: For your disengaged subscribers, send a re-optin campaign and remove them if they don’t re-subscribe. Be transparent! Notify inactive subscribers that you’ve noticed their lack of engagement and let them know that you don’t want to clutter their inbox if they’re not interested. Ask them if they want to continue to receive emails with a clear call-to-action button that will re-sign them up for future emails.
  • Making It Easy to Unsubscribe: Including an easy-to-find unsubscribe button and a one-step opt-out process won’t encourage subscribers to leave if you’re giving them a reason to stay. If recipients feel like they can’t leave, they’ll just mark your emails as spam, which counts as a big strike against your sender reputation.
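As one illustration of the custom-built approach to list hygiene, the following TypeScript sketch assumes SES bounce notifications are published to an Amazon SNS topic and processed by an AWS Lambda function. It adds permanently bounced addresses to the SES account-level suppression list; a custom database would apply the same logic to its own contact records.

import { SESv2Client, PutSuppressedDestinationCommand } from "@aws-sdk/client-sesv2";
import type { SNSEvent } from "aws-lambda";

const ses = new SESv2Client({});

// Lambda handler subscribed to the SNS topic that receives SES bounce notifications.
export const handler = async (event: SNSEvent): Promise<void> => {
  for (const record of event.Records) {
    const notification = JSON.parse(record.Sns.Message);
    if (notification.notificationType !== "Bounce") continue;
    if (notification.bounce?.bounceType !== "Permanent") continue;

    for (const recipient of notification.bounce.bouncedRecipients ?? []) {
      // Add the hard-bounced address to the account-level suppression list.
      await ses.send(new PutSuppressedDestinationCommand({
        EmailAddress: recipient.emailAddress,
        Reason: "BOUNCE",
      }));
      // A custom-built contact database would also flag or delete this contact here.
    }
  }
};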

Remember, effective email database management is a continuous process that requires regular attention and maintenance. By following these best practices, you can maximize the effectiveness of your email marketing efforts and build strong relationships with your subscribers.

Conclusion

In conclusion, maintaining a healthy email database is a critical aspect of successful email marketing. Whether you choose to build out your own email database functionality or use Amazon SES’s built-in list and subscription management service, it’s important to understand the pros and cons of each approach and align your decision with your business needs.

Building your own email database functionality offers the advantage of customization and integration with other systems in your tech stack. However, it requires significant time, resources, and technical expertise. On the other hand, Amazon SES’s built-in service is easy to use, cost-effective, and handles many complexities of email database management, but it may not offer the same level of customization.

Regardless of the approach you choose, following best practices for email database management is essential. This includes handling bounces and complaints, managing subscriptions, encouraging engagement, sending re-engagement email campaigns, renewing opt-ins, and making it easy to unsubscribe.

These practices will help you maintain a healthy email list, improve engagement rates, and ultimately enhance the effectiveness of your email marketing efforts. It’s important to stay updated with the latest trends and strategies in email database management. So, keep exploring, learning, and implementing the best practices that suit your business needs.

For more information on Amazon SES and its features, visit the Amazon SES Documentation. Here, you’ll find comprehensive guides, tutorials, and API references to help you make the most of Amazon SES.

Announcing database integrations: a few clicks to connect to Neon, PlanetScale and Supabase on Workers

Post Syndicated from Shaun Persad original http://blog.cloudflare.com/announcing-database-integrations/


One of the best feelings as a developer is seeing your idea come to life. You want to move fast and Cloudflare’s developer platform gives you the tools to take your applications from 0 to 100 within minutes.

One thing that we’ve heard slows developers down is the question: “What databases can be used with Workers?”. Developers stumble when it comes to things like finding the databases that Workers can connect to, choosing the right library or driver that's compatible with Workers, and translating boilerplate examples into something that can run on our developer platform.

Today we’re announcing Database Integrations – making it seamless to connect to your database of choice on Workers. To start, we’ve added some of the most popular databases that support HTTP connections: Neon, PlanetScale and Supabase, with more (like Prisma, Fauna, MongoDB Atlas) to come!

Focus more on code, less on config

Our serverless SQL database, D1, launched in open alpha last year, and we’re continuing to invest in making it production ready (stay tuned for an exciting update later this week!). We also recognize that there are plenty of flavours of databases, and we want developers to have the freedom to select what’s best for them and pair it with our powerful compute offering.

On our second day of this Developer Week 2023, data is in the spotlight. We’re taking huge strides in making it possible and more performant to connect to databases from Workers (spoiler alert!):

Making it possible and performant is just the start; we also want to make connecting to databases painless. Databases have specific protocols, drivers, APIs and vendor-specific features that you need to understand in order to get up and running. With Database Integrations, we want to make this process foolproof.

Whether you’re working on your first project or your hundredth project, you should be able to connect to your database of choice with your eyes closed. With Database Integrations, you can spend less time focusing on configuration and more on doing what you love – building your applications!

What does this experience look like?

Discoverability

If you’re starting a project from scratch or want to connect Workers to an existing database, you want to know “What are my options?”.

Workers supports connections to a wide array of database providers over HTTP. With newly released outbound TCP support, the databases that you can connect to on Workers will only grow!

In the new “Integrations” tab, you’ll be able to view all the databases that we support and add the integration to your Worker directly from here. To start, we have support for Neon, PlanetScale and Supabase with many more coming soon.


Authentication

You should never have to copy and paste your database credentials or other parts of the connection string.

Once you hit “Add Integration” we take you through an OAuth2 flow that automatically gets the right configuration from your database provider and adds them as encrypted environment variables to your Worker.


Once you have credentials set up, check out our documentation for examples on how to get started using the data platform’s client library. What’s more – we have templates coming that will allow you to get started even faster!
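As a rough illustration of what the resulting Worker code can look like, the sketch below queries a Supabase database using credentials that the integration stores as environment variables. The binding names (SUPABASE_URL, SUPABASE_KEY) and the table name are assumptions; check the integration’s documentation for the exact names it creates.

import { createClient } from "@supabase/supabase-js";

export interface Env {
  SUPABASE_URL: string;
  SUPABASE_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Credentials were added as encrypted environment variables by the integration.
    const supabase = createClient(env.SUPABASE_URL, env.SUPABASE_KEY);

    // Hypothetical table used for illustration.
    const { data, error } = await supabase.from("countries").select("*");
    if (error) {
      return new Response(`Query failed: ${error.message}`, { status: 500 });
    }
    return Response.json(data);
  },
};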

That’s it! With database integrations, you can connect your Worker with your database in just a few clicks. Head to your Worker > Settings > Integrations to try it out today.

What’s next?

We’ve only just scratched the surface with Database Integrations and there’s a ton more coming soon!

While we’ll be continuing to add support for more popular data platforms, we also know that it's impossible for us to keep up with a moving landscape. We’ve been working on an integrations platform so that any database provider can easily build their own integration with Workers. As a developer, this means that you can start tinkering with the next new database right away on Workers.

Additionally, we’re working on adding wrangler support, so you can create integrations directly from the CLI. We’ll also be adding support for account level environment variables in order for you to share integrations across the Workers in your account.

We’re really excited about the potential here and to see all the new creations from our developers! Be sure to join Cloudflare’s Developer Discord and share your projects. Happy building!

Announcing connect() — a new API for creating TCP sockets from Cloudflare Workers

Post Syndicated from Brendan Irvine-Broque original http://blog.cloudflare.com/workers-tcp-socket-api-connect-databases/


Today, we are excited to announce a new API in Cloudflare Workers for creating outbound TCP sockets, making it possible to connect directly to any TCP-based service from Workers.

Standard protocols including SSH, MQTT, SMTP, FTP, and IRC are all built on top of TCP. Most importantly, nearly all applications need to connect to databases, and most databases speak TCP. And while Cloudflare D1 works seamlessly on Workers, and some hosted database providers allow connections over HTTP or WebSockets, the vast majority of databases, both relational (SQL) and document-oriented (NoSQL), require clients to connect by opening a direct TCP “socket”, an ongoing two-way connection that is used to send queries and receive data. Now, Workers provides an API for this, the first of many steps to come in allowing you to use any database or infrastructure you choose when building full-stack applications on Workers.

Database drivers, the client code used to connect to databases and execute queries, are already using this new API. pg, the most widely used JavaScript database driver for PostgreSQL, works on Cloudflare Workers today, with more database drivers to come.

The TCP Socket API is available today to everyone. Get started by reading the TCP Socket API docs, or connect directly to any PostgreSQL database from your Worker by following this guide.

First — what is a TCP Socket?

TCP (Transmission Control Protocol) is a foundational networking protocol of the Internet. It is the underlying protocol that is used to make HTTP requests (prior to HTTP/3, which uses QUIC), to send email over SMTP, to query databases using database–specific protocols like MySQL, and many other application-layer protocols.

A TCP socket is a programming interface that represents a two-way communication connection between two applications that have both agreed to “speak” over TCP. One application (ex: a Cloudflare Worker) initiates an outbound TCP connection to another (ex: a database server) that is listening for inbound TCP connections. Connections are established by negotiating a three-way handshake, and after the handshake is complete, data can be sent bi-directionally.


A socket is the programming interface for a single TCP connection — it has both a readable and writable “stream” of data, allowing applications to read and write data on an ongoing basis, as long as the connection remains open.

connect() — A simpler socket API

With Workers, we aim to support standard APIs that are supported across browsers and non-browser environments wherever possible, so that as many NPM packages as possible work on Workers without changes, and package authors don’t have to write runtime-specific code. But for TCP sockets, we faced a challenge — there was no clear shared standard across runtimes. Node.js provides the net and tls APIs, but Deno implements a different API — Deno.connect. And web browsers do not provide a raw TCP socket API, though a WICG proposal does exist, and it is different from both Node.js and Deno.

We also considered how a TCP socket API could be designed to maximize performance and ergonomics in a serverless environment. Most networking APIs were designed well before serverless emerged, with the assumption that the developer’s application is also the server, responsible for directly handling the configuration of TLS options and credentials.

With this backdrop, we reached out to the community, with a focus on maintainers of database drivers, ORMs and other libraries that create outbound TCP connections. Using this feedback, we’ve tried to incorporate the best elements of existing APIs and proposals, and intend to contribute back to future standards, as part of the Web-interoperable Runtimes Community Group (WinterCG).

The API we landed on is a simple function, connect(), imported from the new cloudflare:sockets module, that returns an instance of a Socket. Here’s a simple example showing it used to connect to a Gopher server. Gopher was one of the Internet’s early protocols that relied on TCP/IP, and still works today:

import { connect } from 'cloudflare:sockets';

export default {
  async fetch(req: Request) {
    const gopherAddr = "gopher.floodgap.com:70";
    const url = new URL(req.url);

    try {
      const socket = connect(gopherAddr);

      const writer = socket.writable.getWriter()
      const encoder = new TextEncoder();
      const encoded = encoder.encode(url.pathname + "\r\n");
      await writer.write(encoded);

      return new Response(socket.readable, { headers: { "Content-Type": "text/plain" } });
    } catch (error) {
      return new Response("Socket connection failed: " + error, { status: 500 });
    }
  }
};

We think this API design has many benefits that can be realized not just on Cloudflare, but in any serverless environment that adopts this design:

connect(address: SocketAddress | string, options?: SocketOptions): Socket

declare interface Socket {
  get readable(): ReadableStream;
  get writable(): WritableStream;
  get closed(): Promise<void>;
  close(): Promise<void>;
  startTls(): Socket;
}

declare interface SocketOptions {
  secureTransport?: string;
  allowHalfOpen: boolean;
}

declare interface SocketAddress {
  hostname: string;
  port: number;
}

Opportunistic TLS (StartTLS), without separate APIs

Opportunistic TLS, a pattern of creating an initial insecure connection, and then upgrading it to a secure one that uses TLS, remains common, particularly with database drivers. In Node.js, you must use the net API to create the initial connection, and then use the tls API to create a new, upgraded connection. In Deno, you pass the original socket to Deno.startTls(), which creates a new, upgraded connection.

Drawing on a previous W3C proposal for a TCP Socket API, we’ve simplified this by providing one API that allows TLS to be enabled, allowed, or used when creating a socket, and that exposes a simple method, startTls(), for upgrading a socket to use TLS.

// Create a new socket without TLS. secureTransport defaults to "off" if not specified.
const socket = connect("address:port", { secureTransport: "off" })

// Create a new socket, then upgrade it to use TLS.
// Once startTls() is called, only the newly created socket can be used.
const socket = connect("address:port", { secureTransport: "starttls" })
const secureSocket = socket.startTls();

// Create a new socket with TLS
const socket = connect("address:port", { secureTransport: "use" })

TLS configuration — a concern of host infrastructure, not application code

Existing APIs for creating TCP sockets treat TLS as a library that you interact with in your application code. The tls.createSecureContext() API from Node.js has a plethora of advanced configuration options that are mostly environment specific. If you use custom certificates when connecting to a particular service, you likely use a different set of credentials and options in production, staging and development. Managing direct file paths to credentials across environments and swapping out .env files in production build steps are common pain points.

Host infrastructure is best positioned to manage this on your behalf, and similar to Workers support for making subrequests using mTLS, TLS configuration and credentials for the socket API will be managed via Wrangler, and a connect() function provided via a capability binding. Currently, custom TLS credentials and configuration are not supported, but are coming soon.

Start writing data immediately, before the TLS handshake finishes

Because the connect() API synchronously returns a new socket, one can start writing to the socket immediately, without waiting for the TCP handshake to first complete. This means that once the handshake completes, data is already available to send immediately, and host platforms can make use of pipelining to optimize performance.

connect() API + DB drivers = Connect directly to databases

Many serverless databases already work on Workers, allowing clients to connect over HTTP or over WebSockets. But most databases don’t “speak” HTTP, including databases hosted on most cloud providers.

Databases each have their own “wire protocol”, and open-source database “drivers” that speak this protocol, sending and receiving data over a TCP socket. Developers rely on these drivers in their own code, as do database ORMs. Our goal is to make sure that you can use the same drivers and ORMs you might use in other runtimes and on other platforms on Workers.

Try it now — connect to PostgreSQL from Workers

We’ve worked with the maintainers of pg, one of the most popular database drivers in the JavaScript ecosystem, used by ORMs including Sequelize and knex.js, to add support for connect().

You can try this right now. First, create a new Worker and install pg:

wrangler init
npm install --save pg

As of this writing, you’ll need to enable the node_compat option in wrangler.toml:

wrangler.toml

name = "my-worker"
main = "src/index.ts"
compatibility_date = "2023-05-15"
node_compat = true

In just 20 lines of TypeScript, you can create a connection to a Postgres database, execute a query, return results in the response, and close the connection:

index.ts

import { Client } from "pg";

export interface Env {
  DB: string;
}

export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext
  ): Promise<Response> {
    const client = new Client(env.DB);
    await client.connect();
    const result = await client.query({
      text: "SELECT * from customers",
    });
    console.log(JSON.stringify(result.rows));
    const resp = Response.json(result.rows);
    // Close the database connection, but don't block returning the response
    ctx.waitUntil(client.end());
    return resp;
  },
};

To test this in local development, use the --experimental-local flag (instead of --local), which uses the open-source Workers runtime, ensuring that what you see locally mirrors behavior in production:

wrangler dev --experimental-local

What’s next for connecting to databases from Workers?

This is only the beginning. We’re aiming for the two popular MySQL drivers, mysql and mysql2, to work on Workers soon, with more to follow. If you work on a database driver or ORM, we’d love to help make your library work on Workers.

If you’ve worked more closely with database scaling and performance, you might have noticed that in the example above, a new connection is created for every request. This is one of the biggest current challenges of connecting to databases from serverless functions, across all platforms. With typical client connection pooling, you maintain a local pool of database connections that remain open. This approach of storing a reference to a connection or connection pool in global scope will not work, and is a poor fit for serverless. Managing individual pools of client connections on a per-isolate basis creates other headaches — when and how should connections be terminated? How can you limit the total number of concurrent connections across many isolates and locations?

Instead, we’re already working on simpler approaches to connection pooling for the most popular databases. We see a path to a future where you don’t have to think about or manage client connection pooling on your own. We’re also working on a brand new approach to making your database reads lightning fast.

What’s next for sockets on Workers?

Supporting outbound TCP connections is only one half of the story — we plan to support inbound TCP and UDP connections, as well as new emerging application protocols based on QUIC, so that you can build applications beyond HTTP with Socket Workers.

Earlier today we also announced Smart Placement, which improves performance by running any Worker that makes multiple HTTP requests to an origin as close to that origin as possible, reducing round-trip time. We’re working on making this work with Workers that open TCP connections, so that if your Worker connects to a database in Virginia and makes many queries over a TCP connection, each query is lightning fast and comes from the nearest location on Cloudflare’s global network.

We also plan to support custom certificates and other TLS configuration options in the coming months. Tell us what’s a must-have for connecting to the services you rely on from Workers.

Get started, and share your feedback

The TCP Socket API is available today to everyone. Get started by reading the TCP Socket API docs, or connect directly to any PostgreSQL database from your Worker by following this guide.

We want to hear your feedback, what you’d like to see next, and more about what you’re building. Join the Cloudflare Developers Discord.

Enable transparent connectivity to Oracle Data Guard environments using Amazon Route 53 CNAME records

Post Syndicated from Sudip Acharya original https://aws.amazon.com/blogs/architecture/enable-transparent-connectivity-to-oracle-data-guard-environments-using-amazon-route-53-cname-records/

Customers choose AWS for running their Oracle database workload to help increase resiliency, performance, and scalability of the database layer. A high availability (HA) solution for the database stack is an important aspect to consider when migrating or deploying Oracle databases in AWS to help ensure that the architecture can meet the service level agreement (SLA) of the application. Customers who run their Oracle databases on Amazon Elastic Compute Cloud (Amazon EC2) commonly choose Oracle Data Guard physical standby databases to help meet the HA and disaster recovery (DR) for their Oracle database workloads.

As discussed in this Oracle documentation, role-based services with multiple listener endpoints in the connection URL or tnsnames.ora entry is the preferred way to transparently connect to the database layer that is part of a Data Guard configuration. However, some application components and driver configurations don’t support multiple hostnames in the connection URL. Those applications require a single hostname or IP for the clients to connect to the Data Guard environment.

This post talks about the concept of using an Amazon Route 53 CNAME record in a Data Guard environment on EC2 and lists the artifacts to automatically route the connection between primary and standby environments in a Data Guard configuration based on the database role.

Solution overview

To help avoid manual effort to update DNS entries or the tnsnames.ora file after a failover or switchover operation in a Data Guard environment, the solution uses an AFTER DB_ROLE_CHANGE trigger to automate the DNS failover process. This trigger runs a shell script on the database host, which in turn updates the CNAME records in Route 53 to reflect the role transition. The following diagram illustrates the solution architecture (Figure 1).

Figure 1. Solution architecture

The solution discussed in this post covers routing new database connection requests to the right database after a Data Guard switchover. However, other factors, such as application/client TTL settings and connection pools that do not invalidate connection handles created before the switchover, can cause the application to connect to a database with an unexpected role (for example, read-write workloads connecting to the standby after a switchover) and generate errors such as ORA-16000: database or pluggable database open for read-only access. It is a best practice to verify the database role before using a connection handle for transactions, to confirm that the application is connected to the database with the expected role.
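As an illustration of that role check, the following minimal Python sketch uses the python-oracledb driver with the orcl-rw CNAME from this post; the credentials are hypothetical and the example assumes the application account can query V$DATABASE:

import oracledb

# Hypothetical credentials; in practice, pull these from AWS Secrets Manager or similar.
conn = oracledb.connect(
    user="app_user",
    password="app_password",
    dsn="orcl-rw.mydbdomain:1525/orcl",  # EZ Connect string using the read/write CNAME
)

with conn.cursor() as cur:
    # V$DATABASE reports the role this instance currently plays in the Data Guard configuration.
    cur.execute("SELECT database_role FROM v$database")
    (role,) = cur.fetchone()

if role != "PRIMARY":
    # The CNAME may not have propagated yet (DNS TTL), or the pool handed back a stale connection.
    raise RuntimeError(f"Expected PRIMARY but connected to a {role} database; retry or refresh the pool")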

The following workflow depicts the sequence of events that happens during a failover or switchover activity in a Data Guard environment to enable seamless connectivity for the application:

  1. A role transition event occurs in the Data Guard environment.
  2. The event triggers the AFTER DB_ROLE_CHANGE trigger.
  3. The trigger runs the shell script on the EC2 instance using a scheduler job.
  4. The shell script updates Route 53 to point the CNAME records to reflect the role transition.

Prerequisites

This post assumes the following prerequisites:

  • You should have an existing Data Guard configuration with one primary and one standby DB instance within a single VPC. Refer to the Oracle quick start template to deploy a Data Guard environment on Amazon EC2.
  • The steps discussed here are for self-managed Data Guard configuration on Amazon EC2 with Red Hat Linux AMI.
  • The scenario discussed in the post involves one primary and one standby database in the Data Guard configuration. For any other configurations, the scripts shown in this example require additional changes.
  • A private or public Route 53 hosted zone should be configured in the VPC where the DB environment exists.
  • The shell script uses the instance profile of the EC2 instance to run the AWS Command Line Interface (AWS CLI) commands. Make sure that the instance profile of the EC2 instances hosting the primary and standby databases has a policy attached that allows changing the record set in the hosted zone such as the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DBCnameFlipPolicy",
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets",
        "route53:ListResourceRecordSets"
      ],
      "Resource": "arn:aws:route53:::hostedzone/<<YourHostedZoneId>>"
    }
  ]
}
  • The nslookup, jq, and curl utilities must be installed on all of the DB hosts. If they are not installed, you can install them on RHEL Linux using the following commands:
yum install -y bind-utils
yum install -y curl
yum install -y jq

Environment details

This post assumes a Data Guard configuration with two instances within a single VPC, one primary and one standby, with the following details and naming conventions:

  • Oracle database version – 19.10 configured in maximum performance mode with Active Data Guard
  • Route 53 domain name – mydbdomain
  • Database name – orcl
  • DB_UNIQUE_NAME – orcl_a and orcl_b
  • Instance names – orcl
  • Route 53 A record for the host in AZ1 – orcl-a-db.mydbdomain
  • Route 53 A record for the host in AZ2 – orcl-b-db.mydbdomain

Route 53 configuration

Two A records are created in Route 53 to point to the IPs of the primary and standby hosts. Two CNAME records are also created in Route 53, which are automatically updated during the Data Guard switchover and failover scenarios. The CNAME record orcl-rw.mydbdomain points to the instance in the primary role that can accept read/write transactions, and orcl-ro.mydbdomain points to the instance in the standby role that accepts read-only queries.

The A records configuration is as follows:

  • DB host IP in AZ1 (10.0.0.5 in this example) – orcl-a-db.mydbdomain
  • DB host IP in AZ2 (10.0.32.5 in this example) – orcl-b-db.mydbdomain

The CNAME records configuration is as follows:

  • orcl-a-db.mydbdomain – orcl-rw.mydbdomain (orcl-rw initially points to the host in AZ1, the primary)
  • orcl-b-db.mydbdomain – orcl-ro.mydbdomain (orcl-ro initially points to the host in AZ2, the standby)
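If you prefer to script this initial Route 53 setup rather than use the console, the following boto3 sketch creates the two CNAME records described above. The hosted zone ID is a placeholder, and the record names and values simply mirror the sample naming convention; adjust both for your environment.

import boto3

route53 = boto3.client("route53")
HOSTED_ZONE_ID = "Z0123456789EXAMPLE"  # placeholder; use the hosted zone ID of mydbdomain

records = [
    # orcl-rw starts out pointing at the AZ1 host (current primary)
    ("orcl-rw.mydbdomain.", "orcl-a-db.mydbdomain."),
    # orcl-ro starts out pointing at the AZ2 host (current standby)
    ("orcl-ro.mydbdomain.", "orcl-b-db.mydbdomain."),
]

route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={
        "Comment": "Initial role-based CNAMEs for the Data Guard pair",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "CNAME",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": value}],
                },
            }
            for name, value in records
        ],
    },
)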

The following screenshot shows the Route 53 console view of the domain mydbdomain.

Figure 2. The Route 53 console view of the domain mydbdomain

TNS configuration

The following tnsnames.ora file entries show how connections can be made to primary and standby databases using the CNAME records without a dependency on the actual IP address of the EC2 instances that host primary and standby databases. The entry orcl_a always points to the instance on orcl-a-db.mydbdomain, and orcl_b always points to the instance on orcl-b-db.mydbdomain, regardless of their roles. The entries orclrw and orclro direct the connection to the databases playing primary and standby roles, respectively.

orcl_a =
(description =
(address = (protocol = tcp)(host = orcl-a-db.mydbdomain)(port = 1525))
(connect_data =
(server = dedicated)
(service_name = orcl_a)
)
)

orcl_b =
(description =
(address = (protocol = tcp)(host = orcl-b-db.mydbdomain)(port = 1525))
(connect_data =
(server = dedicated)
(service_name = orcl_b)
)
)

orclrw =
(description =
(address = (protocol = tcp)(host = orcl-rw.mydbdomain)(port = 1525))
(connect_data =
(server = dedicated)
(service_name = orcl)
)
)

orclro =
(description =
(address = (protocol = tcp)(host = orcl-ro.mydbdomain)(port = 1525))
(connect_data =
(server = dedicated)
(service_name = orcl)
)
)

To enable connectivity using the orclrw and orclro TNS entries, you can use either a role-based service or a static listener registration entry in both the primary and standby listeners. The following listener.ora excerpt shows a static registration for the orcl service:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = orcl)
      (ORACLE_HOME = /opt/oracle/product/19c/dbhome_1)
      (SID_NAME = orcl)
    )
  )

Implement the solution

To implement an automated DNS update during an Oracle switchover or failover, we use an Oracle database trigger and a shell script. The following are the high-level steps for the entire workflow:

  1. Create an AFTER DB_ROLE_CHANGE ON DATABASE trigger on the primary database.
  2. The trigger creates a DBMS Scheduler job that calls the shell script cname_switch.sh.
  3. The shell script updates the Route 53 CNAME entries.

Database trigger

Use the following code for the database trigger:

CREATE OR REPLACE TRIGGER sys.cname_flip_post_role_change 
AFTER DB_ROLE_CHANGE ON DATABASE
DECLARE
  v_db_name VARCHAR2(9);
  v_db_role VARCHAR2(16);
BEGIN
  SELECT DATABASE_ROLE  INTO v_db_role FROM V$DATABASE;
  SELECT DB_UNIQUE_NAME INTO v_db_name FROM V$DATABASE;

  IF v_db_role = 'PRIMARY' THEN
    BEGIN
      dbms_scheduler.drop_job('RW_CNAME_FLIP');
    EXCEPTION
      WHEN OTHERS THEN NULL;
    END;

    dbms_scheduler.create_job(
      job_name   => 'RW_CNAME_FLIP',
      job_type   => 'EXECUTABLE',
      number_of_arguments => 1,
      job_action => '/home/oracle/admin/bin/cname_switch.sh',
      enabled    => false,
      auto_drop  => true);

    dbms_scheduler.set_job_argument_value(
      job_name          => 'RW_CNAME_FLIP',
      argument_position => 1,
      argument_value    => v_db_name);

    BEGIN
      dbms_scheduler.run_job('RW_CNAME_FLIP');
    EXCEPTION
    WHEN OTHERS THEN
      raise_application_error(-20101, 'CNAME flip failed, check script error');
    END;

  END IF;

EXCEPTION
  WHEN OTHERS THEN
    raise_application_error(-20102, 'CNAME flip failed due to error: ' || SQLERRM);
END;
/

Shell script

This script determines the current CNAME, identifies the dependent A records, and maps the CNAME to the correct A records accordingly. This shell script is provided for reference assuming the naming conventions for db_name and db_unique_name as used in the sample configuration. You should review and modify the script to meet your specific requirements and organization standards.

As per the example shown earlier, the shell script is placed in the location /home/oracle/admin/bin/cname_switch.sh.

Note: It’s common for production databases to be restored or cloned to lower environments. If the script runs in such an environment, it could change the CNAME entries unexpectedly. To mitigate this, the shell script includes the restore_safeguard function. This function checks that the IP address assigned to the EC2 instance matches the A records configured for this database in Route 53. If no match is found, the script does not perform the CNAME failover.

#! /bin/bash
#set -x
​
# Variables may need to be changed to suit your environment
​
DB_NAME=$1
DB_IN=$1
echo "Orginal Input : ${DB_NAME}"
DB_NAME=`echo "${DB_NAME::-2}"`  # removing last 2 characters from DB_UNIQUE_NAME
DB_NAME=`echo "${DB_NAME}" | tr '[:upper:]' '[:lower:]'`
echo "Modified Input : ${DB_NAME}"
​
DB_DOMAIN=<<YOUR_AWS_ROUTE53_DOMAIN_NAME>>    # Update as per your AWS Route53 domain name
ZONE_ID=<<YOUR_AWS_ROUTE53_HOSTED_ZONE_ID>>   # Update as per your AWS Route53 hosted zone ID
EC2_METADATA='http://169.254.169.254/latest/dynamic/instance-identity/document'
​
# CNAME and A-Records related variables :
​
RW_CNAME=`echo "${DB_NAME}-rw.${DB_DOMAIN}"`
RO_CNAME=`echo "${DB_NAME}-ro.${DB_DOMAIN}"`
A_CNAME=`echo "${DB_NAME}-a-db.${DB_DOMAIN}"`
B_CNAME=`echo "${DB_NAME}-b-db.${DB_DOMAIN}"`
​
REGION=`curl -s ${EC2_METADATA}|grep region|awk -F\" '{print $4}'`
​
# Logfile configuration and file initialization
​
TS=`date +%Y%m%d_%H%M%S`
LOG_DIR=/tmp
CHANGE_SET_FILE=`echo "${LOG_DIR}/${DB_NAME}-CnameFlip-${TS}.json"`
LOG_FILE=`echo "${LOG_DIR}/${DB_NAME}-CnameFlip-${TS}.log"`
CONF_FILE=`echo "file://${CHANGE_SET_FILE}"`
​
# Function to check if current host IP matching with Route 53 configuration
​
IS_SAFE='Unsafe'
​
function restore_safeguard()
{
    AWS_TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
    LOCAL_IPV4=`curl -sH "X-aws-ec2-metadata-token: $AWS_TOKEN" -v http://169.254.169.254/latest/meta-data/local-ipv4`
    PUBLIC_IPV4=`curl -sH "X-aws-ec2-metadata-token: $AWS_TOKEN" -v http://169.254.169.254/latest/meta-data/public-ipv4`
    NOT_FOUND=`echo ${PUBLIC_IPV4} | grep '404 - Not Found' | wc -l`
​
    if [ ${NOT_FOUND} == 1 ]; then
       PUBLIC_IPV4='No Public IP Assigned'
    fi
​
    A_IP=$(aws route53 list-resource-record-sets --hosted-zone-id ${ZONE_ID} \
           --query 'ResourceRecordSets[?Type==`A`].{Name: Name, Value:ResourceRecords[0].Value}' | \
           jq -cr --arg DB_NAME "${DB_NAME}-a" '.[] | select( .Name | contains($DB_NAME)).Value')
​
    B_IP=$(aws route53 list-resource-record-sets --hosted-zone-id ${ZONE_ID} \
           --query 'ResourceRecordSets[?Type==`A`].{Name: Name, Value:ResourceRecords[0].Value}' | \
           jq -cr --arg DB_NAME "${DB_NAME}-b" '.[] | select( .Name | contains($DB_NAME)).Value')
​
    PREVIOUS_RW_ID=$(aws route53 list-resource-record-sets --hosted-zone-id ${ZONE_ID} \
           --query 'ResourceRecordSets[?Type==`CNAME`].{Name: Name, Value:ResourceRecords[0].Value}' | \
           jq -cr --arg DB_NAME "${DB_NAME}-rw" '.[] | select( .Name | contains($DB_NAME)).Value' | cut -d'-' -f2)
​
    if [ ${PREVIOUS_RW_ID} == 'a' ]; then
       RW_NODE_IP=${A_IP}
       RO_NODE_IP=${B_IP}
    else
       RW_NODE_IP=${B_IP}
       RO_NODE_IP=${A_IP}
    fi
​
    # Logging input values
​
    echo "Orginal Input   : ${DB_IN}"          | tee -a ${LOG_FILE}
    echo "Modified Input  : ${DB_NAME}"        | tee -a ${LOG_FILE}
    echo "Current RW ID   : ${PREVIOUS_RW_ID}" | tee -a ${LOG_FILE}
    echo "Host Private IP : ${LOCAL_IPV4}"     | tee -a ${LOG_FILE}
    echo "Host Public IP  : ${PUBLIC_IPV4}"    | tee -a ${LOG_FILE}
    echo "A Node IP       : ${A_IP}"           | tee -a ${LOG_FILE}
    echo "A Node IP       : ${B_IP}"           | tee -a ${LOG_FILE}
    echo "RW Node IP      : ${RW_NODE_IP}"     | tee -a ${LOG_FILE}
    echo "RO Node IP      : ${RO_NODE_IP}"     | tee -a ${LOG_FILE}
​
    if [ "${LOCAL_IPV4}" == "${RO_NODE_IP}" -o "${PUBLIC_IPV4}" == "${RO_NODE_IP}" ]; then
       IS_SAFE='Safe'
    else
       IS_SAFE='Unsafe'
    fi
}
​
restore_safeguard
​
if [ ${IS_SAFE} == 'Safe' ]; then
   echo "Safe for CNAME faliover..." | tee -a ${LOG_FILE}
else
   echo "Unsafe for CNAME faliover..." | tee -a ${LOG_FILE}
   echo "Aborting..."
   exit 1
fi
​
PRI_DB_ID=`nslookup ${RW_CNAME}|grep "canonical name"|cut -d'=' -f2|cut -d'-' -f2`
​
# Logging input values :
echo "Original Input     : ${DB_IN}"     | tee    ${LOG_FILE}
echo "Modified Input     : ${DB_NAME}"   | tee -a ${LOG_FILE}
echo "Current RW host ID : ${PRI_DB_ID}" | tee -a ${LOG_FILE}
​
echo -e "\nChange to be done : \n" | tee -a ${LOG_FILE}
​
if [ ${PRI_DB_ID} == 'a' ]; then
   echo "Changing ${RW_CNAME} from ${A_CNAME} to ${B_CNAME}" | tee -a ${LOG_FILE}
   echo "Changing ${RO_CNAME} from ${B_CNAME} to ${A_CNAME}" | tee -a ${LOG_FILE}
   TO_BE_RW_CNAME=${B_CNAME}
   TO_BE_RO_CNAME=${A_CNAME}
else
   echo "Changing ${RW_CNAME} from ${B_CNAME} to ${A_CNAME}" | tee -a ${LOG_FILE}
   echo "Changing ${RO_CNAME} from ${A_CNAME} to ${B_CNAME}" | tee -a ${LOG_FILE}
   TO_BE_RW_CNAME=${A_CNAME}
   TO_BE_RO_CNAME=${B_CNAME}
fi
​
R53_CHANGE=`echo -e "
{
  \"Comment\": \"Flip CNAMEs\",
  \"Changes\": [
    {
      \"Action\" : \"UPSERT\",
      \"ResourceRecordSet\" : {
        \"Name\" : \"${RW_CNAME}.\",
        \"Type\" : \"CNAME\",
        \"TTL\"  : 60,
        \"ResourceRecords\" : [{ \"Value\": \"${TO_BE_RW_CNAME}.\" }]
      }
    },
    {
      \"Action\" : \"UPSERT\",
      \"ResourceRecordSet\" : {
        \"Name\" : \"${RO_CNAME}\",
        \"Type\" : \"CNAME\",
        \"TTL\"  : 60,
        \"ResourceRecords\" : [{ \"Value\": \"${TO_BE_RO_CNAME}.\" }]
      }
    }
  ]
}
"`
​
echo -e "\nRoute53 Change Set :\n" | tee -a ${LOG_FILE}
echo ${R53_CHANGE} | tee -a ${LOG_FILE}
echo ${R53_CHANGE} > ${CHANGE_SET_FILE}
​
echo -e "\nCommand to Execute : " | tee -a ${LOG_FILE}
echo -e "\naws route53 change-resource-record-sets --hosted-zone-id ${ZONE_ID} \
         --change-batch ${CONF_FILE} \n" | tee -a ${LOG_FILE}
​
echo -e "\nExecution Result :\n"
aws route53 change-resource-record-sets --hosted-zone-id ${ZONE_ID} \
--change-batch ${CONF_FILE} | tee -a ${LOG_FILE}
​
echo -e "\nAfter Change :\n "
aws route53 list-resource-record-sets --hosted-zone-id ${ZONE_ID} | tee -a ${LOG_FILE}

Test the solution

The following screenshot shows the Route 53 console view of the domain mydbdomain before the switchover. The primary database is running on orcl-a-db.mydbdomain because orcl-rw.mydbdomain points to it.

Figure 3. Route 53 console view of the domain mydbdomain before the switchover

The following SQL displays the current role of both the primary and standby databases and the host names they are currently running on.

[oracle@ip-10-0-0-5 sql]$ cat db_info.sql

ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD:HH24:MI';
set lines 150 pages 200
col HOST_NAME for a30 trunc

select d.NAME, d.db_unique_name, d.DATABASE_ROLE, d.OPEN_MODE, i.INSTANCE_NAME, 
i.HOST_NAME, i.STARTUP_TIME
from v$instance i, v$database d;

[oracle@ip-10-0-0-5 sql]$ sqlplus system@orclrw

SQL> @db_info

NAME  DB_UNIQUE_NAME DATABASE_ROLE OPEN_MODE INSTANCE_NAME HOST_NAME STARTUP_TIME
------ ---------------- -------------- ---------------- ------------------------------ ----------------
ORCL orcl_a PRIMARY READ WRITE orcl ip-10-0-0-5.us-west-2.compute. 2020-05-24:01:47

[oracle@ip-10-0-0-5 sql]$ sqlplus system@orclro

SQL> @db_info

NAME DB_UNIQUE_NAME DATABASE_ROLE OPEN_MODE INSTANCE_NAME HOST_NAME STARTUP_TIME
------ ---------------- -------------------- -------------- ------------------------------- ----------------
ORCL orcl_b PHYSICAL STANDBY READ ONLY WITH APPLY orcl ip-10-0-32-5.us-west-2.compute. 2020-05-24:05:50

Let’s initiate the switchover:

[oracle@ip-10-0-0-5 sql]$ dgmgrl /
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Wed May 27 06:42:51 2020

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "orcl_a"
Connected as SYSDG.
DGMGRL> show configuration;

Configuration - awsguard

  Protection Mode: MaxPerformance
  Members:
  orcl_a - Primary database
    orcl_b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 39 seconds ago)

DGMGRL> switchover to orcl_b;
Performing switchover NOW, please wait...
Operation requires a connection to database "orcl_b"
Connecting ...
Connected to "orcl_b"
Connected as SYSDBA.
New primary database "orcl_b" is opening...
Oracle Clusterware is restarting database "orcl_a" ...
Switchover succeeded, new primary is "orcl_b"
DGMGRL>
DGMGRL> show configuration;

Configuration - awsguard

  Protection Mode: MaxPerformance
  Members:
  orcl_b - Primary database
    orcl_a - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 67 seconds ago)

DGMGRL>

Now that the switchover is complete, let’s connect to the database using the orclrw and orclro TNS entries using the following code:

[oracle@ip-10-0-0-5 sql]$ sqlplus system@orclrw

SQL> @db_info

NAME DB_UNIQUE_NAME  DATABASE_ROLE  OPEN_MODE     INSTANCE_NAME  HOST_NAME                      STARTUP_TIME
----- -------------- ------------- -------------- ------------------------------ ----------------
ORCL  orcl_b PRIMARY        READ WRITE    orcl          ip-10-0-32-5.us-west-2.compute 2020-05-24:05:50


[oracle@ip-10-0-0-5 sql]$ sqlplus system@orclro

SQL> @db_info

NAME  DB_UNIQUE_NAME  DATABASE_ROLE     OPEN_MODE            INSTANCE_NAME  HOST_NAME                      STARTUP_TIME
----- ----------------- -------------------- -------------- ------------------------------ ----------------
ORCL orcl_a PHYSICAL STANDBY  READ ONLY WITH APPLY orcl          ip-10-0-0-5.us-west-2.compute. 2020-05-27:06:43

The following screenshot shows the Route 53 console view of the domain mydbdomain after the switchover. The primary database is now running on orcl-b-db.mydbdomain because orcl-rw.mydbdomain points to it.

Figure 4. Route 53 console view of the domain mydbdomain after the switchover

Conclusion

Application connectivity to a Data Guard environment can be challenging, especially when the application configuration doesn’t support multiple hostnames or listener endpoints. In this post, we discussed step-by-step details to enable seamless connectivity to Data Guard environments using Route 53 CNAME records, a database trigger, and a shell script. You can use these artifacts to direct the DB connections to the database with the right role seamlessly without application changes. If you are using Data Guard Observer for automated failover, another blog, Setup a high availability design for Oracle Data Guard (Fast-Start Failover) using Amazon Route 53 discusses an alternate mechanism to achieve the same result.

New – Amazon Aurora I/O-Optimized Cluster Configuration with Up to 40% Cost Savings for I/O-Intensive Applications

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-aurora-i-o-optimized-cluster-configuration-with-up-to-40-cost-savings-for-i-o-intensive-applications/

Since Amazon Aurora launched in 2014, hundreds of thousands of customers have chosen Aurora to run their most demanding applications. Aurora provides unparalleled high performance and availability at global scale with full MySQL and PostgreSQL compatibility at up to one-tenth the cost of commercial databases.

Many customers benefit from the cost-effectiveness of Aurora’s current simple, pay-per-request pricing for input/output (I/O) usage, removing the need to provision I/Os in advance. Customers also benefit from additional cost-saving innovations such as Amazon Aurora Serverless v2 (ASv2), which provides seamless scaling in fine-grained increments based on the application’s demands. For workloads with spikes in demand, you can save up to 90 percent in costs vs. provisioning capacity for peak load with ASv2.

Today, we are announcing the general availability of Amazon Aurora I/O-Optimized, a new cluster configuration that offers improved price performance and predictable pricing for customers with I/O-intensive applications, such as e-commerce applications, payment processing systems, and more. Aurora I/O-Optimized offers improved performance, increasing throughput and reducing latency to support your most demanding workloads.

You can now confidently predict costs for your most I/O-intensive workloads, with up to 40 percent cost savings when your I/O spend exceeds 25 percent of your current Aurora database spend. If you are using Reserved Instances, you will see even greater cost savings.

You now have the flexibility to choose between two configurations: Aurora Standard, the existing pay-per-request pricing model that is cost-effective for applications with low-to-moderate I/O usage, and the new Aurora I/O-Optimized configuration for I/O-intensive applications.

Getting Started with Aurora I/O-Optimized
You can create a new database cluster using the Aurora I/O-Optimized configuration or convert your existing database clusters with a few clicks in the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs.
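As a rough illustration of the API path, the boto3 sketch below creates a new Aurora MySQL cluster with the I/O-Optimized configuration and converts an existing cluster. The cluster identifiers, credentials, and engine version string are placeholders, and the StorageType values aurora-iopt1 (I/O-Optimized) and aurora (Standard) are assumptions you should verify against the current RDS API reference.

import boto3

rds = boto3.client("rds")

# Create a new Aurora MySQL cluster using the I/O-Optimized storage configuration.
rds.create_db_cluster(
    DBClusterIdentifier="my-io-optimized-cluster",   # placeholder name
    Engine="aurora-mysql",
    EngineVersion="8.0.mysql_aurora.3.03.1",          # example version; 3.03.1 or higher is required
    MasterUsername="admin",
    MasterUserPassword="change-me",                   # placeholder; use Secrets Manager in practice
    StorageType="aurora-iopt1",                       # Aurora I/O-Optimized
)

# Convert an existing cluster; you can switch to I/O-Optimized once every 30 days.
rds.modify_db_cluster(
    DBClusterIdentifier="my-existing-cluster",        # placeholder name
    StorageType="aurora-iopt1",                       # or "aurora" to switch back to Aurora Standard
    ApplyImmediately=True,
)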

For the Aurora MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Edition, you can choose either the Aurora Standard or Aurora I/O-Optimized configuration.

The Aurora I/O-Optimized configuration is available in Aurora MySQL version 3.03.1 and higher and in Aurora PostgreSQL versions 15.2 and higher, 14.7 and higher, and 13.10 and higher.

This configuration supports Intel-based Aurora database instance types such as t3, r5, and r6i, Graviton-based database instance types such as t4g, r7g, and x2g, Aurora Serverless v2, Aurora Global Database, on-demand Aurora database instances, and reserved instances.

R7g instances for Amazon Aurora are powered by the latest generation AWS Graviton3 processors, delivering up to 30 percent performance gains and up to 20 percent improved price performance for Aurora, as compared to R6g instances.

In your existing Aurora clusters, you can switch the storage configuration to Aurora I/O-Optimized once every 30 days or switch back to Aurora Standard at any time. You can change the cluster storage configuration only at the cluster level. The change applies to all instances in the cluster.

After changing the configuration, you don’t need to reboot the database instances within the cluster to take advantage of the price-performance benefits of Aurora I/O-Optimized.

Now Available
Amazon Aurora I/O-Optimized configuration is now generally available for Amazon Aurora MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Edition in most AWS Regions where Aurora is available, with China (Beijing), China (Ningxia), AWS GovCloud (US-East), and AWS GovCloud (US-West) Regions coming soon.

Aurora is billed differently for the two configurations: Aurora Standard or Aurora I/O-Optimized. The latter doesn’t charge for I/Os; instead, it charges a set price for compute and storage relative to the former. For I/O-intensive applications, its price/performance will be better, and you can save up to 40 percent on costs. To see pricing examples, visit the Aurora Pricing page.

To learn more, read Amazon Aurora storage and reliability in the AWS documentation. Give it a try, and please send feedback to AWS re:Post for Amazon Aurora or through your usual AWS support contacts.

Channy

AWS Week in Review – March 20, 2023

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-week-in-review-march-20-2023/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

A new week starts, and Spring is almost here! If you’re curious about AWS news from the previous seven days, I got you covered.

Last Week’s Launches
Here are the launches that got my attention last week:

Amazon S3 – Last week there was AWS Pi Day 2023 celebrating 17 years of innovation since Amazon S3 was introduced on March 14, 2006. For the occasion, the team released many new capabilities:

Amazon Linux 2023 – Our new Linux-based operating system is now generally available. Sébastien’s post is full of tips and info.

Application Auto Scaling – You can now use arithmetic operations and mathematical functions to customize the metrics used with Target Tracking policies, for example to scale based on your own application-specific metrics. Read how it works with Amazon ECS services.

AWS Data Exchange for Amazon S3 is now generally available – You can now share and find data files directly from S3 buckets, without the need to create or manage copies of the data.

Amazon Neptune – Now offers a graph summary API to help understand important metadata about property graphs (PG) and resource description framework (RDF) graphs. Neptune added support for Slow Query Logs to help identify queries that need performance tuning.

Amazon OpenSearch Service – The team introduced security analytics that provides new threat monitoring, detection, and alerting features. The service now supports OpenSearch version 2.5 that adds several new features such as support for Point in Time Search and improvements to observability and geospatial functionality.

AWS Lake Formation and Apache Hive on Amazon EMR – Introduced fine-grained access controls that allow data administrators to define and enforce fine-grained table and column level security for customers accessing data via Apache Hive running on Amazon EMR.

Amazon EC2 M1 Mac Instances – You can now update guest environments to a specific or the latest macOS version without having to tear down and recreate the existing macOS environments.

AWS Chatbot – Now Integrates With Microsoft Teams to simplify the way you troubleshoot and operate your AWS resources.

Amazon GuardDuty RDS Protection for Amazon Aurora – Now generally available to help profile and monitor access activity to Aurora databases in your AWS account without impacting database performance.

AWS Database Migration Service – Now supports validation to ensure that data is migrated accurately to S3 and can now generate an AWS Glue Data Catalog when migrating to S3.

AWS Backup – You can now back up and restore virtual machines running on VMware vSphere 8 and with multiple vNICs.

Amazon Kendra – There are new connectors to index documents and search for information across these new content sources: Confluence Server, Confluence Cloud, Microsoft SharePoint OnPrem, and Microsoft SharePoint Cloud. This post shows how to use the Amazon Kendra connector for Microsoft Teams.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
A few more blog posts you might have missed:

Women founders Q&A – We’re talking to six women founders and leaders about how they’re making impacts in their communities, industries, and beyond.

What you missed at the 2023 IMAGINE: Nonprofit conference – Where hundreds of nonprofit leaders, technologists, and innovators gathered to learn and share how AWS can drive a positive impact for people and the planet.

Monitoring load balancers using Amazon CloudWatch anomaly detection alarms – The metrics emitted by load balancers provide crucial and unique insight into service health, service performance, and end-to-end network performance.

Extend geospatial queries in Amazon Athena with user-defined functions (UDFs) and AWS Lambda – Using a solution based on Uber’s Hexagonal Hierarchical Spatial Index (H3) to divide the globe into equally-sized hexagons.

How cities can use transport data to reduce pollution and increase safety – A guest post by Rikesh Shah, outgoing head of open innovation at Transport for London.

For AWS open-source news and updates, here’s the latest newsletter curated by Ricardo to bring you the most recent updates on open-source projects, posts, events, and more.

Upcoming AWS Events
Here are some opportunities to meet:

AWS Public Sector Day 2023 (March 21, London, UK) – An event dedicated to helping public sector organizations use technology to achieve more with less through the current challenging conditions.

Women in Tech at Skills Center Arlington (March 23, VA, USA) – Let’s celebrate the history and legacy of women in tech.

The AWS Summits season is warming up! You can sign up here to be notified when registration opens in your area.

That’s all from me for this week. Come back next Monday for another Week in Review!

Danilo

New for Amazon Redshift – Simplify Data Ingestion and Make Your Data Warehouse More Secure and Reliable

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-redshift-simplify-data-ingestion-and-make-your-data-warehouse-more-secure-and-reliable/

When we talk with customers, we hear that they want to be able to harness insights from data in order to make timely, impactful, and actionable business decisions. A common pattern with data-driven organizations is that they have many different data sources they need to ingest into their analytics systems. This requires them to build manual data pipelines spanning across their operational databases, data lakes, streaming data, and data within their warehouse. As a consequence of this complex setup, it can take data engineers weeks or even months to build data ingestion pipelines. These data pipelines are costly, and the delays can lead to missed business opportunities. Additionally, data warehouses are increasingly becoming mission critical systems that require high availability, reliability, and security.

Amazon Redshift is a fully managed petabyte-scale data warehouse used by tens of thousands of customers to easily, quickly, securely, and cost-effectively analyze all their data at any scale. This year at re:Invent, Amazon Redshift has announced a number of features to help you simplify data ingestion and get to insights easily and quickly, within a secure, reliable environment.

In this blog, I introduce some of these new features that fit into two main categories:

  • Simplify data ingestion
    • Amazon Redshift now supports auto-copy from Amazon S3 (available in preview). With this new capability, Amazon Redshift automatically loads the files that arrive in an Amazon Simple Storage Service (Amazon S3) location that you specify into your data warehouse. The files can use any of the formats supported by the Amazon Redshift copy command, such as CSV, JSON, Parquet, and Avro. In this way, you don’t need to manually or repeatedly run copy procedures. Amazon Redshift automates file ingestion and takes care of data-loading steps under the hood.
    • With Amazon Aurora zero-ETL integration with Amazon Redshift, you can use Amazon Redshift for near real-time analytics and machine learning on petabytes of transactional data stored on Amazon Aurora MySQL databases (available in limited preview). With this capability, you can choose the Amazon Aurora databases containing the data you want to analyze with Amazon Redshift. Data is then replicated into your data warehouse within seconds after transactional data is written into Amazon Aurora, eliminating the need to build and maintain complex data pipelines. You can replicate data from multiple Amazon Aurora databases into the same Amazon Redshift instance to run analytics across multiple applications. With near real-time access to transactional data, you can leverage Amazon Redshift’s analytics and capabilities, such as built-in machine learning (ML), materialized views, data sharing, and federated access to multiple data stores and data lakes, to derive insights from transactional and other data.
    • With the general availability of Amazon Redshift Streaming Ingestion, you can now natively ingest hundreds of megabytes of data per second from Amazon Kinesis Data Streams and Amazon MSK into an Amazon Redshift materialized view and query it in seconds. Learn more in this post.
  • Make your data warehouse more secure and reliable
    • You can now improve the availability of your data warehouse by choosing multiple Availability Zone (AZ) deployments. Multi-AZ deployments for your Amazon Redshift clusters are available in preview and reduce recovery times to seconds through automatic recovery. In this way, you can build solutions that are more compliant with the recommendations of the Reliability Pillar of the AWS Well-Architected Framework.
    • With dynamic data masking (available in preview), you can protect sensitive information stored in your data warehouse and ensure that only the relevant data is accessible by users based on their roles. You can limit how much identifiable data is visible to users using multiple levels of policies so different users and groups can have different levels of data access without having to create multiple copies of data. Dynamic data masking complements other granular access control capabilities in Amazon Redshift including row-level and column-level security and role-based access controls. In this way, Dynamic Data Masking helps you meet requirements for GDPR, CCPA, and other privacy regulations.
    • Amazon Redshift now supports central access controls for data sharing with AWS Lake Formation (available in public preview). You can now use Lake Formation to simplify governance of data shared from Amazon Redshift and centrally manage granular access across all data-sharing consumers.

There have been other interesting news for Amazon Redshift at re:Invent you might have already heard about:

  • The general availability of Amazon Redshift integration for Apache Spark makes it easy to build and run Spark applications on Amazon Redshift and Redshift Serverless, opening up the data warehouse for a broader set of AWS analytics and machine learning solutions.
  • AWS Backup now supports Amazon Redshift. AWS Backup allows you to define a central backup policy to manage data protection of your applications and can also protect your Amazon Redshift clusters. In this way, you have a consistent experience when managing data protection across all supported services.

Availability and Pricing
Multi-AZ deployments, central access control for data sharing with AWS Lake Formation, auto-copy from Amazon S3, and dynamic data masking are available in preview in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Stockholm).

There is no additional cost for using auto-copy from Amazon S3 and near real-time analytics on transactional data. There is no extra charge for dynamic data masking and central access control for data sharing. For more information, see Amazon Redshift pricing.

These new capabilities take you one step further in analyzing all your data across data sources with simple data ingestion capabilities, while improving the security and reliability of your data warehouse.

Danilo

Announcing Amazon DocumentDB Elastic Clusters

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/announcing-amazon-documentdb-elastic-clusters/

Amazon DocumentDB (with MongoDB compatibility) is a scalable, highly durable, and fully managed database service for operating mission-critical JSON workloads. It is one of AWS’s fastest-growing services, with customers including BBC, Dow Jones, and Samsung relying on Amazon DocumentDB to run their JSON workloads at scale.

Today I am excited to announce the general availability of Amazon DocumentDB Elastic Clusters. Elastic Clusters enables you to elastically scale your document database to handle virtually any number of writes and reads, with petabytes of storage capacity. Elastic Clusters simplifies how customers interact with Amazon DocumentDB by automatically managing the underlying infrastructure and removing the need to create, remove, upgrade, or scale instances.

A Few Concepts about Elastic Clusters
Sharding – A popular database concept also known as partitioning, sharding splits large data sets into smaller data sets across multiple nodes, enabling customers to scale out their database beyond vertical scaling limits. Elastic Clusters uses sharding to partition data across Amazon DocumentDB’s distributed storage system.

Elastic Clusters – Elastic Clusters are Amazon DocumentDB clusters that allow you to scale your workload’s throughput to millions of writes/reads per second and storage to petabytes. An Elastic Cluster comprises one or more shards, each of which has its own compute and storage volume. Elastic Clusters are highly available across three Availability Zones (AZs) by default, with six copies of your data replicated across these three AZs. You can create Elastic Clusters using the Amazon DocumentDB API, AWS SDK, AWS CLI, AWS CloudFormation, or the AWS console.

Scale Workloads with Little to No Impact – With Elastic Clusters, your database can scale to millions of operations with little to no downtime or performance impact.

Integration with Other AWS Services – Elastic Clusters integrates with other AWS services in the same way Amazon DocumentDB does today. First, you can monitor the health and performance of your Elastic Clusters using Amazon CloudWatch. Second, you can set up authentication and authorization for resources such as clusters through AWS Identity and Access Management (IAM) users and roles and use Amazon Virtual Private Cloud (Amazon VPC) for secure VPC-only connections. Last, you can use AWS Glue to import and export data from and to other AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon OpenSearch Service.

Getting Started with Elastic Clusters
Previously, I mentioned that you can use either the AWS console, AWS CLI, or AWS SDK to create Elastic Clusters. In the examples below, we will look at how you can create a cluster, scale up or out, and scale in or down using the AWS CLI:

Create a Cluster
When creating a cluster, you specify the number of vCPUs that you want for your Elastic Cluster. Along with the vCPUs that you provision, you get a proportionate amount of memory. Elastic Clusters automatically provisions the necessary infrastructure (shards and instances) on your behalf.
aws docdb-elastic create-cluster \
    --cluster-name foo \
    --shard-capacity 2 \
    --shard-count 4 \
    --auth-type PLAIN_TEXT \
    --admin-user-name docdbelasticadmin \
    --admin-user-password password

Scale Up or Out
If you need more compute and storage to handle an increase in traffic, modify the shard-count parameter. Elastic Clusters scales the underlying infrastructure up or out to give you additional compute and storage capacity.
aws docdb-elastic update-cluster \
    --cluster-arn foo-arn \
    --shard-count 8

Scale In or Down
If you no longer need the compute and storage that you currently have provisioned, either due to a decline in database traffic or the fact that you originally over-provisioned, modify the shard-count parameter. Elastic Clusters scales the underlying infrastructure in or down.
aws docdb-elastic update-cluster \
    --cluster-arn foo-arn \
    --shard-count 4
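The same operations are available through the AWS SDKs. Here is a minimal boto3 sketch mirroring the CLI calls above; the parameter names follow the docdb-elastic API’s lowerCamelCase style as an assumption, so confirm them against your SDK version before relying on this.

import boto3

docdb_elastic = boto3.client("docdb-elastic")

# Create the cluster with the same values as the CLI example above.
cluster = docdb_elastic.create_cluster(
    clusterName="foo",
    shardCapacity=2,
    shardCount=4,
    authType="PLAIN_TEXT",
    adminUserName="docdbelasticadmin",
    adminUserPassword="password",   # use Secrets Manager in practice
)["cluster"]

# Scale out by raising the shard count, or scale in by lowering it.
docdb_elastic.update_cluster(
    clusterArn=cluster["clusterArn"],
    shardCount=8,
)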

General Availability of Elastic Clusters for Amazon DocumentDB
Amazon DocumentDB Elastic Clusters is now available in all AWS Regions where Amazon DocumentDB is available, except China and AWS GovCloud. To learn more, visit the Amazon DocumentDB page.

Veliswa x

New for Amazon Redshift – General Availability of Streaming Ingestion for Kinesis Data Streams and Managed Streaming for Apache Kafka

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-redshift-general-availability-of-streaming-ingestion-for-kinesis-data-streams-and-managed-streaming-for-apache-kafka/

Ten years ago, just a few months after I joined AWS, Amazon Redshift was launched. Over the years, many features have been added to improve performance and make it easier to use. Amazon Redshift now allows you to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes. More recently, Amazon Redshift Serverless became generally available to make it easier to run and scale analytics without having to manage your data warehouse infrastructure.

To process data as quickly as possible from real-time applications, customers are adopting streaming engines like Amazon Kinesis and Amazon Managed Streaming for Apache Kafka. Previously, to load streaming data into your Amazon Redshift database, you’d have to configure a process to stage data in Amazon Simple Storage Service (Amazon S3) before loading. Doing so would introduce a latency of one minute or more, depending on the volume of data.

Today, I am happy to share the general availability of Amazon Redshift Streaming Ingestion. With this new capability, Amazon Redshift can natively ingest hundreds of megabytes of data per second from Amazon Kinesis Data Streams and Amazon MSK into an Amazon Redshift materialized view and query it in seconds.

Architecture diagram.

Streaming ingestion benefits from the ability to optimize query performance with materialized views and allows the use of Amazon Redshift more efficiently for operational analytics and as the data source for real-time dashboards. Another interesting use case for streaming ingestion is analyzing real-time data from gamers to optimize their gaming experience. This new integration also makes it easier to implement analytics for IoT devices, clickstream analysis, application monitoring, fraud detection, and live leaderboards.

Let’s see how this works in practice.

Configuring Amazon Redshift Streaming Ingestion
Apart from managing permissions, Amazon Redshift streaming ingestion can be configured entirely with SQL within Amazon Redshift. This is especially useful for business users who lack access to the AWS Management Console or the expertise to configure integrations between AWS services.

You can set up streaming ingestion in three steps:

  1. Create or update an AWS Identity and Access Management (IAM) role to allow access to the streaming platform you use (Kinesis Data Streams or Amazon MSK). Note that the IAM role should have a trust policy that allows Amazon Redshift to assume the role.
  2. Create an external schema to connect to the streaming service.
  3. Create a materialized view that references the streaming object (Kinesis data stream or Kafka topic) in the external schemas.

After that, you can query the materialized view to use the data from the stream in your analytics workloads. Streaming ingestion works with Amazon Redshift provisioned clusters and with the new serverless option. To maximize simplicity, I am going to use Amazon Redshift Serverless in this walkthrough.

To prepare my environment, I need a Kinesis data stream. In the Kinesis console, I choose Data streams in the navigation pane and then Create data stream. For the Data stream name, I use my-input-stream and then leave all other options set to their default value. After a few seconds, the Kinesis data stream is ready. Note that by default I am using on-demand capacity mode. In a development or test environment, you can choose provisioned capacity mode with one shard to optimize costs.
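If you’d rather script that step, a minimal boto3 equivalent looks like the following sketch; it creates the same on-demand stream used in this walkthrough.

import boto3

kinesis = boto3.client("kinesis")

# Create the stream in on-demand capacity mode, matching the console walkthrough.
kinesis.create_stream(
    StreamName="my-input-stream",
    StreamModeDetails={"StreamMode": "ON_DEMAND"},
)

# Wait until the stream is ACTIVE before writing to it.
kinesis.get_waiter("stream_exists").wait(StreamName="my-input-stream")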

Now, I create an IAM role to give Amazon Redshift access to the my-input-stream Kinesis data streams. In the IAM console, I create a role with this policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:DescribeStreamSummary",
                "kinesis:GetShardIterator",
                "kinesis:GetRecords",
                "kinesis:DescribeStream"
            ],
            "Resource": "arn:aws:kinesis:*:123412341234:stream/my-input-stream"
        },
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:ListStreams",
                "kinesis:ListShards"
            ],
            "Resource": "*"
        }
    ]
}

To allow Amazon Redshift to assume the role, I use the following trust policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
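To script the role creation instead of using the IAM console, a sketch like the following creates the role with the trust policy above and attaches the Kinesis permissions as an inline policy. The role and policy names are placeholders (the role name matches the ARN used later in this walkthrough), and the account ID is the same sample value used in the permissions policy.

import json

import boto3

iam = boto3.client("iam")

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

kinesis_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:DescribeStreamSummary",
                "kinesis:GetShardIterator",
                "kinesis:GetRecords",
                "kinesis:DescribeStream",
            ],
            "Resource": "arn:aws:kinesis:*:123412341234:stream/my-input-stream",
        },
        {
            "Effect": "Allow",
            "Action": ["kinesis:ListStreams", "kinesis:ListShards"],
            "Resource": "*",
        },
    ],
}

# Create the role that Amazon Redshift will assume.
iam.create_role(
    RoleName="redshift-streaming-ingestion",          # placeholder name, matching the ARN used later
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the Kinesis read permissions shown earlier as an inline policy.
iam.put_role_policy(
    RoleName="redshift-streaming-ingestion",
    PolicyName="kinesis-read-access",                  # placeholder name
    PolicyDocument=json.dumps(kinesis_policy),
)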

In the Amazon Redshift console, I choose Redshift serverless from the navigation pane and create a new workgroup and namespace, similar to what I did in this blog post. When I create the namespace, in the Permissions section, I choose Associate IAM roles from the dropdown menu. Then, I select the role I just created. Note that the role is visible in this selection only if the trust policy allows Amazon Redshift to assume it. After that, I complete the creation of the namespace using the default options. After a few minutes, the serverless database is ready for use.

In the Amazon Redshift console, I choose Query editor v2 in the navigation pane. I connect to the new serverless database by choosing it from the list of resources. Now, I can use SQL to configure streaming ingestion. First, I create an external schema that maps to the streaming service. Because I am going to use simulated IoT data as an example, I call the external schema sensors.

CREATE EXTERNAL SCHEMA sensors
FROM KINESIS
IAM_ROLE 'arn:aws:iam::123412341234:role/redshift-streaming-ingestion';

To access the data in the stream, I create a materialized view that selects data from the stream. In general, materialized views contain a precomputed result set based on the result of a query. In this case, the query is reading from the stream, and Amazon Redshift is the consumer of the stream.

Because streaming data is going to be ingested as JSON data, I have two options:

  1. Leave all the JSON data in a single column and use Amazon Redshift capabilities to query semi-structured data.
  2. Extract JSON properties into their own separate columns.

Let’s see the pros and cons of both options.

The approximate_arrival_timestamp, partition_key, shard_id, and sequence_number columns in the SELECT statement are provided by Kinesis Data Streams. The record from the stream is in the kinesis_data column. The refresh_time column is provided by Amazon Redshift.

To leave the JSON data in a single column of the sensor_data materialized view, I use the JSON_PARSE function:

CREATE MATERIALIZED VIEW sensor_data AUTO REFRESH YES AS
    SELECT approximate_arrival_timestamp,
           partition_key,
           shard_id,
           sequence_number,
           refresh_time,
           JSON_PARSE(kinesis_data) as payload
      FROM sensors."my-input-stream";

Because I used the AUTO REFRESH YES parameter, the content of the materialized view is automatically refreshed when there is new data in the stream.

To extract the JSON properties into separate columns of the sensor_data_extract materialized view, I use the JSON_EXTRACT_PATH_TEXT function:

CREATE MATERIALIZED VIEW sensor_data_extract AUTO REFRESH YES AS
    SELECT approximate_arrival_timestamp,
           partition_key,
           shard_id,
           sequence_number,
           refresh_time,
           JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kinesis_data, 'utf-8'),'sensor_id')::VARCHAR(8) as sensor_id,
           JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kinesis_data, 'utf-8'),'current_temperature')::DECIMAL(10,2) as current_temperature,
           JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kinesis_data, 'utf-8'),'status')::VARCHAR(8) as status,
           JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kinesis_data, 'utf-8'),'event_time')::CHARACTER(26) as event_time
      FROM sensors."my-input-stream";

Loading Data into the Kinesis Data Stream
To put data in the my-input-stream Kinesis Data Stream, I use the following random_data_generator.py Python script simulating data from IoT sensors:

import datetime
import json
import random
import boto3

STREAM_NAME = "my-input-stream"


def get_random_data():
    current_temperature = round(10 + random.random() * 170, 2)
    if current_temperature > 160:
        status = "ERROR"
    elif current_temperature > 140 or random.randrange(1, 100) > 80:
        status = random.choice(["WARNING","ERROR"])
    else:
        status = "OK"
    return {
        'sensor_id': random.randrange(1, 100),
        'current_temperature': current_temperature,
        'status': status,
        'event_time': datetime.datetime.now().isoformat()
    }


def send_data(stream_name, kinesis_client):
    while True:
        data = get_random_data()
        partition_key = str(data["sensor_id"])
        print(data)
        kinesis_client.put_record(
            StreamName=stream_name,
            Data=json.dumps(data),
            PartitionKey=partition_key)


if __name__ == '__main__':
    kinesis_client = boto3.client('kinesis')
    send_data(STREAM_NAME, kinesis_client)

I start the script and see the records that are being put in the stream. They use a JSON syntax and contain random data.

$ python3 random_data_generator.py
{'sensor_id': 66, 'current_temperature': 69.67, 'status': 'OK', 'event_time': '2022-11-20T18:31:30.693395'}
{'sensor_id': 45, 'current_temperature': 122.57, 'status': 'OK', 'event_time': '2022-11-20T18:31:31.486649'}
{'sensor_id': 15, 'current_temperature': 101.64, 'status': 'OK', 'event_time': '2022-11-20T18:31:31.671593'}
...

Querying Streaming Data from Amazon Redshift
To compare the two materialized views, I select the first ten rows from each of them:

  • In the sensor_data materialized view, the JSON data in the stream is in the payload column. I can use Amazon Redshift JSON functions to access data stored in JSON format.
  • In the sensor_data_extract materialized view, the JSON data in the stream has been extracted into different columns: sensor_id, current_temperature, status, and event_time.

Now I can use the data in these views in my analytics workloads together with the data in my data warehouse, my operational databases, and my data lake. I can use the data in these views together with Redshift ML to train a machine learning model or use predictive analytics. Because materialized views support incremental updates, the data in these views can be efficiently used as a data source for dashboards, for example, using Amazon Redshift as a data source for Amazon Managed Grafana.
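Outside of Query Editor v2, the same materialized views can be queried programmatically, for example through the Redshift Data API. A minimal boto3 sketch might look like the following; the workgroup and database names are placeholders for the serverless setup used in this walkthrough.

import time

import boto3

redshift_data = boto3.client("redshift-data")

# Submit the query against the serverless workgroup; the Data API is asynchronous.
statement = redshift_data.execute_statement(
    WorkgroupName="default",        # placeholder workgroup name
    Database="dev",                 # placeholder database name
    Sql="SELECT sensor_id, current_temperature, status, event_time "
        "FROM sensor_data_extract ORDER BY event_time DESC LIMIT 10;",
)

# Poll until the statement finishes, then fetch the rows.
while True:
    desc = redshift_data.describe_statement(Id=statement["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    for record in redshift_data.get_statement_result(Id=statement["Id"])["Records"]:
        print(record)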

Availability and Pricing
Amazon Redshift streaming ingestion for Kinesis Data Streams and Managed Streaming for Apache Kafka is generally available today in all commercial AWS Regions.

There are no additional costs for using Amazon Redshift streaming ingestion. For more information, see Amazon Redshift pricing.

It’s never been easier to use low-latency streaming data in your data warehouse and in your data lake. Let us know what you build with this new capability!

Danilo

New – A Fully Managed Schema Conversion in AWS Database Migration Service

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-a-fully-managed-schema-conversion-in-aws-database-migration-service/

Since we launched AWS Database Migration Service (AWS DMS) in 2016, customers have securely migrated more than 800,000 databases to AWS with minimal downtime. AWS DMS supports migration between 20+ database and analytics engines, such as Oracle to Amazon Aurora MySQL, MySQL to Amazon Relational Database Service (Amazon RDS) for MySQL, Microsoft SQL Server to Amazon Aurora PostgreSQL, MongoDB to Amazon DocumentDB, Oracle to Amazon Redshift, and to and from Amazon Simple Storage Service (Amazon S3).

Specifically, the AWS Schema Conversion Tool (AWS SCT) makes heterogeneous database and data warehouse migrations predictable and can automatically convert the source schema and a majority of the database code objects, including views, stored procedures, and functions, to a format compatible with the target engine. For example, it supports the conversion of Oracle PL/SQL and SQL Server T-SQL code to equivalent code in the Amazon Aurora MySQL dialect of SQL or the equivalent PL/pgSQL code in PostgreSQL. You can download the AWS SCT for your platform, including Windows or Linux (Fedora and Ubuntu).

Today we announce fully managed AWS DMS Schema Conversion, which streamlines database migrations by making schema assessment and conversion available inside AWS DMS. With DMS Schema Conversion, you can now plan, assess, convert, and migrate under one central DMS service. You can access the features of DMS Schema Conversion in the AWS Management Console without downloading and executing AWS SCT.

AWS DMS Schema Conversion automatically converts your source database schemas, and a majority of the database code objects to a format compatible with the target database. This includes tables, views, stored procedures, functions, data types, synonyms, and so on, similar to AWS SCT. Any objects that cannot be automatically converted are clearly marked as action items with prescriptive instructions on how to migrate to AWS manually.

In this launch, DMS Schema Conversion supports the following databases as sources for migration projects:

  • Microsoft SQL Server version 2008 R2 and higher
  • Oracle version 10.2 and later, 11g and up to 12.2, 18c, and 19c

DMS Schema Conversion supports the following databases as targets for migration projects:

  • Amazon RDS for MySQL version 8.x
  • Amazon RDS for PostgreSQL version 14.x

Setting Up AWS DMS Schema Conversion
To get started with DMS Schema Conversion, and if it is your first time using AWS DMS, complete the setup tasks: create a virtual private cloud (VPC) using the Amazon VPC service, plus your source and target databases. To learn more, see Prerequisites for AWS Database Migration Service in the AWS documentation.

In the AWS DMS console, you can see new menus to set up Instance profiles, add Data providers, and create Migration projects.

Before you create your migration project, set up an instance profile by choosing Instance profiles in the left pane. An instance profile specifies network and security settings for your DMS Schema Conversion instances. You can create multiple instance profiles and select an instance profile to use for each migration project.

Choose Create instance profile and specify your default VPC or a new VPC, Amazon Simple Storage Service (Amazon S3) bucket to store your schema conversion metadata, and additional settings such as AWS Key Management Service (AWS KMS) keys.

You can create the simplest network configuration with a single VPC configuration. If your source or target data providers are in different VPCs, you can create your instance profile in one of the VPCs, and then link these two VPCs by using VPC peering.

Next, you can add data providers that store the data store type and location information about your source and target databases by choosing Data providers in the left pane. For each database, you can create a single data provider and use it in multiple migration projects.

Your data provider can be a fully managed Amazon RDS instance or a self-managed engine running either on-premises or on an Amazon Elastic Compute Cloud (Amazon EC2) instance.

Choose Create data provider to create a new data provider. You can set the type of the database location manually, such as database engine, domain name or IP address, port number, database name, and so on, for your data provider. Here, I have selected an RDS database instance.

After you create a data provider, make sure that you add database connection credentials in AWS Secrets Manager. DMS Schema Conversion uses this information to connect to a database.
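As a minimal sketch of that step with the AWS CLI (the secret name, user name, and password here are placeholders rather than values from this walkthrough):

# Store connection credentials for a data provider in AWS Secrets Manager.
# Secret name, user name, and password are hypothetical placeholders.
aws secretsmanager create-secret \
  --name dms-schema-conversion/source-db-credentials \
  --description "Connection credentials for the source data provider" \
  --secret-string '{"username":"admin","password":"REPLACE_ME"}'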

Converting your database schema with AWS DMS Schema Conversion
Now, you can create a migration project for DMS Schema Conversion by choosing Migration projects in the left pane. A migration project describes your source and target data providers, your instance profile, and migration rules. You can also create multiple migration projects for different source and target data providers.

Choose Create migration project and select your instance profile and source and target data providers for DMS Schema Conversion.

After creating your migration project, you can use the project to create assessment reports and convert your database schema. Choose your migration project from the list, then choose the Schema conversion tab and click Launch schema conversion.

Migration projects in DMS Schema Conversion are always serverless. This means that AWS DMS automatically provisions the cloud resources for your migration projects, so you don’t need to manage schema conversion instances.

Of course, the first launch of DMS Schema Conversion requires starting a schema conversion instance, which can take up to 10–15 minutes. This process also reads the metadata from the source and target databases. After a successful first launch, you can access DMS Schema Conversion faster.

An important part of DMS Schema Conversion is that it generates a database migration assessment report that summarizes all of the schema conversion tasks. It also details the action items for schema objects that cannot be converted automatically to the DB engine of your target database instance. You can view the report in the AWS DMS console or export it as a comma-separated value (.csv) file.

To create your assessment report, choose the source database schema or schema items that you want to assess. After you select the checkboxes, choose Assess in the Actions menu in the source database pane. This report will be archived with .csv files in your S3 bucket. To change the S3 bucket, edit the schema conversion settings in your instance profile.

Then, you can apply the converted code to your target database or save it as a SQL script. To apply converted code, choose Convert in the source data provider pane and then Apply changes in the target data provider pane.

Once the schema has been converted successfully, you can move on to the database migration phase using AWS DMS. To learn more, see Getting started with AWS Database Migration Service in the AWS documentation.

Now Available
AWS DMS Schema Conversion is now available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) Regions, and you can start using it today.

To learn more, see the AWS DMS Schema Conversion User Guide, give it a try, and please send feedback to AWS re:Post for AWS DMS or through your usual AWS support contacts.

Channy

New – Amazon Redshift Support in AWS Backup

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-redshift-support-in-aws-backup/

With Amazon Redshift, you can analyze data in the cloud at any scale. Amazon Redshift offers native data protection capabilities to protect your data using automatic and manual snapshots. This works great by itself, but when you’re using other AWS services, you have to configure more than one tool to manage your data protection policies.

To make this easier, I am happy to share that we added support for Amazon Redshift in AWS Backup. AWS Backup allows you to define a central backup policy to manage data protection of your applications and can now also protect your Amazon Redshift clusters. In this way, you have a consistent experience when managing data protection across all supported services. If you have a multi-account setup, the centralized policies in AWS Backup let you define your data protection policies across all your accounts within your AWS Organizations. To help you meet your regulatory compliance needs, AWS Backup now includes Amazon Redshift in its auditor-ready reports. You also have the option to use AWS Backup Vault Lock to have immutable backups and prevent malicious or inadvertent changes.

Let’s see how this works in practice.

Using AWS Backup with Amazon Redshift
The first step is to turn on the Redshift resource type for AWS Backup. In the AWS Backup console, I choose Settings in the navigation pane and then, in the Service opt-in section, Configure resources. There, I toggle the Redshift resource type on and choose Confirm.


Now, I can create or update a backup plan to include the backup of all, or some, of my Redshift clusters. In the backup plan, I can define how often these backups should be taken and for how long they should be kept. For example, I can have daily backups with one week of retention, weekly backups with one month of retention, and monthly backups with one year of retention.

I can also create on-demand backups. Let's look at this in more detail. I choose Protected resources in the navigation pane and then Create on-demand backup.

I select Redshift in the Resource type dropdown. For the Cluster identifier, I select one of my clusters. For this workload, I need two weeks of retention. Then, I choose Create on-demand backup.
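The same on-demand backup can also be started from the AWS CLI. This is only a sketch; the vault name, Region, account ID, cluster identifier, and IAM role below are placeholders, not the ones used in this walkthrough:

# Start an on-demand backup of a Redshift cluster with two weeks of retention.
# All identifiers are hypothetical placeholders.
aws backup start-backup-job \
  --backup-vault-name Default \
  --resource-arn arn:aws:redshift:us-east-1:123456789012:cluster:my-redshift-cluster \
  --iam-role-arn arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole \
  --lifecycle DeleteAfterDays=14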


My data warehouse is not huge, so after a few minutes, the backup job has completed.


I now see my Redshift cluster in the list of the resources protected by AWS Backup.


In the Protected resources list, I choose the Redshift cluster to see the list of the available recovery points.


When I choose one of the recovery points, I have the option to restore the full data warehouse or just a table into a new Redshift cluster.


I now have the option to edit the cluster and database configuration, including security and networking settings. I just update the cluster identifier; otherwise, the restore would fail because the identifier must be unique. Then, I choose Restore backup to start the restore job.

After some time, the restore job has completed, and I see the old and the new clusters in the Amazon Redshift console. Using AWS Backup gives me a simple centralized way to manage data protection for Redshift clusters as well as many other resources in my AWS accounts.


Availability and Pricing
Amazon Redshift support in AWS Backup is available today in the AWS Regions where both AWS Backup and Amazon Redshift are offered, with the exception of the Regions based in China. You can use this capability via the AWS Management Console, AWS Command Line Interface (CLI), and AWS SDKs.

There is no additional cost for using AWS Backup compared to the native snapshot capability of Amazon Redshift. Your overall costs depend on the amount of storage and retention you need. For more information, see AWS Backup pricing.

Danilo

UPDATE Supercloud SET status = ‘open alpha’ WHERE product = ‘D1’;

Post Syndicated from Nevi Shah original https://blog.cloudflare.com/d1-open-alpha/


In May 2022, we announced our quest to simplify databases – building them, maintaining them, integrating them. Our goal is to empower you with the tools to run a database that is powerful, scalable, with world-beating performance without any hassle. And we first set our sights on reimagining the database development experience for every type of user – not just database experts.

Over the past couple of months, we’ve been working to create just that, while learning some very important lessons along the way. As it turns out, building a global relational database product on top of Workers pushes the boundaries of the developer platform to their absolute limit, and often beyond them, but in a way that’s absolutely thrilling to us at Cloudflare. It means that while our progress might seem slow from outside, every improvement, bug fix or stress test helps lay down a path for all of our customers to build the world’s most ambitious serverless application.

However, as we continue down the road to making D1 production ready, it wouldn’t be “the Cloudflare way” unless we stopped for feedback first – even though it’s not quite finished yet. In the spirit of Developer Week, there is no better time to introduce the D1 open alpha!

An “open alpha” is a new concept for us. You’ll likely hear the term “open beta” on various announcements at Cloudflare, and while it makes sense for many products here, it wasn’t quite right for D1. There are some crucial pieces that are still in active development and testing, so before we release the fully-formed D1 as a public beta for you to start building real-world apps with, we want to make sure everybody can start to get a feel for the product on their hobby apps or side-projects.

What’s included in the alpha?

While a lot is still changing behind the scenes with D1, we’ve put a lot of thought into how you, as a developer, interact with it – even if you’re new to databases.

Using the D1 dashboard

In a few clicks you can get your D1 database up and running right from within your dashboard. In our D1 interface, you can create, maintain and view your database as you please. Changes made in the UI are instantly available to your Worker – no redeploy required!


Use Wrangler

If you're looking to get your hands a little dirty, you can also work with your database using our Wrangler CLI. Create your database and begin adding your data manually, or bootstrap your database in one of two ways:

1.  Execute an SQL file

$ wrangler d1 execute my-database-name --file ./customers.sql

where your .sql file looks something like this:

customers.sql

DROP TABLE IF EXISTS Customers;
CREATE TABLE Customers (CustomerID INT, CompanyName TEXT, ContactName TEXT, PRIMARY KEY (`CustomerID`));
INSERT INTO Customers (CustomerID, CompanyName, ContactName) 
VALUES (1, 'Alfreds Futterkiste', 'Maria Anders'),(4, 'Around the Horn', 'Thomas Hardy'),(11, 'Bs Beverages', 'Victoria Ashworth'),(13, 'Bs Beverages', 'Random Name');

2. Create and run migrations

Migrations are a way to version your database changes. With D1, you can create a migration and then apply it to your database.

To create the migration, execute:

wrangler d1 migrations create <my-database-name> <short description of migration>

This will create an SQL file in a migrations folder where you can then go ahead and add your queries. Then apply the migrations to your database by executing:

wrangler d1 migrations apply <my-database-name>

Access D1 from within your Worker

You can attach your D1 to a Worker by adding the D1 binding to your wrangler.toml configuration file. Then interact with D1 by executing queries inside your Worker like so:

export default {
 async fetch(request, env) {
   const { pathname } = new URL(request.url);

   if (pathname === "/api/beverages") {
     const { results } = await env.DB.prepare(
       "SELECT * FROM Customers WHERE CompanyName = ?"
     )
       .bind("Bs Beverages")
       .all();
     return Response.json(results);
   }

   return new Response("Call /api/beverages to see Bs Beverages customers");
 },
};

Or access D1 from within your Pages Function

In this Alpha launch, D1 also supports integration with Cloudflare Pages! You can add a D1 binding inside the Pages dashboard, and write your queries inside a Pages Function to build a full-stack application! Check out the full documentation to get started with Pages and D1.

Community built tooling

During our private alpha period, the excitement behind D1 led to some valuable contributions to the D1 ecosystem and developer experience by members of the community. Here are some of our favorite projects to date:

d1-orm

An Object Relational Mapping (ORM) is a way for you to query and manipulate data by using JavaScript. Created by a Cloudflare Discord Community Champion, the d1-orm seeks to provide a strictly typed experience while using D1:

const users = new Model(
    // table name, primary keys, indexes etc
    tableDefinition,
    // column types, default values, nullable etc
    columnDefinitions
)

// TS helper for typed queries
type User = Infer<typeof users>;

// ORM-style query builder
const user = await users.First({
    where: {
        id: 1,
    },
});

You can check out the full documentation, and provide feedback by making an issue on the GitHub repository.

workers-qb

This is a zero-dependency query builder that provides a simple standardized interface while keeping the benefits and speed of using raw queries over a traditional ORM. While not intended to provide ORM-like functionality, workers-qb makes it easier to interact with the database from code for direct SQL access:

const qb = new D1QB(env.DB)

const fetched = await qb.fetchOne({
  tableName: 'employees',
  fields: 'count(*) as count',
  where: {
    conditions: 'department = ?1',
    params: ['HQ'],
  },
})

You can read more about the query builder here.

d1-console

Instead of running the wrangler d1 execute command in your terminal every time you want to interact with your database, you can interact with D1 from within the d1-console. Created by a Discord Community Champion, this gives the benefit of executing multi-line queries, obtaining command history, and viewing a cleanly formatted table output.


While this is a community project today, we plan to natively support a “D1 Console” in the future. For now, get started by checking out the d1-console package here.

D1 adapter for Kysely

Kysely is a type-safe and autocompletion-friendly typescript SQL query builder. With this adapter you can interact with D1 with the familiar Kysely interface:

// Create Kysely instance with kysely-d1
const db = new Kysely<Database>({ 
  dialect: new D1Dialect({ database: env.DB })
});
    
// Read row from D1 table
const result = await db
  .selectFrom('kv')
  .selectAll()
  .where('key', '=', key)
  .executeTakeFirst();

Check out the project here.

What’s still in testing?

The biggest pieces that have been disabled for this alpha release are replication and JavaScript transaction support. While we’ll be rolling out these changes gradually, we want to call out some limitations that exist today that we’re actively working on testing:

  • Database location: Each D1 database only runs a single instance. It’s created close to where you, as the developer, create the database, and does not currently move regions based on access patterns. Workers running elsewhere in the world will see higher latency as a result.
  • Concurrency limitations: Under high load, read and write queries may be queued rather than triggering new replicas to be created. As a result, the performance & throughput characteristics of the open alpha won’t be representative of the final product.
  • Availability limitations: Backups will block access to the DB while they’re running. In most cases this should only be a second or two, and any requests that arrive during the backup will be queued.

You can also check out a more detailed, up-to-date list on D1 alpha Limitations.

Request for feedback

While we can make all sorts of guesses and bets on the kind of databases you want to use D1 for, we are not the users – you are! We want developers from all backgrounds to preview the D1 tech at its early stages, and let us know where we need to improve to make it suitable for your production apps.

For general feedback about your experience and to interact with other folks in the alpha, join our #d1-open-alpha channel in the Cloudflare Developers Discord. We plan to make any important announcements and changes in this channel as well as on our monthly community calls.

To file more specific feature requests (no matter how wacky) and report any bugs, create a thread in the Cloudflare Community forum under the D1 category. We will be maintaining this forum as a way to plan for the months ahead!

Get started

Want to get started right away? Check out our D1 documentation to get started today. Build our classic Northwind Traders demo to explore the D1 experience and deploy your first D1 database!

Introducing Amazon Neptune Serverless – A Fully Managed Graph Database that Adjusts Capacity for Your Workloads

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-neptune-serverless-a-fully-managed-graph-database-that-adjusts-capacity-for-your-workloads/

Amazon Neptune is a fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. With Neptune, you can use open and popular graph query languages to execute powerful queries that are easy to write and perform well on connected data. You can use Neptune for graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.

Neptune has always been fully managed and handles time-consuming tasks such as provisioning, patching, backup, recovery, failure detection and repair. However, managing database capacity for optimal cost and performance requires you to monitor and reconfigure capacity as workload characteristics change. Also, many applications have variable or unpredictable workloads where the volume and complexity of database queries can change significantly. For example, a knowledge graph application for social media may see a sudden spike in queries due to sudden popularity.

Introducing Amazon Neptune Serverless
Today, we’re making that easier with the launch of Amazon Neptune Serverless. Neptune Serverless scales automatically as your queries and your workloads change, adjusting capacity in fine-grained increments to provide just the right amount of database resources that your application needs. In this way, you pay only for the capacity you use. You can use Neptune Serverless for development, test, and production workloads and optimize your database costs compared to provisioning for peak capacity.

With Neptune Serverless you can quickly and cost-effectively deploy graphs for your modern applications. You can start with a small graph, and as your workload grows, Neptune Serverless will automatically and seamlessly scale your graph databases to provide the performance you need. You no longer need to manage database capacity and you can now run graph applications without the risk of higher costs from over-provisioning or insufficient capacity from under-provisioning.

With Neptune Serverless, you can continue to use the same query languages (Apache TinkerPop Gremlin, openCypher, and RDF/SPARQL) and features (such as snapshots, streams, high availability, and database cloning) already available in Neptune.

Let’s see how this works in practice.

Creating an Amazon Neptune Serverless Database
In the Neptune console, I choose Databases in the navigation pane and then Create database. For Engine type, I select Serverless and enter my-database as the DB cluster identifier.


I can now configure the range of capacity, expressed in Neptune capacity units (NCUs), that Neptune Serverless can use based on my workload. I can now choose a template that will configure some of the next options for me. I choose the Production template that by default creates a read replica in a different Availability Zone. The Development and Testing template would optimize my costs by not having a read replica and giving access to DB instances that provide burstable capacity.


For Connectivity, I use my default VPC and its default security group.


Finally, I choose Create database. After a few minutes, the database is ready to use. In the list of databases, I choose the DB identifier to get the Writer and Reader endpoints that I am going to use later to access the database.

Using Amazon Neptune Serverless
There is no difference in the way you use Neptune Serverless compared to a provisioned Neptune database. I can use any of the query languages supported by Neptune. For this walkthrough, I choose to use openCypher, a declarative query language for property graphs originally developed by Neo4j that was open-sourced in 2015 and contributed to the openCypher project.

To connect to the database, I start an Amazon Linux Amazon Elastic Compute Cloud (Amazon EC2) instance in the same AWS Region and associate the default security group and a second security group that gives me SSH access.

With a property graph I can represent connected data. In this case, I want to create a simple graph that shows how some AWS services are part of a service category and implement common enterprise integration patterns.

I use curl to access the Writer openCypher HTTPS endpoint and create a few nodes that represent patterns, services, and service categories. The following commands are split into multiple lines in order to improve readability.

curl https://<my-writer-endpoint>:8182/openCypher \
-d "query=CREATE (mq:Pattern {name: 'Message Queue'}),
(pubSub:Pattern {name: 'Pub/Sub'}),
(eventBus:Pattern {name: 'Event Bus'}),
(workflow:Pattern {name: 'WorkFlow'}),
(applicationIntegration:ServiceCategory {name: 'Application Integration'}),
(sqs:Service {name: 'Amazon SQS'}), (sns:Service {name: 'Amazon SNS'}),
(eventBridge:Service {name: 'Amazon EventBridge'}), (stepFunctions:Service {name: 'AWS StepFunctions'}),
(sqs)-[:IMPLEMENT]->(mq), (sns)-[:IMPLEMENT]->(pubSub),
(eventBridge)-[:IMPLEMENT]->(eventBus),
(stepFunctions)-[:IMPLEMENT]->(workflow),
(applicationIntegration)-[:CONTAIN]->(sqs),
(applicationIntegration)-[:CONTAIN]->(sns),
(applicationIntegration)-[:CONTAIN]->(eventBridge),
(applicationIntegration)-[:CONTAIN]->(stepFunctions);"

This is a visual representation of the nodes and their relationships for the graph created by the previous command. The type (such as Service or Pattern) and properties (such as name) are shown inside each node. The arrows represent the relationships (such as CONTAIN or IMPLEMENT) between the nodes.

Visualization of graph data.

Now, I query the database to get some insights. To query the database, I can use either a Writer or a Reader endpoint. First, I want to know the name of the service implementing the “Message Queue” pattern. Note how the syntax of openCypher resembles that of SQL with MATCH instead of SELECT.

curl https://<my-endpoint>:8182/openCypher \
-d "query=MATCH (s:Service)-[:IMPLEMENT]->(p:Pattern {name: 'Message Queue'}) RETURN s.name;"
{
  "results" : [ {
    "s.name" : "Amazon SQS"
  } ]
}

I use the following query to see how many services are in the “Application Integration” category. This time, I use the WHERE clause to filter results.

curl https://<my-endpoint>:8182/openCypher \
-d "query=MATCH (c:ServiceCategory)-[:CONTAIN]->(s:Service) WHERE c.name='Application Integration' RETURN count(s);"
{
  "results" : [ {
    "count(s)" : 4
  } ]
}

There are many options now that I have this graph database up and running. I can add more data (services, categories, patterns) and more relationships between the nodes. I can focus on my application and let Neptune Serverless manage capacity and infrastructure for me.

Availability and Pricing
Amazon Neptune Serverless is available today in the following AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Tokyo), and Europe (Ireland, London).

With Neptune Serverless, you only pay for what you use. The database capacity is adjusted to provide the right amount of resources you need in terms of Neptune capacity units (NCUs). Each NCU is a combination of approximately 2 gibibytes (GiB) of memory with corresponding CPU and networking. The use of NCUs is billed per second. For more information, see the Neptune pricing page.

Having a serverless graph database opens many new possibilities. To learn more, see the Neptune Serverless documentation. Let us know what you build with this new capability!

Simplify the way you work with highly connected data using Neptune Serverless.

Danilo

Backups to the rescue!

Post Syndicated from Nathan Liefting original https://blog.zabbix.com/backups-to-the-rescue/23442/

In this blog post, you will learn how to set up backups for your Zabbix environment. There's a wide variety of options when it comes to taking backups of our Zabbix environment; for us, it is just a matter of choosing the right fit.

 

Introduction

Monitoring is an important part of our IT infrastructure, and oftentimes when our monitoring isn't working for a certain period, we feel blind as to what is going on with our different IT components. As such, taking backups of our Zabbix environment is an important part of running a production Zabbix environment, as we want to be prepared for a possible issue that might corrupt or even destroy our data. It's always a possibility, and as such we should be prepared.

For Zabbix, there are a few different methods on how to take backups and it all starts at the database level. Both the Zabbix frontend as well as the Zabbix server write their data into the Zabbix database as we can see in the illustration below:

This means that both our configuration as well as all of our collected values are present in the same Zabbix database and if we take a database backup, we back up (almost) everything we need. So, let’s start there and have a look at how we can make a database backup.

How to

MySQL backups

Let's start with the most used variant of Zabbix databases: MySQL and its forks like MariaDB and Percona. All of them can easily be backed up using built-in functionality like the MySQL Dump command, and we can then use other industry standards to get things going. First, we have to understand the tables in our database, though. Most of the tables in your Zabbix environment contain configuration data and as such, they are all important to back up. There are a few tables that we need to consider, however, as they can contain gigabytes or even terabytes of data. These are the History, Trends and Events tables:

It is possible to omit these tables from your backup and make smaller, more manageable backups. To make the backup we can then start using tools like MySQL Dump:
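As a sketch of such a backup (the user, database name, and backup path are placeholders, and the list of large tables to skip is not exhaustive):

# Full backup of the Zabbix database; --single-transaction keeps the dump
# consistent without locking the tables Zabbix is writing to.
mysqldump --single-transaction -u zabbix -p zabbix | gzip > /mnt/backup/zabbix_db_$(date +%F).sql.gz

# To keep the backup smaller, the large tables can be skipped (repeat per table):
#   --ignore-table=zabbix.history --ignore-table=zabbix.history_uint \
#   --ignore-table=zabbix.trends --ignore-table=zabbix.trends_uint \
#   --ignore-table=zabbix.events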

Once we have taken a backup, we can easily import that back into our environment using the MySQL Import command or simply using the cat command:
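For example, assuming the compressed dump created above:

# Import the dump back into the (empty) Zabbix database; names and paths are placeholders.
zcat /mnt/backup/zabbix_db_2024-01-01.sql.gz | mysql -u zabbix -p zabbix

# Or, for an uncompressed dump file:
# cat zabbix_db_backup.sql | mysql -u zabbix -p zabbix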

Do not forget, taking and importing large backups can take a long time. This completely depends on your MySQL database performance tuning settings as well as the underlying resources like CPU, memory, and disk I/O. Also, make sure to check out the MySQL documentation:

MySQL Dump:  https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html / https://mariadb.com/kb/en/making-backups-with-mysqldump/

MySQL Import: https://dev.mysql.com/doc/refman/8.0/en/mysqlimport.html / https://mariadb.com/kb/en/mysqlimport/

 

Alternatively, it's also possible to create backups using tools like Percona XtraBackup and Mariabackup.

PostgreSQL backups

We can actually use the same kinds of methods for the PostgreSQL backups. Keep the required tables in mind and fire away with the built-in tools:
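A sketch with pg_dump, again with placeholder credentials and paths:

# Plain-format dump of the Zabbix PostgreSQL database.
pg_dump -U zabbix -h localhost zabbix | gzip > /mnt/backup/zabbix_pg_$(date +%F).sql.gz

# The data of the large tables can be excluded while keeping their definitions:
#   --exclude-table-data='history*' --exclude-table-data='trends*'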

 

Then we can restore it by loading the file into postgres:
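Assuming the plain-format dump from the previous step:

# Load the dump into an empty Zabbix database; names and paths are placeholders.
zcat /mnt/backup/zabbix_pg_2024-01-01.sql.gz | psql -U zabbix -h localhost zabbix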

What about the configuration files?

Once we have a database backup, everything is backed up, right? Well, almost everything. With just a database backup we are quite safe, but (and this is oftentimes overlooked) there are a lot of configuration files and perhaps even custom scripts we need to take into account! There are three parts to this story – the Zabbix server, the Zabbix frontend, and also the Zabbix additional components. All of them have their own set of configuration files and locations that are used for storing custom scripts.

The Zabbix frontend location and configuration files can be different, depending on the environment, as we have a few choices to make. Are we running Apache or Nginx? On what Linux distribution? All of these have to be considered when making configuration backups. In general, the locations for the configuration would be:

/etc/nginx/
/etc/httpd/
/etc/apache2

There’s also a symlink to the Zabbix frontend configuration file located in /etc/zabbix/ but we will get to that one in a bit.

Then we have the Zabbix server itself, which keeps its configuration in /etc/zabbix/ and if we’re following best practices any script should be placed in /usr/lib/zabbix. So we need:

/etc/zabbix/
/usr/lib/zabbix

Let’s add them to the list and find a method to back up these files. Crontab is a built-in tool that we can use, but there are definitely other (perhaps better) solutions out there. Let’s add the following to cron:
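A sketch of what such a crontab entry could look like, assuming an Nginx-based frontend and the backup location used below:

# Nightly archive of the Zabbix configuration files and custom scripts at 03:15.
15 3 * * * tar -czf /mnt/backup/config_files/zabbix_config_$(date +\%F).tar.gz /etc/zabbix /usr/lib/zabbix /etc/nginx
# Remove configuration backups older than 180 days.
20 3 * * * find /mnt/backup/config_files/ -type f -mtime +180 -delete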

I also added a find command here, which will serve as our roll-over or rotation tool. It will find files older than 180 days and delete them from /mnt/backup/config_files/. Make sure to pick a good (network) folder to store these files in, as it's important to keep them safe. Feel free to change the number of days you'd like to store the files for.

What about the additional components like Zabbix proxy, Zabbix Java gateway, and Zabbix web service (used for PDF reporting)? Well, these have configuration files as well. Make sure to run a backup on the devices running these additional components. As for Zabbix proxies – they have the same file locations as the Zabbix server:

For Zabbix Java gateway and Zabbix web service, we can omit the /usr/lib/zabbix/ folder.

Don’t forget the import/export files!

In general, database backups are slow to make, but also slow to import back, unless we leave the history/trends out of the backup. But even then, restoring an entire database simply because someone made an error on a single template is a hassle. Zabbix ships with built-in frontend export functionality, allowing us to export (and then import) entire parts of the configuration instantly! We can use this for a number of different parts of the configuration:

  • Hosts
  • Templates
  • Media types
  • Maps
  • Images
  • Host groups (API ONLY)
  • Template groups (API ONLY)

All of these are available through the Zabbix API, allowing us to choose whether we do a manual configuration backup from the frontend, as well as providing us with automation options using that API. You could even manage and update your Zabbix configuration entirely from Git if you write the right scripts for this.

Frontend backups

To run an export from the frontend simply go to one of the supported sections like Configuration | Templates and select the export data format. When selecting multiple entities, keep in mind that they will all be exported to a single file.

We can then make our edits and import files from the frontend as well:

For Templates this will even result in a nice diff pop-up window, detailing all the changes, deletions, and additions to the templates:

 

API backups

For the API, things get a little more complicated, as we need to select a mode of execution. Of course, it's possible to do a curl command from the CLI or even use something like Postman. The request body, and the kind of response it returns, are sketched below.
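As a minimal sketch (assuming the configuration.export method, a placeholder URL, API token, and template ID):

# Export a single template as YAML through the Zabbix API.
# URL, token, and template ID are hypothetical placeholders.
curl -s -X POST https://zabbix.example.com/api_jsonrpc.php \
  -H 'Content-Type: application/json' \
  -d '{
        "jsonrpc": "2.0",
        "method": "configuration.export",
        "params": {
          "options": { "templates": ["10001"] },
          "format": "yaml"
        },
        "auth": "YOUR_API_TOKEN",
        "id": 1
      }'

# The response returns the exported configuration as a string, roughly:
# {"jsonrpc":"2.0","result":"zabbix_export:\n  version: '6.0'\n  ...","id":1}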

But this feature really starts to shine once we combine it with our own automation scripts. Use it wisely!

High availability

So, what about high availability? Isn’t that some form of a backup?

Well, yes and no. High availability is not an "IT backup" in the form of making sure we can recover something that is broken. But it is a backup in the way that if a Zabbix server instance fails, another one takes over for it. HA is somewhat out of scope for this blog post, but it's still worth mentioning. There are several solutions to set up Zabbix as a full high availability cluster. For MySQL we can use a Primary/Primary setup, for the frontend we can use load balancing techniques like HAProxy, and for the Zabbix server, we can use the built-in high availability method. Combine all of these together and you'll definitely be able to serve your every (production-ready!) need.

Conclusion

To conclude, there are many options to start taking backups of our Zabbix environment. It all starts at the database, and these backups are definitely vital to keep things safe in case of disaster. When making the backups, do not forget about the configuration files and custom scripts, as well as the frontend backup option. Combining all of these solutions will safeguard our environment, but if that isn't enough – do not forget about industry standards like snapshots, which safeguard our environment even further, on multiple levels.

I hope you enjoyed reading this blog post. If you have any questions or need help configuring anything on your Zabbix setup feel free to contact me and the team at Opensource ICT Solutions. We build a ton of cool integrations like this and much more!

Nathan Liefting

https://oicts.com


The post Backups to the rescue! appeared first on Zabbix Blog.

D1: our quest to simplify databases

Post Syndicated from Nevi Shah original https://blog.cloudflare.com/whats-new-with-d1/


When we announced D1 in May of this year, we knew it would be the start of something new – our first SQL database with Cloudflare Workers. Prior to D1 we've announced storage options like KV (key-value store), Durable Objects (single location, strongly consistent data storage) and R2 (blob storage). But the question always remained: "How can I store and query relational data without latency concerns, and with an easy API?"

The long awaited “Cloudflare Database” was the true missing piece to build your application entirely on Cloudflare’s global network, going from a blank canvas in VSCode to a full stack application in seconds. Compatible with the popular SQLite API, D1 empowers developers to build out their databases without getting bogged down by complexity and having to manage every underlying layer.

Since our launch announcement in May and private beta in June, we’ve made great strides in building out our vision of a serverless database. With D1 still in private beta but an open beta on the horizon, we’re excited to show and tell our journey of building D1 and what’s to come.

The D1 Experience

We knew from Cloudflare Workers feedback that using Wrangler as the mechanism to create and deploy applications is loved and preferred by many. That’s why when Wrangler 2.0 was announced this past May alongside D1, we took advantage of the new and improved CLI for every part of the experience from data creation to every update and iteration. Let’s take a quick look on how to get set up in a few easy steps.

Create your database

With the latest version of Wrangler installed, you can create an initialized empty database with a quick

npx wrangler d1 create my_database_name

to get your database up and running. Now it's time to add your data.

Bootstrap it

It wouldn’t be the “Cloudflare way” if you had to sit through an agonizingly long process to get set up. So we made it easy and painless to bring your existing data from an old database and bootstrap your new D1 database.  You can run

wrangler d1 execute my_database_name --file ./filename.sql

and pass through an existing SQLite .sql file of your choice. Your database is now ready for action.

Develop & Test Locally

With all the improvements we’ve made to Wrangler since version 2 launched a few months ago, we’re pleased to report that D1 has full remote & local wrangler dev support:


When running wrangler dev --local --persist, an SQLite file will be created inside .wrangler/state. You can then use a local GUI program for managing it, like SQLiteFlow (https://www.sqliteflow.com/) or Beekeeper (https://www.beekeeperstudio.io/).

Or you can simply use SQLite directly with the SQLite command line by running sqlite3 .wrangler/state/d1/DB.sqlite3:


Automatic backups & one-click restore

No matter how much you test your changes, sometimes things don't always go according to plan. But with Wrangler you can create a backup of your data, view your list of backups, or restore your database from an existing backup. In fact, during the beta, we're taking backups of your data every hour automatically and storing them in R2, so you will have the option to roll back if needed.
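As a rough sketch (the exact subcommand names may have changed since the alpha):

# Create a new backup of your database
wrangler d1 backup create my_database_name

# List the available backups, including the automatic hourly ones
wrangler d1 backup list my_database_name

# Restore the database from a chosen backup ID
wrangler d1 backup restore my_database_name <backup-id>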


And the best part – if you want to use a production snapshot for local development or to reproduce a bug, simply copy it into the .wrangler/state directory and wrangler dev --local --persist will pick it up!

Let’s download a D1 backup to our local disk. It’s SQLite compatible.
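For example (the backup ID is a placeholder):

# Download a backup as a local SQLite file
wrangler d1 backup download my_database_name <backup-id>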


Now let’s run our D1 worker locally, from the backup.
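Roughly like this, assuming the downloaded file name from the previous step:

# Put the downloaded backup where local mode expects the database,
# then start the Worker against it.
cp ./my_database_name.<backup-id>.sqlite3 .wrangler/state/d1/DB.sqlite3
wrangler dev --local --persist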


Create and Manage from the dashboard

However, we realize that CLIs are not everyone’s jam. In fact, we believe databases should be accessible to every kind of developer – even those without much database experience! D1 is available right from the Cloudflare dashboard giving you near total command parity with Wrangler in just a few clicks. Bootstrapping your database, creating tables, updating your database, viewing tables and triggering backups are all accessible right at your fingertips.


Changes made in the UI are instantly available to your Worker — no deploy required!

We’ve told you about some of the improvements we’ve landed since we first announced D1, but as always, we also wanted to give you a small taste (with some technical details) of what’s ahead. One really important functionality of a database is transactions — something D1 wouldn’t be complete without.

Sneak peek: how we’re bringing JavaScript transactions to D1

With D1, we strive to present a dramatically simplified interface to creating and querying relational data, which for the most part is a good thing. But simplification occasionally introduces drawbacks, where a use-case is no longer easily supported without introducing some new concepts. D1 transactions are one example.

Transactions are a unique challenge

You don’t need to specify where a Cloudflare Worker or a D1 database run—they simply run everywhere they need to. For Workers, that is as close as possible to the users that are hitting your site right this second. For D1 today, we don’t try to run a copy in every location worldwide, but dynamically manage the number and location of read-only replicas based on how many queries your database is getting, and from where. However, for queries that make changes to a database (which we generally call “writes” for short), they all have to travel back to the single Primary D1 instance to do their work, to ensure consistency.

But what if you need to do a series of updates at once? While you can send multiple SQL queries with .batch() (which does in fact use database transactions under the hood), it’s likely that, at some point, you’ll want to interleave database queries & JS code in a single unit of work.

This is exactly what database transactions were invented for, but if you try running BEGIN TRANSACTION in D1 you’ll get an error. Let’s talk about why that is.

Why native transactions don’t work
The problem arises from SQL statements and JavaScript code running in dramatically different places—your SQL executes inside your D1 database (primary for writes, nearest replica for reads), but your Worker is running near the user, which might be on the other side of the world. And because D1 is built on SQLite, only one write transaction can be open at once. Meaning that, if we permitted BEGIN TRANSACTION, any one Worker request, anywhere in the world, could effectively block your whole database! This is a quite dangerous thing to allow:

  • A Worker could start a transaction then crash due to a software bug, without calling ROLLBACK. The primary would be blocked, waiting for more commands from a Worker that would never come (until, probably, some timeout).
  • Even without bugs or crashes, transactions that require multiple round-trips between JavaScript and SQL could end up blocking your whole system for multiple seconds, dramatically limiting how high an application built with Workers & D1 could scale.

But allowing a developer to define transactions that mix both SQL and JavaScript makes building applications with Workers & D1 so much more flexible and powerful. We need a new solution (or, in our case, a new version of an old solution).

A way forward: stored procedures
Stored procedures are snippets of code that are uploaded to the database, to be executed directly next to the data. Which, at first blush, sounds exactly like what we want.

However, in practice, stored procedures in traditional databases are notoriously frustrating to work with, as anyone who’s developed a system making heavy use of them will tell you:

  • They’re often written in a different language to the rest of your application. They’re usually written in (a specific dialect of) SQL or an embedded language like Tcl/Perl/Python. And while it’s technically possible to write them in JavaScript (using an embedded V8 engine), they run in such a different environment to your application code it still requires significant context-switching to maintain them.
  • Having both application code and in-database code affects every part of the development lifecycle, from authoring, testing, deployment, rollbacks and debugging. But because stored procedures are usually introduced to solve a specific problem, not as a general purpose application layer, they’re often managed completely manually. You can end up with them being written once, added to the database, then never changed for fear of breaking something.

With D1, we can do better.

The point of a stored procedure was to execute directly next to the data—uploading the code and executing it inside the database was simply a means to that end. But we're using Workers, a global JavaScript execution platform. Can we use them to solve this problem?

It turns out, absolutely! But here we have a few options of exactly how to make it work, and we’re working with our private beta users to find the right API. In this section, I’d like to share with you our current leading proposal, and invite you all to give us your feedback.

When you connect a Worker project to a D1 database, you add the section like the following to your wrangler.toml:

[[ d1_databases ]]
# What binding name to use (e.g. env.DB):
binding = "DB"
# The name of the DB (used for wrangler d1 commands):
database_name = "my-d1-database"
# The D1's ID for deployment:
database_id = "48a4224e-...3b09"
# Which D1 to use for `wrangler dev`:
# (can be the same as the previous line)
preview_database_id = "48a4224e-...3b09"

# NEW: adding "procedures", pointing to a new JS file:
procedures = "./src/db/procedures.js"

That D1 Procedures file would contain the following (note the new db.transaction() API, that is only available within a file like this):

export default class Procedures {
  constructor(db, env, ctx) {
    this.db = db
  }

  // any methods you define here are available on env.DB.Procedures
  // inside your Worker
  async Checkout(cartId: number) {
    // Inside a Procedure, we have a new db.transaction() API
    const result = await this.db.transaction(async (txn) => {
      
      // Transaction has begun: we know the user can't add anything to
      // their cart while these actions are in progress.
      const [cart, user] = Helpers.loadCartAndUser(cartId)

      // We can update the DB first, knowing that if any of the later steps
      // fail, all these changes will be undone.
      await this.db
        .prepare(`UPDATE cart SET status = ?1 WHERE cart_id = ?2`)
        .bind('purchased', cartId)
        .run()
      const newBalance = user.balance - cart.total_cost
      await this.db
        .prepare(`UPDATE user SET balance = ?1 WHERE user_id = ?2`)
        // Note: the DB may have a CHECK to guarantee 'user.balance' can not
        // be negative. In that case, this statement may fail, an exception
        // will be thrown, and the transaction will be rolled back.
        .bind(newBalance, cart.user_id)
        .run()

      // Once all the DB changes have been applied, attempt the payment:
      const { ok, details } = await PaymentAPI.processPayment(
        user.payment_method_id,
        cart.total_cost
      )
      if (!ok) {
        // If we throw an Exception, the transaction will be rolled back
        // and result.error will be populated:
        // throw new PaymentFailedError(details)
        
        // Alternatively, we can do both of those steps explicitly
        await txn.rollback()
        // The transaction is rolled back, our DB is now as it was when we
        // started. We can either move on and try something new, or just exit.
        return { error: new PaymentFailedError(details) }
      }

      // This is implicitly called when the .transaction() block finishes,
      // but you can explicitly call it too (potentially committing multiple
      // times in a single db.transaction() block).
      await txn.commit()

      // Anything we return here will be returned by the 
      // db.transaction() block
      return {
        amount_charged: cart.total_cost,
        remaining_balance: newBalance,
      }
    })

    if (result.error) {
      // Our db.transaction block returned an error or threw an exception.
    }

    // We're still in the Procedure, but the Transaction is complete and
    // the DB is available for other writes. We can either do more work
    // here (start another transaction?) or return a response to our Worker.
    return result
  }
}

And in your Worker, your DB binding now has a “Procedures” property with your function names available:

const { error, amount_charged, remaining_balance } =
  await env.DB.Procedures.Checkout(params.cartId)

if (error) {
  // Something went wrong, `error` has details
} else {
  // Display `amount_charged` and `remaining_balance` to the user.
}

Multiple Procedures can be triggered at one time, but only one db.transaction() function can be active at once: any other write queries or other transaction blocks will be queued, but all read queries will continue to hit local replicas and run as normal. This API gives you the ability to ensure consistency when it’s essential but with the minimal impact on total overall performance worldwide.

Request for feedback

As with all our products, feedback from our users drives the roadmap and development. While the D1 API is in beta testing today, we’re still seeking feedback on the specifics. However, we’re pleased that it solves both the problems with transactions that are specific to D1 and the problems with stored procedures described earlier:

  • Code is executing as close as possible to the database, removing network latency while a transaction is open.
  • Any exceptions or cancellations of a transaction cause an instant rollback—there is no way to accidentally leave one open and block the whole D1 instance.
  • The code is in the same language as the rest of your Worker code, in the exact same dialect (e.g. same TypeScript config as it’s part of the same build).
  • It’s deployed seamlessly as part of your Worker. If two Workers bind to the same D1 instance but define different procedures, they’ll only see their own code. If you want to share code between projects or databases, extract a library as you would with any other shared code.
  • In local development and test, the procedure works just like it does in production, but without the network call, allowing seamless testing and debugging as if it was a local function.
  • Because procedures and the Worker that define them are treated as a single unit, rolling back to an earlier version never causes a skew between the code in the database and the code in the Worker.

The D1 ecosystem: contributions from the community

We’ve told you about what we’ve been up to and what’s ahead, but one of the unique things about this project is all the contributions from our users. One of our favorite parts of private betas is not only getting feedback and feature requests, but also seeing what ideas and projects come to fruition. While sometimes this means personal projects, with D1, we’re seeing some incredible contributions to the D1 ecosystem. Needless to say, the work on D1 hasn’t just been coming from within the D1 team, but also from the wider community and other developers at Cloudflare. Users have been showing off their D1 additions within our Discord private beta channel and giving others the opportunity to use them as well. We wanted to take a moment to highlight them.

workers-qb

Dealing with raw SQL syntax is powerful (and using the D1 .bind() API, safe against SQL injections) but it can be a little clumsy. On the other hand, most existing query builders assume direct access to the underlying DB, and so aren’t suitable to use with D1. So Cloudflare developer Gabriel Massadas designed a small, zero-dependency query builder called workers-qb:

import { D1QB } from 'workers-qb'
const qb = new D1QB(env.DB)

const fetched = await qb.fetchOne({
    tableName: "employees",
    fields: "count(*) as count",
    where: {
      conditions: "active = ?1",
      params: [true]
    },
})

Check out the project homepage for more information: https://workers-qb.massadas.com/.

D1 console

While you can interact with D1 through both Wrangler and the dashboard, Cloudflare Community champion Isaac McFadyen created the very first D1 console, where you can quickly execute a series of queries right through your terminal. With the D1 console, you don't need to spend time writing the various Wrangler commands we've created – just execute your queries.

This includes all the bells and whistles you would expect from a modern database console, including multiline input, command history, validation for things D1 may not yet support, and the ability to save your Cloudflare credentials for later use.

Check out the full project on GitHub or NPM for more information.

Miniflare test Integration

The Miniflare project, which powers Wrangler’s local development experience, also provides fully-fledged test environments for popular JavaScript test runners, Jest and Vitest. With this comes the concept of Isolated Storage, allowing each test to run independently, so that changes made in one don’t affect the others. Brendan Coll, creator of Miniflare, guided the D1 test implementation to give the same benefits:

import Worker from '../src/index.ts'
const { DB } = getMiniflareBindings();

beforeAll(async () => {
  // Your D1 starts completely empty, so first you must create tables
  // or restore from a schema.sql file.
  await DB.exec(`CREATE TABLE entries (id INTEGER PRIMARY KEY, value TEXT)`);
});

// Each describe block & each test gets its own view of the data.
describe('with an empty DB', () => {
  it('should report 0 entries', async () => {
    await Worker.fetch(...)
  })
  it('should allow new entries', async () => {
    await Worker.fetch(...)
  })
})

// Use beforeAll & beforeEach inside describe blocks to set up particular DB states for a set of tests
describe('with two entries in the DB', () => {
  beforeEach(async () => {
    await DB.prepare(`INSERT INTO entries (value) VALUES (?), (?)`)
            .bind('aaa', 'bbb')
            .run()
  })
  // Now, all tests will run with a DB with those two values
  it('should report 2 entries', async () => {
    await Worker.fetch(...)
  })
  it('should not allow duplicate entries', async () => {
    await Worker.fetch(...)
  })
})

All the databases for tests are run in-memory, so these are lightning fast. And fast, reliable testing is a big part of building maintainable real-world apps, so we’re thrilled to extend that to D1.

Want access to the private beta?

Feeling inspired?

We love to see what our beta users build or want to build especially when our products are at an early stage. As we march toward an open beta, we’ll be looking specifically for your feedback. We are slowly letting more folks into the beta, but if you haven’t received your “golden ticket” yet with access, sign up here! Once you’ve been invited in, you’ll receive an official welcome email.

As always, happy building!

Optimizing your AWS Infrastructure for Sustainability, Part IV: Databases

Post Syndicated from Otis Antoniou original https://aws.amazon.com/blogs/architecture/optimizing-your-aws-infrastructure-for-sustainability-part-iv-databases/

In Part I: Compute, Part II: Storage, and Part III: Networking of this series, we introduced strategies to optimize the compute, storage, and networking layers of your AWS architecture for sustainability.

This post, Part IV, focuses on the database layer and proposes recommendations to optimize your databases' utilization, performance, and queries. These recommendations are based on the design principles of the AWS Well-Architected Sustainability Pillar.

Optimizing the database layer of your AWS infrastructure


Figure 1. AWS database services

As your application serves more customers, the volume of data stored within your databases will increase. Implementing the recommendations in the following sections will help you use databases resources more efficiently and save costs.

Use managed databases

Usually, customers overestimate the capacity they need to absorb peak traffic, wasting resources and money on unused infrastructure. AWS fully managed database services provide continuous monitoring, which allows you to increase and decrease your database capacity as needed. Additionally, most AWS managed databases use a pay-as-you-go model based on the instance size and storage used.

Managed services shift responsibility to AWS for maintaining high average utilization and sustainability optimization of the deployed hardware. Amazon Relational Database Service (Amazon RDS) reduces your individual contribution compared to maintaining your own databases on Amazon Elastic Compute Cloud (Amazon EC2). In a managed database, AWS continuously monitors your clusters to keep your workloads running with self-healing storage and automated scaling.

AWS offers 15+ purpose-built engines to support diverse data models. For example, if an Internet of Things (IoT) application needs to process large amounts of time series data, Amazon Timestream is designed and optimized for this exact use case.

Rightsize, reduce waste, and choose the right hardware

Optimizing costs in Amazon RDS provides great guidance on the metrics, thresholds, and actions you can take to identify underutilized instances and rightsizing opportunities. The following table provides additional tools and metrics to help you find unused resources:

Service | Metric | Source
Amazon RDS | DatabaseConnections | Amazon CloudWatch
Amazon RDS | Idle DB Instances | AWS Trusted Advisor
Amazon DynamoDB | AccountProvisionedReadCapacityUtilization, AccountProvisionedWriteCapacityUtilization, ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits | CloudWatch
Amazon Redshift | Underutilized Amazon Redshift Clusters | AWS Trusted Advisor
Amazon DocumentDB | DatabaseConnections, CPUUtilization, FreeableMemory | CloudWatch
Amazon Neptune | CPUUtilization, VolumeWriteIOPs, MainRequestQueuePendingRequests | CloudWatch
Amazon Keyspaces | ProvisionedReadCapacityUnits, ProvisionedWriteCapacityUnits, ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits | CloudWatch

These tools will help you identify rightsizing opportunities. However, rightsizing databases can affect your SLAs for query times, so consider this before making changes.

We also suggest:

  • Evaluate whether your existing SLAs meet your business needs, or whether they could be relaxed as an acceptable trade-off to optimize your environment for sustainability.
  • If any of your RDS instances only need to run during business hours, consider shutting them down outside business hours, either manually or with Instance Scheduler (a minimal scripted alternative is sketched after this list).
  • Consider a more power-efficient processor, such as AWS Graviton-based instances, for your databases. Graviton2 delivers 2-3.5 times better CPU performance per watt than any other processor in AWS.
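If you script the schedule yourself rather than deploying Instance Scheduler, a minimal sketch with the AWS SDK for JavaScript v3 might look like the following. The region, instance identifier, and function names are placeholders; in practice you would trigger these calls from a scheduled job, such as an Amazon EventBridge rule invoking a Lambda function.

```ts
// Sketch: stop a non-production RDS instance at the end of the business day
// and start it again in the morning. Identifiers and region are placeholders.
import { RDSClient, StopDBInstanceCommand, StartDBInstanceCommand } from "@aws-sdk/client-rds";

const rds = new RDSClient({ region: "us-east-1" });

export async function stopDevDatabase(dbInstanceIdentifier: string) {
  // A stopped instance is not billed for instance hours (storage is still billed).
  await rds.send(new StopDBInstanceCommand({ DBInstanceIdentifier: dbInstanceIdentifier }));
}

export async function startDevDatabase(dbInstanceIdentifier: string) {
  await rds.send(new StartDBInstanceCommand({ DBInstanceIdentifier: dbInstanceIdentifier }));
}
```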

Make sure to choose the right RDS instance type for the type of workload you have. For example, burstable performance instances can deal with spikes that exceed the baseline without the need to overprovision capacity. In terms of storage, Amazon RDS provides three storage types that differ in performance characteristics and price, so you can tailor the storage layer of your database according to your needs.

Use serverless databases

Production databases that experience intermittent, unpredictable, or spiky traffic may be underutilized. To improve efficiency and eliminate excess capacity, scale your infrastructure according to its load.

AWS offers relational and non-relational serverless databases that shut off when not in use, quickly restart, and automatically scale database capacity based on your application’s needs. This reduces your environmental impact because capacity management is automatically optimized. By selecting the best purpose-built database for your workload, you’ll benefit from the scalability and fully-managed experience of serverless database services, as shown in the following table.


Serverless Relational Databases | Serverless Non-relational Databases
Amazon Aurora Serverless for an on-demand, autoscaling configuration | Amazon DynamoDB (in On-Demand mode) for a fully managed, serverless, key-value NoSQL database
Amazon Redshift Serverless runs and scales data warehouse capacity; you don’t need to set up and manage data warehouse infrastructure | Amazon Timestream for a time series database service for IoT and operational applications
 | Amazon Keyspaces for a scalable, highly available, and managed Apache Cassandra–compatible database service
 | Amazon Quantum Ledger Database for a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority

Use automated database backups and remove redundant data

Manual Amazon RDS backups, unlike automated backups, take a manual snapshot of your database and do not have a retention period set by default. Unless you delete a manual snapshot, it is never removed automatically, so removing manual snapshots you no longer need uses fewer resources and reduces your costs. If you want manual snapshots to expire, you can set a retention period for them with AWS Backup. To keep long-term snapshots of MariaDB, MySQL, and PostgreSQL data, we recommend exporting snapshot data to Amazon Simple Storage Service (Amazon S3). You can also export specific tables or databases. This way, you move data to “colder,” longer-term archival storage instead of keeping it within your database.
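As a rough sketch of the export step, the RDS StartExportTask API copies snapshot data to S3 so the snapshot itself can then be deleted. The task identifier, ARNs, bucket, IAM role, KMS key, and table list below are illustrative placeholders.

```ts
// Sketch: export a manual RDS snapshot to Amazon S3 for long-term archival.
import { RDSClient, StartExportTaskCommand } from "@aws-sdk/client-rds";

const rds = new RDSClient({ region: "us-east-1" });

await rds.send(new StartExportTaskCommand({
  ExportTaskIdentifier: "orders-db-archive-2022-10",
  SourceArn: "arn:aws:rds:us-east-1:123456789012:snapshot:orders-db-manual-snap",
  S3BucketName: "my-long-term-archive-bucket",
  IamRoleArn: "arn:aws:iam::123456789012:role/rds-s3-export-role",
  KmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
  // Optionally export only specific databases or tables:
  ExportOnly: ["orders_db.archived_orders"],
}));
```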

Optimize long running queries

Identify and optimize resource-intensive queries, because they can affect the overall performance of your application. The Performance Insights dashboard, specifically the Top Dimensions table, displays the top SQL, waits, and hosts, and lets you view and download SQL queries to diagnose and investigate further.

Tuning Amazon RDS for MySQL with Performance Insights and this knowledge center article will help you optimize and tune queries in Amazon RDS for MySQL. For Amazon RDS for PostgreSQL, the blog posts Optimizing and tuning queries in Amazon RDS PostgreSQL based on native and external tools and Improve query performance with parallel queries in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition show how to use native and external tools to optimize and tune queries, and how to improve query performance with the parallel query feature.

Improve database performance

You can improve your database performance by monitoring, identifying, and remediating anomalous performance issues. Instead of relying on a database administrator (DBA), AWS offers native tools to continuously monitor and analyze database telemetry, as shown in the following table.

Service | CloudWatch Metric | Source
Amazon RDS | CPUUtilization, FreeStorageSpace | CloudWatch
Amazon Redshift | CPUUtilization, PercentageDiskSpaceUsed | CloudWatch
Amazon Aurora | CPUUtilization, FreeLocalStorage | Amazon RDS
Amazon DynamoDB | AccountProvisionedReadCapacityUtilization, AccountProvisionedWriteCapacityUtilization | CloudWatch
Amazon ElastiCache | CPUUtilization | CloudWatch

CloudWatch displays instance-level and account-level usage metrics for Amazon RDS. Create CloudWatch alarms to activate and notify you based on metric value thresholds you specify or when anomalous metric behavior is detected. Enable Enhanced Monitoring real-time metrics for the operating system the DB instance runs on.
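For example, a minimal sketch of a CPU alarm on an RDS instance with the AWS SDK might look like the following; the instance identifier, threshold, and SNS topic ARN are assumptions for illustration.

```ts
// Sketch: alarm when an RDS instance's average CPUUtilization stays above 80%.
import { CloudWatchClient, PutMetricAlarmCommand } from "@aws-sdk/client-cloudwatch";

const cloudwatch = new CloudWatchClient({ region: "us-east-1" });

await cloudwatch.send(new PutMetricAlarmCommand({
  AlarmName: "orders-db-high-cpu",
  Namespace: "AWS/RDS",
  MetricName: "CPUUtilization",
  Dimensions: [{ Name: "DBInstanceIdentifier", Value: "orders-db" }],
  Statistic: "Average",
  Period: 300,            // 5-minute periods
  EvaluationPeriods: 3,   // alarm after 15 minutes above the threshold
  Threshold: 80,
  ComparisonOperator: "GreaterThanThreshold",
  AlarmActions: ["arn:aws:sns:us-east-1:123456789012:db-alerts"],
}));
```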

Amazon RDS Performance Insights collects performance metrics, such as database load, from each RDS DB instance. This data gives you a granular view of the databases’ activity every second. You can enable Performance Insights without causing downtime, reboot, or failover.

Amazon DevOps Guru for RDS uses the data from Performance Insights, Enhanced Monitoring, and CloudWatch to identify operational issues. It uses machine learning to detect and notify of database-related issues, including resource overutilization or misbehavior of certain SQL queries.

Conclusion

In this blog post, we discussed technology choices, design principles, and recommended actions to optimize and increase efficiency of your databases. As your data grows, it is important to scale your database capacity in line with your user load, remove redundant data, optimize database queries, and optimize database performance. Figure 2 shows an overview of the tools you can use to optimize your databases.

Figure 2. Tools you can use on AWS for optimization purposes

Other blog posts in this series

How we store and process millions of orders daily

Post Syndicated from Grab Tech original https://engineering.grab.com/how-we-store-millions-orders

Introduction

In the real world, after a passenger places a GrabFood order from the Grab App, the merchant-partner will prepare the order. A driver-partner will then collect the food and deliver it to the passenger. Have you ever wondered what happens in the backend system? The Grab Order Platform is a distributed system that processes millions of GrabFood or GrabMart orders every day. This post aims to share the journey of how we designed the database solution that powers the order platform.

Background

What are the design goals when building the database solution? We collected the requirements by analysing query patterns and traffic patterns.

Query patterns

Here are some important query examples that the Order Platform supports:

  1. Write queries:

    a. Create an order.

    b. Update an order.

  2. Read queries:

    a. Get order by id.

    b. Get ongoing orders by passenger id.

    c. Get historical orders by various conditions.

    d. Get order statistics (for example, get the number of orders)

We can break down queries into two categories: transactional queries and analytical queries. Transactional queries are critical to online order creation and completion, including the write queries and read queries such as 2a or 2b. Analytical queries like 2c and 2d retrieve historical orders or order statistics on demand. Analytical queries are not essential to online order processing.

Traffic patterns

Grab’s Order Platform processes a significant amount of transaction data every month.

During peak hours, the write queries per second (QPS) is three times the primary-key read QPS, while the range-query QPS is four times the primary-key read QPS.

Design goals

From the query and traffic patterns, we arrived at the following three design goals:

  1. Stability – the database solution must be able to handle high read and write QPS. Online order processing queries must have high availability. Even when some part of the system is down, we must be able to provide a degraded experience to end users so they can still create and complete orders.
  2. Scalability and cost – the database solution must be able to support fast evolution of business requirements, given that we now handle up to a million orders per month. The solution must also be cost effective at a large scale.
  3. Consistency – strong consistency for transactional queries, and eventual consistency for analytical queries.

Solution

The first design principle towards a stable and scalable database solution is to use different databases to serve transactional and analytical queries, also known as OLTP and OLAP queries. An OLTP database serves queries critical to online order processing and keeps data for only a short period of time. Meanwhile, an OLAP database holds the same set of data but serves our historical and statistical queries, and keeps data for a longer time.

What are the benefits from this design principle? From a stability point of view, we can choose different databases which can better fulfil our different query patterns and QPS requirements. An OLTP database is the single source of truth for online order processing; any failure in the OLAP database will not affect online transactions. From a scalability and cost point of view, we can choose a flexible database for OLAP to support our fast evolution of business requirements. We can maintain less data in our OLTP database while keeping some older data in our OLAP database.

To ensure that the data in both databases are consistent, we introduced the second design principle – data ingestion pipeline. In Figure 1, Order Platform writes data to the OLTP database to process online orders and asynchronously pushes the data into the data ingestion pipeline. The data ingestion pipeline ensures that the OLAP database data is eventually consistent.

Figure 1: Order Platform database solution overview

Architecture details

OLTP database

There are two categories of OLTP queries, the key-value queries (for example, load by order id) and the batch queries (for example, Get ongoing orders by passenger id). We use DynamoDB as the database to support these OLTP queries.

Why DynamoDB?

  1. Scalable and highly available: the tables of DynamoDB are partitioned and each partition is three-way replicated.
  2. Support for strong consistent reads by primary key.
  3. DynamoDB has a mechanism called adaptive capacity to handle hotkey traffic. Internally, DynamoDB will distribute higher capacity to high-traffic partitions, and isolate frequently accessed items to a dedicated partition. This way, the hotkey can utilise the full capacity of an entire partition, which is up to 3000 read capacity units and 1000 write capacity units.
Figure 2: DynamoDB table structure overview. Source: Amazon Web Services (2019, 28 April)

Each DynamoDB table has many items with attributes, and each item has a partition key and a sort key. The partition key is used for key-value queries, and the sort key is used for range queries. In our case, the table contains multiple order items, and the partition key is the order ID, so we can easily support key-value queries by the partition key.

order_id (PK) | state | pax_id | created_at | pax_id_gsi
order1 | Ongoing | Alice | 9:00am | Alice
order2 | Ongoing | Alice | 9:30am | Alice
order3 | Completed | Alice | 8:30am |
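As a rough sketch (not the Order Platform’s actual code), a key-value lookup against the table above with the AWS SDK for JavaScript could look like the following; the table name and client wiring are assumptions.

```ts
// Sketch: 'Get order by id' as a key-value lookup on the partition key.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const { Item } = await ddb.send(new GetCommand({
  TableName: "orders",          // assumed table name
  Key: { order_id: "order1" },
  ConsistentRead: true,         // strongly consistent read by primary key
}));
```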

Batch queries like ‘Get ongoing orders by passenger id’ are supported by DynamoDB Global Secondary Index (GSI). A GSI is like a normal DynamoDB table, which also has keys and attributes.

In our case, we have a GSI where the partition key is pax_id_gsi and the sort key is created_at. The GSI is maintained by DynamoDB and is eventually consistent with the main table. If the Order Platform queries ongoing orders for Alice, two items will be returned from the GSI.

pax_id_gsi (PK) | created_at (SK) | order_id
Alice | 9:00am | order1
Alice | 9:30am | order2
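A sketch of the “get ongoing orders by passenger id” query against this GSI is shown below; the index name is an assumption, while the key names follow the tables above.

```ts
// Sketch: batch query by passenger id via the GSI (eventually consistent).
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const { Items } = await ddb.send(new QueryCommand({
  TableName: "orders",
  IndexName: "pax_id_gsi-created_at-index",   // hypothetical GSI name
  KeyConditionExpression: "pax_id_gsi = :pax",
  ExpressionAttributeValues: { ":pax": "Alice" },
  ScanIndexForward: true,                      // oldest ongoing order first
}));
```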

We also make use of an advanced GSI feature called a sparse index to support ongoing-order queries. When we update an order’s status from ongoing to completed, we also set pax_id_gsi to empty, so the linked item in the GSI is automatically deleted by DynamoDB. At any time, the GSI only stores ongoing orders. This sparse index mechanism keeps the index small, which improves performance and is more cost effective.
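A sketch of the completion update that also drops the item from the sparse index might look like this; again, the table name is an assumption, and the state attribute is aliased because STATE is a DynamoDB reserved word.

```ts
// Sketch: mark an order completed and remove pax_id_gsi in the same update,
// so DynamoDB deletes the corresponding item from the sparse GSI.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

await ddb.send(new UpdateCommand({
  TableName: "orders",
  Key: { order_id: "order1" },
  UpdateExpression: "SET #state = :completed REMOVE pax_id_gsi",
  ExpressionAttributeNames: { "#state": "state" },
  ExpressionAttributeValues: { ":completed": "Completed" },
}));
```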

The next problem is data retention, which is handled with the DynamoDB Time To Live (TTL) feature: DynamoDB automatically scans for expired items and deletes them. The challenge is that adding TTL to large tables puts a heavy load on the background scanner and might result in an outage. Our solution was to add a TTL attribute only to new items in the table, then manually delete the items without the TTL attribute and run a script to delete items whose TTL attributes are too old. After this process, the table is small enough that we can enable the TTL feature on the attribute we previously added without any concern. The retention period of our DynamoDB data is three months.
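Enabling TTL on the newly added attribute is then a single API call; the attribute name below is an assumption.

```ts
// Sketch: enable TTL on the attribute stamped onto new items
// (an epoch-seconds timestamp roughly three months after creation).
import { DynamoDBClient, UpdateTimeToLiveCommand } from "@aws-sdk/client-dynamodb";

const ddb = new DynamoDBClient({});

await ddb.send(new UpdateTimeToLiveCommand({
  TableName: "orders",
  TimeToLiveSpecification: {
    Enabled: true,
    AttributeName: "expire_at",  // assumed TTL attribute name
  },
}));
```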

Cost-wise, DynamoDB is charged by storage size and provisioned read/write capacity. Provisioned capacity can auto-scale, so we effectively pay on demand, which makes it generally cheaper than RDS for this workload.

OLAP database

We use MySQL RDS as the database to support historical and statistical OLAP queries.

Why not Aurora? We chose RDS mainly because it is a mature database solution. Even though Aurora can provide better high availability, RDS is enough to support our less critical use cases. Cost-wise, Aurora charges for data storage and the number of requested Input/Output Operations per Second (IOPS), while RDS charges only for data storage. As we are using General Purpose (SSD) storage, IOPS come at no extra charge and support up to 16,000 IOPS.

We use MySQL partitioning for data retention. The order table is partitioned by creation time, one partition per month. Since the data access pattern is mostly by month, the partition key reduces cross-partition queries. Partitions older than six months are dropped at the beginning of each month.
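A sketch of this layout in MySQL is shown below, using the mysql2 client; the table, column, and partition names are illustrative rather than Grab’s actual schema. Note that MySQL requires the partitioning column (here, created_at) to be part of every unique key, including the primary key.

```ts
// Sketch: monthly RANGE partitions on the OLAP order table, plus the monthly
// job that drops the oldest partition. Connection details are placeholders.
import mysql from "mysql2/promise";

const conn = await mysql.createConnection({ host: "olap-rds-endpoint", user: "app", database: "orders" });

await conn.query(`
  CREATE TABLE orders (
    order_id   VARCHAR(64)  NOT NULL,
    state      VARCHAR(16)  NOT NULL,
    pax_id     VARCHAR(64)  NOT NULL,
    created_at DATETIME     NOT NULL,
    updated_at DATETIME(6)  NOT NULL,
    PRIMARY KEY (order_id, created_at)
  )
  PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2022_09 VALUES LESS THAN (TO_DAYS('2022-10-01')),
    PARTITION p2022_10 VALUES LESS THAN (TO_DAYS('2022-11-01')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
  )
`);

// Run at the beginning of each month to drop partitions older than six months.
await conn.query(`ALTER TABLE orders DROP PARTITION p2022_09`);
```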

Data ingestion pipeline

Figure 3: Data Ingestion Pipeline Architecture.

A Kafka stream is used to process data in the data ingestion pipeline. We chose Kafka because it offers a 99.95% SLA and is not tied to any particular OLTP or OLAP database type.

Even though Kafka provides a 99.95% SLA, there is still a chance of stream producer failures. When the producer fails, we store the message in Amazon Simple Queue Service (SQS) and retry. If the retry also fails, the message is moved to the SQS dead letter queue (DLQ) to be consumed at a later time.

On the stream consumer side, we use back-off retry at both stream and database levels to ensure consistency. In a worst-case scenario, we can rewind the stream events from Kafka.

It is important for the data ingestion pipeline to handle duplicate messages and out-of-order messages.

Duplicate messages are handled by a database-level unique key (for example, order ID + creation time).

For the out-of-order messages, we implemented the following two mechanisms:

  1. Version update: we only apply an event if it is newer than the data already stored, comparing update timestamps with microsecond precision, which is enough for most of our use cases.
  2. Upsert: if an update event arrives before its create event, we simulate an upsert operation (see the sketch after this list).
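The following is a sketch of how both mechanisms can be expressed as a single conditional upsert on the OLAP side; the table, columns, and client wiring are illustrative, not Grab’s actual implementation, and it assumes the unique key is (order_id, created_at) as described above.

```ts
// Sketch: idempotent, order-tolerant write. The unique (primary) key
// deduplicates messages, and the IF() guards keep only the newest version.
import mysql from "mysql2/promise";

const conn = await mysql.createConnection({ host: "olap-rds-endpoint", user: "app", database: "orders" });

export async function applyOrderEvent(e: {
  orderId: string; state: string; paxId: string; createdAt: string; updatedAt: string;
}) {
  await conn.query(
    `INSERT INTO orders (order_id, state, pax_id, created_at, updated_at)
     VALUES (?, ?, ?, ?, ?)
     ON DUPLICATE KEY UPDATE
       state      = IF(VALUES(updated_at) > updated_at, VALUES(state), state),
       updated_at = IF(VALUES(updated_at) > updated_at, VALUES(updated_at), updated_at)`,
    [e.orderId, e.state, e.paxId, e.createdAt, e.updatedAt]
  );
}
```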

Impact

After launching our solution this year, we have saved significantly on cloud costs. In the earlier solution, the Order Platform wrote synchronously to both DynamoDB and Aurora, and the data was kept forever.

Conclusion

In terms of stability, we use DynamoDB as the critical OLTP database to ensure high availability for online order processing. For scalability, we use RDS as the OLAP database to support our quickly evolving business requirements with a rich set of indexes. Cost efficiency is achieved through data retention in both databases. For consistency, we built a single-source-of-truth OLTP database and an OLAP database that is eventually consistent with the help of the data ingestion pipeline.

What’s next?

Currently, the database solution is running on the production environment. Even though the database solution is proven to be stable, scalable and consistent, we still see some potential areas of improvement.

We use MySQL RDS for OLAP data storage. Even though MySQL is stable and cost effective, it is difficult to serve more complicated queries like free-text search, so we plan to explore other NoSQL databases like Elasticsearch.

We hope this post helps you understand how we store Grab orders and fulfil the queries from the Grab Order Platform.

References

Amazon Web Services. (2019, 28 April) Build with DynamoDB: S1 E1 – Intro to Amazon DynamoDB [Video]. YouTube.

Join us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!