Tag Archives: Financial Services

AWS Clean Rooms ML helps customers and partners apply ML models without sharing raw data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-ml-helps-customers-and-partners-apply-ml-models-without-sharing-raw-data-preview/

Today, we’re introducing AWS Clean Rooms ML (preview), a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. With this new capability, you can generate predictive insights using ML models while continuing to protect your sensitive data.

During this preview, AWS Clean Rooms ML introduces its first model specialized to help companies create lookalike segments for marketing use cases. With AWS Clean Rooms ML lookalike, you can train your own custom model, and you can invite partners to bring a small sample of their records to collaborate and generate an expanded set of similar records while protecting everyone’s underlying data.

In the coming months, AWS Clean Rooms ML will release a healthcare model. This will be the first of many models that AWS Clean Rooms ML will support next year.

AWS Clean Rooms ML helps you to unlock various opportunities for you to generate insights. For example:

  • Airlines can take signals about loyal customers, collaborate with online booking services, and offer promotions to users with similar characteristics.
  • Auto lenders and car insurers can identify prospective auto insurance customers who share characteristics with a set of existing lease owners.
  • Brands and publishers can model lookalike segments of in-market customers and deliver highly relevant advertising experiences.
  • Research institutions and hospital networks can find candidates similar to existing clinical trial participants to accelerate clinical studies (coming soon).

AWS Clean Rooms ML lookalike modeling helps you apply an AWS managed, ready-to-use model that is trained in each collaboration to generate lookalike datasets in a few clicks, saving months of development work to build, train, tune, and deploy your own model.

How to use AWS Clean Rooms ML to generate predictive insights
Today I will show you how to use lookalike modeling in AWS Clean Rooms ML and assume you have already set up a data collaboration with your partner. If you want to learn how to do that, check out the AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data post.

With your collective data in the AWS Clean Rooms collaboration, you can work with your partners to apply ML lookalike modeling to generate a lookalike segment. It works by taking a small sample of representative records from your data, creating a machine learning (ML) model, then applying the particular model to identify an expanded set of similar records from your business partner’s data.

The following screenshot shows the overall workflow for using AWS Clean Rooms ML.

By using AWS Clean Rooms ML, you don’t need to build complex and time-consuming ML models on your own. AWS Clean Rooms ML trains a custom, private ML model, which saves months of your time while still protecting your data.

Eliminating the need to share data
As ML models are natively built within the service, AWS Clean Rooms ML helps you protect your dataset and customer’s information because you don’t need to share your data to build your ML model.

You can specify the training dataset using the AWS Glue Data Catalog table, which contains user-item interactions.

Under Additional columns to train, you can define numerical and categorical data. This is useful if you need to add more features to your dataset, such as the number of seconds spent watching a video, the topic of an article, or the product category of an e-commerce item.

Applying custom-trained AWS-built models
Once you have defined your training dataset, you can now create a lookalike model. A lookalike model is a machine learning model used to find similar profiles in your partner’s dataset without either party having to share their underlying data with each other.

When creating a lookalike model, you need to specify the training dataset. From a single training dataset, you can create many lookalike models. You also have the flexibility to define the date window in your training dataset using Relative range or Absolute range. This is useful when you have data that is constantly updated within AWS Glue, such as articles read by users.

Easy-to-tune ML models
After you create a lookalike model, you need to configure it to use in AWS Clean Rooms collaboration. AWS Clean Rooms ML provides flexible controls that enable you and your partners to tune the results of the applied ML model to garner predictive insights.

On the Configure lookalike model page, you can choose which Lookalike model you want to use and define the Minimum matching seed size you need. This seed size defines the minimum number of profiles in your seed data that overlap with profiles in the training data.

You also have the flexibility to choose whether the partner in your collaboration receives metrics in Metrics to share with other members.

With your lookalike models properly configured, you can now make the ML models available for your partners by associating the configured lookalike model with a collaboration.

Creating lookalike segments
Once the lookalike models have been associated, your partners can now start generating insights by selecting Create lookalike segment and choosing the associated lookalike model for your collaboration.

Here on the Create lookalike segment page, your partners need to provide the Seed profiles. Examples of seed profiles include your top customers or all customers who purchased a specific product. The resulting lookalike segment will contain profiles from the training data that are most similar to the profiles from the seed.

Lastly, your partner will get the Relevance metrics as the result of the lookalike segment using the ML models. At this stage, you can use the Score to make a decision.

Export data and use programmatic API
You also have the option to export the lookalike segment data. Once it’s exported, the data is available in JSON format and you can process this output by integrating with AWS Clean Rooms API and your applications.

Join the preview
AWS Clean Rooms ML is now in preview and available via AWS Clean Rooms in US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London). Support for additional models is in the works.

Learn how to apply machine learning with your partners without sharing underlying data on the AWS Clean Rooms ML page.

Happy collaborating!
— Donnie

Prepare your AWS workloads for the “Operational risks and resilience – banks” FINMA Circular

Post Syndicated from Margo Cronin original https://aws.amazon.com/blogs/security/prepare-your-aws-workloads-for-the-operational-risks-and-resilience-banks-finma-circular/

In December 2022, FINMA, the Swiss Financial Market Supervisory Authority, announced a fully revised circular called Operational risks and resilience – banks that will take effect on January 1, 2024. The circular will replace the Swiss Bankers Association’s Recommendations for Business Continuity Management (BCM), which is currently recognized as a minimum standard. The new circular also adopts the revised principles for managing operational risks, and the new principles on operational resilience, that the Basel Committee on Banking Supervision published in March 2021.

In this blog post, we share key considerations for AWS customers and regulated financial institutions to help them prepare for, and align to, the new circular.

AWS previously announced the publication of the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland. The guide refers to certain rules applicable to financial institutions in Switzerland, including banks, insurance companies, stock exchanges, securities dealers, portfolio managers, trustees, and other financial entities that FINMA oversees (directly or indirectly).

FINMA has previously issued the following circulars to help regulated financial institutions understand approaches to due diligence, third party management, and key technical and organizational controls to be implemented in cloud outsourcing arrangements, particularly for material workloads:

  • 2018/03 FINMA Circular Outsourcing – banks and insurers (31.10.2019)
  • 2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Principal 4 Technology Infrastructure
  • 2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Appendix 3 Handling of electronic Client Identifying Data
  • 2013/03 Auditing (04.11.2020) – Information Technology (21.04.2020)
  • BCM minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

Operational risk management: Critical data

The circular defines critical data as follows:

“Critical data are data that, in view of the institution’s size, complexity, structure, risk profile and business model, are of such crucial significance that they require increased security measures. These are data that are crucial for the successful and sustainable provision of the institution’s services or for regulatory purposes. When assessing and determining the criticality of data, the confidentiality as well as the integrity and availability must be taken into account. Each of these three aspects can determine whether data is classified as critical.”

This definition is consistent with the AWS approach to privacy and security. We believe that for AWS to realize its full potential, customers must have control over their data. This includes the following commitments:

  • Control over the location of your data
  • Verifiable control over data access
  • Ability to encrypt everything everywhere
  • Resilience of AWS

These commitments further demonstrate our dedication to securing your data: it’s our highest priority. We implement rigorous contractual, technical, and organizational measures to help protect the confidentiality, integrity, and availability of your content regardless of which AWS Region you select. You have complete control over your content through powerful AWS services and tools that you can use to determine where to store your data, how to secure it, and who can access it.

You also have control over the location of your content on AWS. For example, in Europe, at the time of publication of this blog post, customers can deploy their data into any of eight Regions (for an up-to-date list of Regions, see AWS Global Infrastructure). One of these Regions is the Europe (Zurich) Region, also known by its API name ‘eu-central-2’, which customers can use to store data in Switzerland. Additionally, Swiss customers can rely on the terms of the AWS Swiss Addendum to the AWS Data Processing Addendum (DPA), which applies automatically when Swiss customers use AWS services to process personal data under the new Federal Act on Data Protection (nFADP).

AWS continually monitors the evolving privacy, regulatory, and legislative landscape to help identify changes and determine what tools our customers might need to meet their compliance requirements. Maintaining customer trust is an ongoing commitment. We strive to inform you of the privacy and security policies, practices, and technologies that we’ve put in place. Our commitments, as described in the Data Privacy FAQ, include the following:

  • Access – As a customer, you maintain full control of your content that you upload to the AWS services under your AWS account, and responsibility for configuring access to AWS services and resources. We provide an advanced set of access, encryption, and logging features to help you do this effectively (for example, AWS Identity and Access ManagementAWS Organizations, and AWS CloudTrail). We provide APIs that you can use to configure access control permissions for the services that you develop or deploy in an AWS environment. We never use your content or derive information from it for marketing or advertising purposes.
  • Storage – You choose the AWS Regions in which your content is stored. You can replicate and back up your content in more than one Region. We will not move or replicate your content outside of your chosen AWS Regions except as agreed with you.
  • Security – You choose how your content is secured. We offer you industry-leading encryption features to protect your content in transit and at rest, and we provide you with the option to manage your own encryption keys. These data protection features include:
  • Disclosure of customer content – We will not disclose customer content unless we’re required to do so to comply with the law or a binding order of a government body. If a governmental body sends AWS a demand for your customer content, we will attempt to redirect the governmental body to request that data directly from you. If compelled to disclose your customer content to a governmental body, we will give you reasonable notice of the demand to allow the customer to seek a protective order or other appropriate remedy, unless AWS is legally prohibited from doing so.
  • Security assurance – We have developed a security assurance program that uses current recommendations for global privacy and data protection to help you operate securely on AWS, and to make the best use of our security control environment. These security protections and control processes are independently validated by multiple third-party independent assessments, including the FINMA International Standard on Assurance Engagements (ISAE) 3000 Type II attestation report.

Additionally, FINMA guidelines lay out requirements for the written agreement between a Swiss financial institution and its service provider, including access and audit rights. For Swiss financial institutions that run regulated workloads on AWS, we offer the Swiss Financial Services Addendum to address the contractual and audit requirements of the FINMA guidelines. We also provide these institutions the ability to comply with the audit requirements in the FINMA guidelines through the AWS Security & Audit Series, including participation in an Audit Symposium, to facilitate customer audits. To help align with regulatory requirements and expectations, our FINMA addendum and audit program incorporate feedback that we’ve received from a variety of financial supervisory authorities across EU member states. To learn more about the Swiss Financial Services addendum or about the audit engagements offered by AWS, reach out to your AWS account team.


Customers need control over their workloads and high availability to help prepare for events such as supply chain disruptions, network interruptions, and natural disasters. Each AWS Region is composed of multiple Availability Zones (AZs). An Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. To better isolate issues and achieve high availability, you can partition applications across multiple AZs in the same Region. If you are running workloads on premises or in intermittently connected or remote use cases, you can use our services that provide specific capabilities for offline data and remote compute and storage. We will continue to enhance our range of sovereign and resilient options, to help you sustain operations through disruption or disconnection.

FINMA incorporates the principles of operational resilience in the newest circular 2023/01. In line with the efforts of the European Commission’s proposal for the Digital Operational Resilience Act (DORA), FINMA outlines requirements for regulated institutions to identify critical functions and their tolerance for disruption. Continuity of service, especially for critical economic functions, is a key prerequisite for financial stability. AWS recognizes that financial institutions need to comply with sector-specific regulatory obligations and requirements regarding operational resilience. AWS has published the whitepaper Amazon Web Services’ Approach to Operational Resilience in the Financial Sector and Beyond, in which we discuss how AWS and customers build for resiliency on the AWS Cloud. AWS provides resilient infrastructure and services, which financial institution customers can rely on as they design their applications to align with FINMA regulatory and compliance obligations.

AWS previously announced the third issuance of the FINMA ISAE 3000 Type II attestation report. Customers can access the entire report in AWS Artifact. To learn more about the list of certified services and Regions, see the FINMA ISAE 3000 Type 2 Report and AWS Services in Scope for FINMA.

AWS is committed to adding new services into our future FINMA program scope based on your architectural and regulatory needs. If you have questions about the FINMA report, or how your workloads on AWS align to the FINMA obligations, contact your AWS account team. We will also help support customers as they look for new ways to experiment, remain competitive, meet consumer expectations, and develop new products and services on AWS that align with the new regulatory framework.

To learn more about our compliance, security programs and common privacy and data protection considerations, see AWS Compliance Programs and the dedicated AWS Compliance Center for Switzerland. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security, Identity, & Compliance re:Post or contact AWS Support.

Margo Cronin

Margo Cronin

Margo is an EMEA Principal Solutions Architect specializing in security and compliance. She is based out of Zurich, Switzerland. Her interests include security, privacy, cryptography, and compliance. She is passionate about her work unblocking security challenges for AWS customers, enabling their successful cloud journeys. She is an author of AWS User Guide to Financial Services Regulations and Guidelines in Switzerland.

Raphael Fuchs

Raphael Fuchs

Raphael is a Senior Security Solutions Architect based in Zürich, Switzerland, who helps AWS Financial Services customers meet their security and compliance objectives in the AWS Cloud. Raphael has a background as Chief Information Security Officer in the Swiss FSI sector and is an author of AWS User Guide to Financial Services Regulations and Guidelines in Switzerland.

New – Move Payment Processing to the Cloud with AWS Payment Cryptography

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-move-payment-processing-to-the-cloud-with-aws-payment-cryptography/

Cryptography is everywhere in our daily lives. If you’re reading this blog, you’re using HTTPS, an extension of HTTP that uses encryption to secure communications. On AWS, multiple services and capabilities help you manage keys and encryption, such as:

HSMs are physical devices that securely protect cryptographic operations and the keys used by these operations. HSMs can help you meet your corporate, contractual, and regulatory compliance requirements. With CloudHSM, you have access to general-purpose HSMs. When payments are involved, there are specific payment HSMs that offer capabilities such as generating and validating the personal identification number (PIN) and the security code of a credit or debit card.

Today, I am happy to share the availability of AWS Payment Cryptography, an elastic service that manages payment HSMs and keys for payment processing applications in the cloud.

Applications using payments HSMs have challenging requirements because payment processing is complex, time sensitive, and highly regulated and requires the interaction of multiple financial service providers and payment networks. Every time you make a payment, data is exchanged between two or more financial service providers and must be decrypted, transformed, and encrypted again with a unique key at each step.

This process requires highly performant cryptography capabilities and key management procedures between each payment service provider. These providers might have thousands of keys to protect, manage, rotate, and audit, making the overall process expensive and difficult to scale. To add to that, payment HSMs historically employ complex and error-prone processes, such as exchanging keys in a secure room using multiple hand-carried paper forms, each with separate key components printed on them.

Introducing AWS Payment Cryptography
AWS Payment Cryptography simplifies your implementation of cryptographic functions and key management used to secure data in payment processing in accordance with various payment card industry (PCI) standards.

With AWS Payment Cryptography, you can eliminate the need to provision and manage on-premises payment HSMs and use the provided tools to avoid error-prone key exchange processes. For example, with AWS Payment Cryptography, payment and financial service providers can begin development within minutes and plan to exchange keys electronically, eliminating manual processes.

To provide its elastic cryptographic capabilities in a compliant manner, AWS Payment Cryptography uses HSMs with PCI PTS HSM device approval. These capabilities include encryption and decryption of card data, key creation, and pin translation. AWS Payment Cryptography is also designed in accordance with PCI security standards such as PCI DSS, PCI PIN, and PCI P2PE, and it provides evidence and reporting to help meet your compliance needs.

You can import and export symmetric keys between AWS Payment Cryptography and on-premises HSMs under key encryption key (KEKs) using the ANSI X9 TR-31 protocol. You can also import and export symmetric KEKs with other systems and devices using the ANSI X9 TR-34 protocol, which allows the service to exchange symmetric keys using asymmetric techniques.

To simplify moving consumer payment processing to the cloud, existing card payment applications can use AWS Payment Cryptography through the AWS SDKs. In this way, you can use your favorite programming language, such as Java or Python, instead of vendor-specific ASCII interfaces over TCP sockets, as is common with payment HSMs.

Access can be authorized using AWS Identity and Access Management (IAM) identity-based policies, where you can specify which actions and resources are allowed or denied and under which conditions.

Monitoring is important to maintain the reliability, availability, and performance needed by payment processing. With AWS Payment Cryptography, you can use Amazon CloudWatch, AWS CloudTrail, and Amazon EventBridge to understand what is happening, report when something is wrong, and take automatic actions when appropriate.

Let’s see how this works in practice.

Using AWS Payment Cryptography
Using the AWS Command Line Interface (AWS CLI), I create a double-length 3DES key to be used as a card verification key (CVK). A CVK is a key used for generating and verifying card security codes such as CVV, CVV2, and similar values.

Note that there are two commands for the CLI (and similarly two endpoints for API and SDKs):

  • payment-cryptography for control plane operation such as listing and creating keys and aliases.
  • payment-cryptography-data for cryptographic operations that use keys, for example, to generate PIN or card validation data.

Creating a key is a control plane operation:

aws payment-cryptography create-key \
    --no-exportable \
    --key-attributes KeyAlgorithm=TDES_2KEY,
    "Key": {
        "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
        "KeyAttributes": {
            "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
            "KeyClass": "SYMMETRIC_KEY",
            "KeyAlgorithm": "TDES_2KEY",
            "KeyModesOfUse": {
                "Encrypt": false,
                "Decrypt": false,
                "Wrap": false,
                "Unwrap": false,
                "Generate": true,
                "Sign": false,
                "Verify": true,
                "DeriveKey": false,
                "NoRestrictions": false
        "KeyCheckValue": "B2DD4E",
        "KeyCheckValueAlgorithm": "ANSI_X9_24",
        "Enabled": true,
        "Exportable": false,
        "KeyState": "CREATE_COMPLETE",
        "KeyOrigin": "AWS_PAYMENT_CRYPTOGRAPHY",
        "CreateTimestamp": "2023-05-26T14:25:48.240000+01:00",
        "UsageStartTimestamp": "2023-05-26T14:25:48.220000+01:00"

To reference this key in the next steps, I can use the Amazon Resource Name (ARN) as found in the KeyARN property, or I can create an alias. An alias is a friendly name that lets me refer to a key without having to use the full ARN. I can update an alias to refer to a different key. When I need to replace a key, I can just update the alias without having to change the configuration or the code of your applications. To be recognized easily, alias names start with alias/. For example, the following command creates the alias alias/my-key for the key I just created:

aws payment-cryptography create-alias --alias-name alias/my-key \
    --key-arn arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h
    "Alias": {
        "AliasName": "alias/my-key",
        "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h"

Before I start using the new key, I list all my keys to check their status:

aws payment-cryptography list-keys
    "Keys": [
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123421341234:key/42cdc4ocf45mg54h",
            "KeyAttributes": {
                "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
                "KeyClass": "SYMMETRIC_KEY",
                "KeyAlgorithm": "TDES_2KEY",
                "KeyModesOfUse": {
                    "Encrypt": false,
                    "Decrypt": false,
                    "Wrap": false,
                    "Unwrap": false,
                    "Generate": true,
                    "Sign": false,
                    "Verify": true,
                    "DeriveKey": false,
                    "NoRestrictions": false
            "KeyCheckValue": "B2DD4E",
            "Enabled": true,
            "Exportable": false,
            "KeyState": "CREATE_COMPLETE"
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/ok4oliaxyxbjuibp",
            "KeyAttributes": {
                "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
                "KeyClass": "SYMMETRIC_KEY",
                "KeyAlgorithm": "TDES_2KEY",
                "KeyModesOfUse": {
                    "Encrypt": false,
                    "Decrypt": false,
                    "Wrap": false,
                    "Unwrap": false,
                    "Generate": true,
                    "Sign": false,
                    "Verify": true,
                    "DeriveKey": false,
                    "NoRestrictions": false
            "KeyCheckValue": "905848",
            "Enabled": true,
            "Exportable": false,
            "KeyState": "DELETE_PENDING"

As you can see, there is another key I created before, which has since been deleted. When a key is deleted, it is marked for deletion (DELETE_PENDING). The actual deletion happens after a configurable period (by default, 7 days). This is a safety mechanism to prevent the accidental or malicious deletion of a key. Keys marked for deletion are not available for use but can be restored.

In a similar way, I list all my aliases to see to which keys they are they referring:

aws payment-cryptography list-aliases
    "Aliases": [
            "AliasName": "alias/my-key",
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h"

Now, I use the key to generate a card security code with the CVV2 authentication system. You might be familiar with CVV2 numbers that are usually written on the back of a credit card. This is the way they are computed. I provide as input the primary account number of the credit card, the card expiration date, and the key from the previous step. To specify the key, I use its alias. This is a data plane operation:

aws payment-cryptography-data generate-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --generation-attributes CardVerificationValue2={CardExpiryDate=0124}
    "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
    "KeyCheckValue": "B2DD4E",
    "ValidationData": "343"

I take note of the three digits in the ValidationData property. When processing a payment, I can verify that the card data value is correct:

aws payment-cryptography-data verify-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --verification-attributes CardVerificationValue2={CardExpiryDate=0124} \
    --validation-data 343
    "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
    "KeyCheckValue": "B2DD4E"

The verification is successful, and in return I get back the same KeyCheckValue as when I generated the validation data.

As you might expect, if I use the wrong validation data, the verification is not successful, and I get back an error:

aws payment-cryptography-data verify-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --verification-attributes CardVerificationValue2={CardExpiryDate=0124} \
    --validation-data 999

An error occurred (com.amazonaws.paymentcryptography.exception#VerificationFailedException)
when calling the VerifyCardValidationData operation:
Card validation data verification failed

In the AWS Payment Cryptography console, I choose View Keys to see the list of keys.

Console screenshot.

Optionally, I can enable more columns, for example, to see the key type (symmetric/asymmetric) and the algorithm used.

Console screenshot.

I choose the key I used in the previous example to get more details. Here, I see the cryptographic configuration, the tags assigned to the key, and the aliases that refer to this key.

Console screenshot.

AWS Payment Cryptography supports many more operations than the ones I showed here. For this walkthrough, I used the AWS CLI. In your applications, you can use AWS Payment Cryptography through any of the AWS SDKs.

Availability and Pricing
AWS Payment Cryptography is available today in the following AWS Regions: US East (N. Virginia) and US West (Oregon).

With AWS Payment Cryptography, you only pay for what you use based on the number of active keys and API calls with no up-front commitment or minimum fee. For more information, see AWS Payment Cryptography pricing.

AWS Payment Cryptography removes your dependencies on dedicated payment HSMs and legacy key management systems, simplifying your integration with AWS native APIs. In addition, by operating the entire payment application in the cloud, you can minimize round-trip communications and latency.

Move your payment processing applications to the cloud with AWS Payment Cryptography.


Patterns for enterprise data sharing at scale

Post Syndicated from Venkata Sistla original https://aws.amazon.com/blogs/big-data/patterns-for-enterprise-data-sharing-at-scale/

Data sharing is becoming an important element of an enterprise data strategy. AWS services like AWS Data Exchange provide an avenue for companies to share or monetize their value-added data with other companies. Some organizations would like to have a data sharing platform where they can establish a collaborative and strategic approach to exchange data with a restricted group of companies in a closed, secure, and exclusive environment. For example, financial services companies and their auditors, or manufacturing companies and their supply chain partners. This fosters development of new products and services and helps improve their operational efficiency.

Data sharing is a team effort, it’s important to note that in addition to establishing the right infrastructure, successful data sharing also requires organizations to ensure that business owners sponsor data sharing initiatives. They also need to ensure that data is of high quality. Data platform owners and security teams should encourage proper data use and fix any privacy and confidentiality issues.

This blog discusses various data sharing options and common architecture patterns that organizations can adopt to set up their data sharing infrastructure based on AWS service availability and data compliance.

Data sharing options and data classification types

Organizations operate across a spectrum of security compliance constraints. For some organizations, it’s possible to use AWS services like AWS Data Exchange. However, organizations working in heavily regulated industries like federal agencies or financial services might be limited by the allow listed AWS service options. For example, if an organization is required to operate in a Fedramp Medium or Fedramp High environment, their options to share data may be limited by the AWS services that are available and have been allow listed. Service availability is based on platform certification by AWS, and allow listing is based on the organizations defining their security compliance architecture and guidelines.

The kind of data that the organization wants to share with its partners may also have an impact on the method used for data sharing. Complying with data classification rules may further limit their choice of data sharing options they may choose.

The following are some general data classification types:

  • Public data – Important information, though often freely available for people to read, research, review and store. It typically has the lowest level of data classification and security.
  • Private data – Information you might want to keep private like email inboxes, cell phone content, employee identification numbers, or employee addresses. If private data were shared, destroyed, or altered, it might pose a slight risk to an individual or the organization.
  • Confidential or restricted data – A limited group of individuals or parties can access sensitive information often requiring special clearance or special authorization. Confidential or restricted data access might involve aspects of identity and authorization management. Examples of confidential data include Social Security numbers and vehicle identification numbers.

The following is a sample decision tree that you can refer to when choosing your data sharing option based on service availability, classification type, and data format (structured or unstructured). Other factors like usability, multi-partner accessibility, data size, consumption patterns like bulk load/API access, and more may also affect the choice of data sharing pattern.


In the following sections, we discuss each pattern in more detail.

Pattern 1: Using AWS Data Exchange

AWS Data Exchange makes exchanging data easier, helping organizations lower costs, become more agile, and innovate faster. Organizations can choose to share data privately using AWS Data Exchange with their external partners. AWS Data Exchange offers perimeter controls that are applied at identity and resource levels. These controls decide which external identities have access to specific data resources. AWS Data Exchange provides multiple different patterns for external parties to access data, such as the following:

The following diagram illustrates an example architecture.


With AWS Data Exchange, once the dataset to share (or sell) is configured, AWS Data Exchange automatically manages entitlements (and billing) between the producer and the consumer. The producer doesn’t have to manage policies, set up new access points, or create new Amazon Redshift data shares for each consumer, and access is automatically revoked if the subscription ends. This can significantly reduce the operational overhead in sharing data.

Pattern 2: Using AWS Lake Formation for centralized access management

You can use this pattern in cases where both the producer and consumer are on the AWS platform with an AWS account that is enabled to use AWS Lake Formation. This pattern provides a no-code approach to data sharing. The following diagram illustrates an example architecture.


In this pattern, the central governance account has Lake Formation configured for managing access across the producer’s org accounts. Resource links from the production account Amazon Simple Storage Service (Amazon S3) bucket are created in Lake Formation. The producer grants Lake Formation permissions on an AWS Glue Data Catalog resource to an external account, or directly to an AWS Identity and Access Management (IAM) principal in another account. Lake Formation uses AWS Resource Access Manager (AWS RAM) to share the resource. If the grantee account is in the same organization as the grantor account, the shared resource is available immediately to the grantee. If the grantee account is not in the same organization, AWS RAM sends an invitation to the grantee account to accept or reject the resource grant. To make the shared resource available, the consumer administrator in the grantee account must use the AWS RAM console or AWS Command Line Interface (AWS CLI) to accept the invitation.

Authorized principals can share resources explicitly with an IAM principal in an external account. This feature is useful when the producer wants to have control over who in the external account can access the resources. The permissions the IAM principal receives are a union of direct grants and the account-level grants that are cascaded down to the principals. The data lake administrator of the recipient account can view the direct cross-account grants, but can’t revoke permissions.

Pattern 3: Using AWS Lake Formation from the producer external sharing account

The producer may have stringent security requirements where no external consumer should access their production account or their centralized governance account. They may also not have Lake Formation enabled on their production platform. In such cases, as shown in the following diagram, the producer production account (Account A) is dedicated to its internal organization users. The producer creates another account, the producer external sharing account (Account B), which is dedicated for external sharing. This gives the producer more latitude to create specific policies for specific organizations.

The following architecture diagram shows an overview of the pattern.


The producer implements a process to create an asynchronous copy of data in Account B. The bucket can be configured for Same Region Replication (SRR) or Cross Region Replication (CRR) for objects that need to be shared. This facilitates automated refresh of data to the external account to the “External Published Datasets” S3 bucket without having to write any code.

Creating a copy of the data allows the producer to add another degree of separation between the external consumer and its production data. It also helps meet any compliance or data sovereignty requirements.

Lake Formation is set up on Account B, and the administrator creates resources links for the “External Published Datasets” S3 bucket in its account to grant access. The administrator follows the same process to grant access as described earlier.

Pattern 4: Using Amazon Redshift data sharing

This pattern is ideally suited for a producer who has most of their published data products on Amazon Redshift. This pattern also requires the producer’s external sharing account (Account B) and the consumer account (Account C) to have an encrypted Amazon Redshift cluster or Amazon Redshift Serverless endpoint that meets the prerequisites for Amazon Redshift data sharing.

The following architecture diagram shows an overview of the pattern.


Two options are possible depending on the producer’s compliance constraints:

  • Option A – The producer enables data sharing directly on the production Amazon Redshift cluster.
  • Option B – The producer may have constraints with respect to sharing the production cluster. The producer creates a simple AWS Glue job that copies data from the Amazon Redshift cluster in the production Account A to the Amazon Redshift cluster in the external Account B. This AWS Glue job can be scheduled to refresh data as needed by the consumer. When the data is available in Account B, the producer can create multiple views and multiple data shares as needed.

In both options, the producer maintains complete control over what data is being shared, and the consumer admin maintains full control over who can access the data within their organization.

After both the producer and consumer admins approve the data sharing request, the consumer user can access this data as if it were part of their own account without have to write any additional code.

Pattern 5: Sharing data securely and privately using APIs

You can adopt this pattern when the external partner doesn’t have a presence on AWS. You can also use this pattern when published data products are spread across various services like Amazon S3, Amazon Redshift, Amazon DynamoDB, and Amazon OpenSearch Service but the producer would like to maintain a single data sharing interface.

Here’s an example use case: Company A would like to share some of its log data in near-real time with its partner Company B, who uses this data to generate predictive insights for Company A. Company A stores this data in Amazon Redshift. The company wants to share this transactional information with its partner after masking the personally identifiable information (PII) in a cost-effective and secure way to generate insights. Company B doesn’t use the AWS platform.

Company A establishes a microbatch process using an AWS Lambda function or AWS Glue that queries Amazon Redshift to get incremental log data, applies the rules to redact the PII, and loads this data to the “Published Datasets” S3 bucket. This instantiates an SRR/CRR process that refreshes this data in the “External Sharing” S3 bucket.

The following diagram shows how the consumer can then use an API-based approach to access this data.


The workflow contains the following steps:

  1. An HTTPS API request is sent from the API consumer to the API proxy layer.
  2. The HTTPS API request is forwarded from the API proxy to Amazon API Gateway in the external sharing AWS account.
  3. Amazon API Gateway calls the request receiver Lambda function.
  4. The request receiver function writes the status to a DynamoDB control table.
  5. A second Lambda function, the poller, checks the status of the results in the DynamoDB table.
  6. The poller function fetches results from Amazon S3.
  7. The poller function sends a presigned URL to download the file from the S3 bucket to the requestor via Amazon Simple Email Service (Amazon SES).
  8. The requestor downloads the file using the URL.
  9. The network perimeter AWS account only allows egress internet connection.
  10. The API proxy layer enforces both the egress security controls and perimeter firewall before the traffic leaves the producer’s network perimeter.
  11. The AWS Transit Gateway security egress VPC routing table only allows connectivity from the required producer’s subnet, while preventing internet access.

Pattern 6: Using Amazon S3 access points

Data scientists may need to work collaboratively on image, videos, and text documents. Legal and audit groups may want to share reports and statements with the auditing agencies. This pattern discusses an approach to sharing such documents. The pattern assumes that the external partners are also on AWS. Amazon S3 access points allow the producer to share access with their consumer by setting up cross-account access without having to edit bucket policies.

Access points are named network endpoints that are attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject. Each access point has distinct permissions and network controls that Amazon S3 applies for any request that is made through that access point. Each access point enforces a customized access point policy that works in conjunction with the bucket policy attached to the underlying bucket.

The following architecture diagram shows an overview of the pattern.


The producer creates an S3 bucket and enables the use of access points. As part of the configuration, the producer specifies the consumer account, IAM role, and privileges for the consumer IAM role.

The consumer users with the IAM role in the consumer account can access the S3 bucket via the internet or restricted to an Amazon VPC via VPC endpoints and AWS PrivateLink.


Each organization has its unique set of constraints and requirements that it needs to fulfill to set up an efficient data sharing solution. In this post, we demonstrated various options and best practices available to organizations. The data platform owner and security team should work together to assess what works best for your specific situation. Your AWS account team is also available to help.

Related resources

For more information on related topics, refer to the following:

About the Authors

Venkata Sistla
is a Cloud Architect – Data & Analytics at AWS. He specializes in building data processing capabilities and helping customers remove constraints that prevent them from leveraging their data to develop business insights.

Santosh Chiplunkar is a Principal Resident Architect at AWS. He has over 20 years of experience helping customers solve their data challenges. He helps customers develop their data and analytics strategy and provides them with guidance on how to make it a reality.

2022 FINMA ISAE 3000 Type II attestation report now available with 154 services in scope

Post Syndicated from Daniel Fuertes original https://aws.amazon.com/blogs/security/2022-finma-isae-3000-type-ii-attestation-report-now-available-with-154-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the third issuance of the Swiss Financial Market Supervisory Authority (FINMA) International Standard on Assurance Engagements (ISAE) 3000 Type II attestation report. The scope of the report covers a total of 154 services and 24 global AWS Regions.

The latest FINMA ISAE 3000 Type II report covers the period from October 1, 2021, to September 30, 2022. AWS continues to assure Swiss financial industry customers that our control environment is capable of effectively addressing key operational, outsourcing, and business continuity management risks.

FINMA circulars

The report covers the five core FINMA circulars regarding outsourcing arrangements to the cloud. FINMA circulars help Swiss-regulated financial institutions to understand the approaches FINMA takes when implementing due diligence, third-party management, and key technical and organizational controls for cloud outsourcing arrangements, particularly for material workloads.

The scope of the report covers the following requirements of the FINMA circulars:

  • 2018/03 Outsourcing – Banks, insurance companies and selected financial institutions under FinIA
  • 2008/21 Operational Risks – Banks – Principle 4 Technology Infrastructure (31.10.2019)
  • 2008/21 Operational Risks – Banks – Appendix 3 Handling of Electronic Client Identifying Data (31.10.2019)
  • 2013/03 Auditing – Information Technology (04.11.2020)
  • 2008/10 Self-regulation as a minimum standard – Minimum Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

It is our pleasure to announce the addition of 16 services and two Regions to the FINMA ISAE 3000 Type II attestation scope. The following are a few examples of the additional security services in scope:

  • AWS CloudShell – A browser-based shell that makes it simple to manage, explore, and interact with your AWS resources. With CloudShell, you can quickly run scripts with the AWS Command Line Interface (AWS CLI), experiment with AWS service APIs by using the AWS SDKs, or use a range of other tools to be productive.
  • Amazon HealthLake – A HIPAA-eligible service that offers healthcare and life sciences companies a chronological view of individual or patient population health data for query and analytics at scale.
  • AWS IoT SiteWise – A managed service that simplifies collecting, organizing, and analyzing industrial equipment data.
  • Amazon DevOps Guru – A service that uses machine learning to detect abnormal operating patterns to help you identify operational issues before they impact your customers.

Customers can continue to reference the FINMA workbooks, which include detailed control mappings for each FINMA circular covered under this audit report, through AWS Artifact. Customers can also find the entire FINMA report on AWS Artifact. To learn more about the list of certified services and Regions, see AWS Compliance Programs and AWS Services in Scope for FINMA.

As always, AWS is committed to adding new services into our future FINMA program scope based on your architectural and regulatory needs. If you have questions about the FINMA report, contact your AWS account team.

If you have feedback about this post, please submit them in the Comments section below.
Want more AWS Security news? Follow us on Twitter.


Daniel Fuertes

Daniel is a Security Audit Program Manager at AWS based in Madrid, Spain. Daniel leads multiple security audits, attestations, and certification programs in Spain and other EMEA countries. Daniel has 8 years of experience in security assurance and previously worked as an auditor for PCI DSS security framework.

How AWS Data Lab helped BMW Financial Services design and build a multi-account modern data architecture

Post Syndicated from Rahul Shaurya original https://aws.amazon.com/blogs/big-data/how-aws-data-lab-helped-bmw-financial-services-design-and-build-a-multi-account-modern-data-architecture/

This post is co-written by Martin Zoellner, Thomas Ehrlich and Veronika Bogusch from BMW Group.

BMW Group and AWS announced a comprehensive strategic collaboration in 2020. The goal of the collaboration is to further accelerate BMW Group’s pace of innovation by placing data and analytics at the center of its decision-making. A key element of the collaboration is the further development of the Cloud Data Hub (CDH) of BMW Group. This is the central platform for managing company-wide data and data solutions in the cloud. At the AWS re:Invent 2019 session, BMW and AWS demonstrated the new Cloud Data Hub platform by outlining different archetypes of data platforms and then walking through the journey of building BMW Group’s Cloud Data Hub. To learn more about the Cloud Data Hub, refer to BMW Cloud Data Hub: A reference implementation of the modern data architecture on AWS.

As part of this collaboration, BMW Group is migrating hundreds of data sources across several data domains to the Cloud Data Hub. Several of these sources pertain to BMW Financial Services.

In this post, we talk about how the AWS Data Lab is helping BMW Financial Services build a regulatory reporting application for one of the European BMW market using the Cloud Data Hub on AWS.

Solution overview

In the context of regulatory reporting, BMW Financial Services works with critical financial services data that contains personally identifiable information (PII). We need to provide monthly insights on our financial data to one of the European National Regulator, and we also need to be compliant with the Schrems II and GDPR regulations as we process PII data. This requires the PII to be pseudonymized when it’s loaded into the Cloud Data Hub, and it has to be processed further in pseudonymized form. For an overview of pseudonymization process, check out Build a pseudonymization service on AWS to protect sensitive data .

To address these requirements in a precise and efficient way, BMW Financial Services decided to engage with the AWS Data Lab. The AWS Data Lab has two offerings: the Design Lab and the Build Lab.

Design Lab

The Design Lab is a 1-to-2-day engagement for customers who need a real-world architecture recommendation based on AWS expertise, but aren’t ready to build. In the case of BMW Financial Services, before beginning the build phase, it was key to get all the stakeholders in the same room and record all the functional and non-functional requirements introduced by all the different parties that might influence the data platform—from owners of the various data sources to end-users that would use the platform to run analytics and get business insights. Within the scope of the Design Lab, we discussed three use cases:

  • Regulatory reporting – The top priority for BMW Financial Services was the regulatory reporting use case, which involves collecting and calculating data and reports that will be declared to the National Regulator.
  • Local data warehouse – For this use case, we need to calculate and store all key performance indicators (KPIs) and key value indicators (KVIs) that will be defined during the project. The historical data needs to be stored, but we need to apply a pseudonymization process to respect GDPR directives. Moreover, historical data has to be accessed on a daily basis through a tableau visualization tool. Regarding the structure, it would be valuable to define two levels (at minimum): one at the contract level to justify the calculation of all KPIs, and another at an aggregated level to optimize restitutions. Personal data is limited in the application, but a reidentification process must be possible for authorized consumption patterns.
  • Accounting details – This use case is based on the BMW accounting tool IFT, which provides the accounting balance at the contract level from all local market applications. It must run at least once a month. However, if some issues are identified on IFT during closing, we must be able to restart it and erase the previous run. When the month-end closing is complete, this use case has to keep the last accounting balance version generated during the month and store it. In parallel, all accounting balance versions have to be accessible by other applications for queries and be able to retrieve the information for 24 months.

Design Lab Solution Architecture

Based on these requirements, we developed the following architecture during the Design Lab.

This solution contains the following components:

  1. The main data source that hydrates our three use cases is the already available in the Cloud Data Hub. The Cloud Data Hub uses AWS Lake Formation resource links to grant access to the dataset to the consumer accounts.
  2. For standard, periodic ETL (extract, transform, and load) jobs that involve operations such as converting data types, or creating labels based on numerical values or Boolean flags based on a label, we used AWS Glue ETL jobs.
  3. For historical ETL jobs or more complex calculations such as in the account details use case, which may involve huge joins with custom configurations and tuning, we recommended to use Amazon EMR. This gives you the opportunity to control cluster configurations at a fine-grained level.
  4. To store job metadata that enables features such as reprocessing inputs or rerunning failed jobs, we recommended building a data registry. The goal of the data registry is to create a centralized inventory for any data being ingested in the data lake. A schedule-based AWS Lambda function could be triggered to register data landing on the semantic layer of the Cloud Data Hub in a centralized metadata store. We recommended using Amazon DynamoDB for the data registry.
  5. Amazon Simple Storage Service (Amazon S3) serves as the storage mechanism that powers the regulatory reporting use case using the data management framework Apache Hudi. Apache Hudi is useful for our use cases because we need to develop data pipelines where record-level insert, update, upsert, and delete capabilities are desirable. Hudi tables are supported by both Amazon EMR and AWS Glue jobs via the Hudi connector, along with query engines such as Amazon Athena and Amazon Redshift Spectrum.
  6. As part of the data storing process in the regulatory reporting S3 bucket, we can populate the AWS Glue Data Catalog with the required metadata.
  7. Athena provides an ad hoc query environment for interactive analysis of data stored in Amazon S3 using standard SQL. It has an out-of-the-box integration with the AWS Glue Data Catalog.
  8. For the data warehousing use case, we need to first de-normalize data to create a dimensional model that enables optimized analytical queries. For that conversion, we use AWS Glue ETL jobs.
  9. Dimensional data marts in Amazon Redshift enable our dashboard and self-service reporting needs. Data in Amazon Redshift is organized into several subject areas that are aligned with the business needs, and a dimensional model allows for cross-subject area analysis.
  10. As a by-product of creating an Amazon Redshift cluster, we can use Redshift Spectrum to access data in the regulatory reporting bucket of the architecture. It acts as a front to access more granular data without actually loading it in the Amazon Redshift cluster.
  11. The data provided to the Cloud Data Hub contains personal data that is pseudonymized. However, we need our pseudonymized columns to be re-personalized when visualizing them on Tableau or when generating CSV reports. Both Athena and Amazon Redshift support Lambda UDFs, which can be used to access Cloud Data Hub PII APIs to re-personalize the pseudonymized columns before presenting them to end-users.
  12. Both Athena and Amazon Redshift can be accessed via JDBC (Java Database Connectivity) to provide access to data consumers.
  13. We can use a Python shell job in AWS Glue to run a query against either of our analytics solutions, convert the results to the required CSV format, and store them to a BMW secured folder.
  14. Any business intelligence (BI) tool deployed on premises can connect to both Athena and Amazon Redshift and use their query engines to perform any heavy computation before it receives the final data to fuel its dashboards.
  15. For the data pipeline orchestration, we recommended using AWS Step Functions because of its low-code development experience and its full integration with all the other components discussed.

With the preceding architecture as our long-term target state, we concluded the Design Lab and decided to return for a Build Lab to accelerate solution development.

Preparing for Build Lab

The typical preparation of a Build Lab that follows a Design Lab involves identifying a few examples of common use case patterns, typically the more complex ones. To maximize the success in the Build Lab, we reduce the long-term target architecture to a subset of components that addresses those examples and can be implemented within a 3-to-5-day intense sprint.

For a successful Build Lab, we also need to identify and resolve any external dependencies, such as network connectivity to data sources and targets. If that isn’t feasible, then we find meaningful ways to mock them. For instance, to make the prototype closer to what the production environment would look like, we decided to use separate AWS accounts for each use case, based on the existing team structure of BMW, and use a consumer S3 bucket instead of BMW network-attached storage (NAS).

Build Lab

The BMW team set aside 4 days for their Build Lab. During that time, their dedicated Data Lab Architect worked alongside the team, helping them to build the following prototype architecture.

Build Lab Solution

This solution includes the following components:

  1. The first step was to synchronize the AWS Glue Data Catalog of the Cloud Data Hub and regulatory reporting accounts.
  2. AWS Glue jobs running on the regulatory reporting account had access to the data in the Cloud Data Hub resource accounts. During the Build Lab, the BMW team implemented ETL jobs for six tables, addressing insert, update, and delete record requirements using Hudi.
  3. The result of the ETL jobs is stored in the data lake layer stored in the regulatory reporting S3 bucket as Hudi tables that are catalogued in the AWS Glue Data Catalog and can be consumed by multiple AWS services. The bucket is encrypted using AWS Key Management Service (AWS KMS).
  4. Athena is used to run exploratory queries on the data lake.
  5. To demonstrate the cross-account consumption pattern, we created an Amazon Redshift cluster on it, created external tables from the Data Catalog, and used Redshift Spectrum to query the data. To enable cross-account connectivity between the subnet group of the Data Catalog of the regulatory reporting account and the subnet group of the Amazon Redshift cluster on the local data warehouse account, we had to enable VPC peering. To accelerate and optimize the implementation of these configurations during the Build Lab, we received support from an AWS networking subject matter expert, who ran a valuable session, during which the BMW team understood the networking details of the architecture.
  6. For data consumption, the BMW team implemented an AWS Glue Python shell job that connected to Amazon Redshift or Athena using a JDBC connection, ran a query, and stored the results in the reporting bucket as a CSV file, which would later be accessible by the end-users.
  7. End-users can also directly connect to both Athena and Amazon Redshift using a JDBC connection.
  8. We decided to orchestrate the AWS Glue ETL jobs using AWS Glue Workflows. We used the resulting workflow for the end-of-lab demo.

With that, we completed all the goals we had set up and concluded the 4-day Build Lab.


In this post, we walked you through the journey the BMW Financial Services team took with the AWS Data Lab team to participate in a Design Lab to identify a best-fit architecture for their use cases, and the subsequent Build Lab to implement prototypes for regulatory reporting in one of the European BMW market.

To learn more about how AWS Data Lab can help you turn your ideas into solutions, visit AWS Data Lab.

Special thanks to everyone who contributed to the success of the Design and Build Lab: Lionel Mbenda, Mario Robert Tutunea, Marius Abalarus, Maria Dejoie.

About the authors

Martin Zoellner is an IT Specialist at BMW Group. His role in the project is Subject Matter Expert for DevOps and ETL/SW Architecture.

Thomas Ehrlich is the functional maintenance manager of Regulatory Reporting application in one of the European BMW market.

Veronika Bogusch is an IT Specialist at BMW. She initiated the rebuild of the Financial Services Batch Integration Layer via the Cloud Data Hub. The ingested data assets are the base for the Regulatory Reporting use case described in this article.

George Komninos is a solutions architect for the Amazon Web Services (AWS) Data Lab. He helps customers convert their ideas to a production-ready data product. Before AWS, he spent three years at Alexa Information domain as a data engineer. Outside of work, George is a football fan and supports the greatest team in the world, Olympiacos Piraeus.

Rahul Shaurya is a Senior Big Data Architect with AWS Professional Services. He helps and works closely with customers building data platforms and analytical applications on AWS. Outside of work, Rahul loves taking long walks with his dog Barney.

How Epos Now modernized their data platform by building an end-to-end data lake with the AWS Data Lab

Post Syndicated from Debadatta Mohapatra original https://aws.amazon.com/blogs/big-data/how-epos-now-modernized-their-data-platform-by-building-an-end-to-end-data-lake-with-the-aws-data-lab/

Epos Now provides point of sale and payment solutions to over 40,000 hospitality and retailers across 71 countries. Their mission is to help businesses of all sizes reach their full potential through the power of cloud technology, with solutions that are affordable, efficient, and accessible. Their solutions allow businesses to leverage actionable insights, manage their business from anywhere, and reach customers both in-store and online.

Epos Now currently provides real-time and near-real-time reports and dashboards to their merchants on top of their operational database (Microsoft SQL Server). With a growing customer base and new data needs, the team started to see some issues in the current platform.

First, they observed performance degradation for serving the reporting requirements from the same OLTP database with the current data model. A few metrics that needed to be delivered in real time (seconds after a transaction was complete) and a few metrics that needed to be reflected in the dashboard in near-real-time (minutes) took several attempts to load in the dashboard.

This started to cause operational issues for their merchants. The end consumers of reports couldn’t access the dashboard in a timely manner.

Cost and scalability also became a major problem because one single database instance was trying to serve many different use cases.

Epos Now needed a strategic solution to address these issues. Additionally, they didn’t have a dedicated data platform for doing machine learning and advanced analytics use cases, so they decided on two parallel strategies to resolve their data problems and better serve merchants:

  • The first was to rearchitect the near-real-time reporting feature by moving it to a dedicated Amazon Aurora PostgreSQL-Compatible Edition database, with a specific reporting data model to serve to end consumers. This will improve performance, uptime, and cost.
  • The second was to build out a new data platform for reporting, dashboards, and advanced analytics. This will enable use cases for internal data analysts and data scientists to experiment and create multiple data products, ultimately exposing these insights to end customers.

In this post, we discuss how Epos Now designed the overall solution with support from the AWS Data Lab. Having developed a strong strategic relationship with AWS over the last 3 years, Epos Now opted to take advantage of the AWS Data lab program to speed up the process of building a reliable, performant, and cost-effective data platform. The AWS Data Lab program offers accelerated, joint-engineering engagements between customers and AWS technical resources to create tangible deliverables that accelerate data and analytics modernization initiatives.

Working with an AWS Data Lab Architect, Epos Now commenced weekly cadence calls to come up with a high-level architecture. After the objective, success criteria, and stretch goals were clearly defined, the final step was to draft a detailed task list for the upcoming 3-day build phase.

Overview of solution

As part of the 3-day build exercise, Epos Now built the following solution with the ongoing support of their AWS Data Lab Architect.

Epos Now Arch Image

The platform consists of an end-to-end data pipeline with three main components:

  • Data lake – As a central source of truth
  • Data warehouse – For analytics and reporting needs
  • Fast access layer – To serve near-real-time reports to merchants

We chose three different storage solutions:

  • Amazon Simple Storage Service (Amazon S3) for raw data landing and a curated data layer to build the foundation of the data lake
  • Amazon Redshift to create a federated data warehouse with conformed dimensions and star schemas for consumption by Microsoft Power BI, running on AWS
  • Aurora PostgreSQL to store all the data for near-real-time reporting as a fast access layer

In the following sections, we go into each component and supporting services in more detail.

Data lake

The first component of the data pipeline involved ingesting the data from an Amazon Managed Streaming for Apache Kafka (Amazon MSK) topic using Amazon MSK Connect to land the data into an S3 bucket (landing zone). The Epos Now team used the Confluent Amazon S3 sink connector to sink the data to Amazon S3. To make the sink process more resilient, Epos Now added the required configuration for dead-letter queues to redirect the bad messages to another topic. The following code is a sample configuration for a dead-letter queue in Amazon MSK Connect:

Because Epos Now was ingesting from multiple data sources, they used Airbyte to transfer the data to a landing zone in batches. A subsequent AWS Glue job reads the data from the landing bucket , performs data transformation, and moves the data to a curated zone of Amazon S3 in optimal format and layout. This curated layer then became the source of truth for all other use cases. Then Epos Now used an AWS Glue crawler to update the AWS Glue Data Catalog. This was augmented by the use of Amazon Athena for doing data analysis. To optimize for cost, Epos Now defined an optimal data retention policy on different layers of the data lake to save money as well as keep the dataset relevant.

Data warehouse

After the data lake foundation was established, Epos Now used a subsequent AWS Glue job to load the data from the S3 curated layer to Amazon Redshift. We used Amazon Redshift to make the data queryable in both Amazon Redshift (internal tables) and Amazon Redshift Spectrum. The team then used dbt as an extract, load, and transform (ELT) engine to create the target data model and store it in target tables and views for internal business intelligence reporting. The Epos Now team wanted to use their SQL knowledge to do all ELT operations in Amazon Redshift, so they chose dbt to perform all the joins, aggregations, and other transformations after the data was loaded into the staging tables in Amazon Redshift. Epos Now is currently using Power BI for reporting, which was migrated to the AWS Cloud and connected to Amazon Redshift clusters running inside Epos Now’s VPC.

Fast access layer

To build the fast access layer to deliver the metrics to Epos Now’s retail and hospitality merchants in near-real time, we decided to create a separate pipeline. This required developing a microservice running a Kafka consumer job to subscribe to the same Kafka topic in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. The microservice received the messages, conducted the transformations, and wrote the data to a target data model hosted on Aurora PostgreSQL. This data was delivered to the UI layer through an API also hosted on Amazon EKS, exposed through Amazon API Gateway.


The Epos Now team is currently building both the fast access layer and a centralized lakehouse architecture-based data platform on Amazon S3 and Amazon Redshift for advanced analytics use cases. The new data platform is best positioned to address scalability issues and support new use cases. The Epos Now team has also started offloading some of the real-time reporting requirements to the new target data model hosted in Aurora. The team has a clear strategy around the choice of different storage solutions for the right access patterns: Amazon S3 stores all the raw data, and Aurora hosts all the metrics to serve real-time and near-real-time reporting requirements. The Epos Now team will also enhance the overall solution by applying data retention policies in different layers of the data platform. This will address the platform cost without losing any historical datasets. The data model and structure (data partitioning, columnar file format) we designed greatly improved query performance and overall platform stability.


Epos Now revolutionized their data analytics capabilities, taking advantage of the breadth and depth of the AWS Cloud. They’re now able to serve insights to internal business users, and scale their data platform in a reliable, performant, and cost-effective manner.

The AWS Data Lab engagement enabled Epos Now to move from idea to proof of concept in 3 days using several previously unfamiliar AWS analytics services, including AWS Glue, Amazon MSK, Amazon Redshift, and Amazon API Gateway.

Epos Now is currently in the process of implementing the full data lake architecture, with a rollout to customers planned for late 2022. Once live, they will deliver on their strategic goal to provide real-time transactional data and put insights directly in the hands of their merchants.

About the Authors

Jason Downing is VP of Data and Insights at Epos Now. He is responsible for the Epos Now data platform and product direction. He specializes in product management across a range of industries, including POS systems, mobile money, payments, and eWallets.

Debadatta Mohapatra is an AWS Data Lab Architect. He has extensive experience across big data, data science, and IoT, across consulting and industrials. He is an advocate of cloud-native data platforms and the value they can drive for customers across industries.

A pathway to the cloud: Analysis of the Reserve Bank of New Zealand’s Guidance on Cyber Resilience

Post Syndicated from Julian Busic original https://aws.amazon.com/blogs/security/a-pathway-to-the-cloud-analysis-of-the-reserve-bank-of-new-zealands-guidance-on-cyber-resilience/

The Reserve Bank of New Zealand’s (RBNZ’s) Guidance on Cyber Resilience (referred to as “Guidance” in this post) acknowledges the benefits of RBNZ-regulated financial services companies in New Zealand (NZ) moving to the cloud, as long as this transition is managed prudently—in other words, as long as entities understand the risks involved and manage them appropriately. In this blog post, I analyze the RBNZ’s thinking as it developed the Guidance, and how the Guidance creates opportunities for NZ financial services customers to accelerate migration of workloads—including critical systems—to the Amazon Web Services (AWS) Cloud.

On page 14 of its Guidance, the RBNZ writes that “[i]f used prudently, third-party services may reduce an entity’s cyber risk, especially for those entities that lack cyber expertise.” This open regulatory stance towards the cloud enables our NZ financial services customers to consider a cloud first strategy for both new and existing systems, including critical workloads. Customers must, however, manage the transition to the cloud prudently, working closely with both their cloud service provider and their regulators.

This blog post is aimed at boards, management, and technology decision-makers, for whom understanding regulatory thinking is a useful input when developing an enterprise cloud strategy.

Operational technology staff and risk practitioners seeking detailed guidance on how AWS helps you align with the RBNZ’s Guidance can download our New Zealand Financial Services whitepaper from our public website and the AWS Reserve Bank of New Zealand Guidance on Cyber Resilience (RBNZ-GCR) Workbook from AWS Artifact, a self-service portal for you to access AWS compliance reports.

Overview and applicability

The RBNZ’s Guidance sets out the RBNZ’s expectations for management of cyber resilience. It’s aimed at all registered banks, licensed non-bank deposit takers, licensed insurers, and designated financial market infrastructures that are regulated by the RBNZ. The Guidance makes a series of non-binding recommendations across four domains—Governance, Capability Building, Information Sharing, and Third-Party Management.

Each section of the Guidance has a short preamble, summarizing the RBNZ’s expectations for effective risk management in each domain and providing insights into why the RBNZ is making specific recommendations.

The Guidance can be tailored to an entity’s individual needs, technology choices, and risk appetite. Boards, management, and technology decision-makers should familiarize themselves with the RBNZ’s Guidance, ascertain how closely their own organization aligns to it, and work to remediate any identified gaps.

Why non-binding guidance and not an enforceable standard?

The RBNZ gives several reasons (see RBNZ Summary of submissions, paragraphs 9-16) for choosing to publish non-binding recommendations rather than legally binding requirements. The RBNZ declares an intent to monitor adoption of its recommendations by industry, and indicates that future policy settings might include developing legally binding standards for cyber resilience. In this respect, the RBNZ’s approach is similar to that of the Australian Prudential Regulation Authority (APRA), which first issued non-binding guidance on management of IT security risk in 2013, before moving to a legally binding standard in 2019.

The RBNZ gives the following reasons for choosing guidance over a standard:

  • The RBNZ’s policy stance of being moderately active in respect to cyber resilience
  • A previous light-touch approach regarding cyber resilience
  • Providing sufficient time for industry to adjust to new policy settings, given the wide range of maturity within financial services organizations in New Zealand
  • The gap between New Zealand’s and other jurisdictions’ cyber readiness
  • The RBNZ’s current ability to effectively monitor and ensure compliance

The RBNZ indicates that it will “work together with the industry to operationalise the finalised Guidance” (RBNZ Summary of submissions, paragraph 10) and that it is “looking to strengthen [its] cyber resilience expertise in [its] financial stability function” although this will “take time to achieve” (RBNZ Summary of submissions, paragraph 9).

RBNZ-regulated entities should already be self-assessing against the Guidance and working to address gaps as a matter of priority. This is not just because the Guidance could become a legally binding standard in the next 3–5 years, but because the RBNZ has created a practical and flexible framework for the management of cyber risk, which will greatly enhance the NZ financial sector’s resilience to cyber incidents. Non–RBNZ-regulated entities looking for a benchmark to measure themselves against can also use the RBNZ’s Guidance to assess and improve the effectiveness of their own control environments.

Comparing rules-based frameworks and principles-based frameworks

There are two main ways that regulators communicate their risk management expectations to their regulated entities. These are a rules-based approach (sometimes called a compliance-based approach) and a principles-based approach. The RBNZ’s Guidance takes a principles-based approach towards the management of cyber risk.

With a rules-based approach, the regulator takes responsibility for identifying risks and lays out explicit and granular controls that regulated entities are required to implement. A rules-based approach is highly prescriptive, meaning that regulated entities can adopt a checklist approach in meeting their regulators’ requirements. This approach, although it gives certainty to regulated entities regarding the controls they are expected to adopt, can have disadvantages for regulators:

  • Creating and maintaining detailed technical rules can be challenging, given the pace at which technology and the threat environment evolve.
  • Regulators have a diverse population of regulated entities, so a rules-based approach can be inflexible or have blind spots.
  • A rules-based approach doesn’t encourage entities to actively identify and manage their own unique set of risks.

By contrast, a principles-based approach describes a set of desired regulatory or risk-management outcomes, but it isn’t prescriptive in how regulated entities achieve these goals. Regulators act in a vendor- and technology-neutral manner, and regulated entities are expected to interpret regulatory requirements or guidance in the context of their individual business models, technology choices, threat environments, and risk appetites.

Under a principles-based approach, an entity must be able to demonstrate to its regulators’ satisfaction that it both understands the current and emerging risks it faces, and that it is managing these risks appropriately. For example, the principle that entities “[…] should develop and maintain a programme for continuing cyber resilience training for staff at all levels” (Guidance, section A3.3 page 6) gives clear direction, but leaves it up to the entity to decide on the approach to take, and how the entity will demonstrate to the RBNZ that this principle is being met.

A principles-based approach avoids the issues with the rules-based approach that I outlined previously—this approach is significantly longer-lived than a rules-based approach, it moves responsibility for effective risk identification and management from the regulator to the entity (which better understands its own risk profile and appetite), and the framework can be applied to a regulated entity population that varies in size, nature, and complexity.

Freedom to innovate under a principles-based approach

The RBNZ says that its Guidance should be employed in a manner “[…] proportionate to the size, structure and operational environment of an entity, as well as the nature, scope, complexity and risk profile of its products and services” (Guidance, page 2).

You can therefore meet the RBNZ’s Guidance in many different ways, as long as you can demonstrate to the RBNZ that your organization understands the risks it is facing and is managing them appropriately. A principles-based approach creates opportunities for innovation, because there are many different ways to meet a set of regulatory principles.

If you are an NZ financial services customer who also operates in Australia, you might note that the RBNZ’s approach aligns to that of the principal financial services regulator in Australia—the Australian Prudential Regulation Authority (APRA). APRA also takes a principles-based approach to its prudential framework, “avoiding excessive prescription where possible to allow for the diversity of practice according to the size, business activity, and sophistication of the institutions being supervised” (APRA’s objectives, Chapter 1).

A cautious green light to the cloud for New Zealand financial services

“If used prudently, third-party services may reduce an entity’s cyber risk, especially for those entities that lack cyber expertise” (Guidance, page 14).

In my view, this statement represents a (cautious) green light for financial services customers in NZ who wish to migrate systems to the AWS Cloud, although as the RBNZ makes clear, you “should be fully aware of the cyber risk associated with third parties and act appropriately to mitigate that risk” (Guidance, page 14). The RBNZ also requests that for critical functions, entities “[…] should inform the Reserve Bank about their outsourcing of critical functions to cloud service providers early in their decision-making process” (Guidance, Section D8.1, page 17).

The RBNZ defines a critical function as “[a]ny activity, function, process, or service, the loss of which (for even a short period of time) would materially affect the continued operation of an entity, the market it serves and the broader financial system, and/or materially affect the data integrity, reputation of an entity and confidence in the financial system” (Guidance, page 19).

Although the RBNZ doesn’t elaborate further on why it requests early notification about outsourcing of critical functions to the cloud, it’s likely that early engagement is requested so that the RBNZ has the opportunity to provide early feedback on any areas of potential concern, before the initiative is significantly progressed and a large amount of resources are committed.

Migration of higher-risk workloads to the cloud will naturally attract higher levels of regulatory scrutiny, but this doesn’t change the RBNZ’s open regulatory stance on cloud security. This stance is further emphasized by the RBNZ’s comment that “If managed prudently, migrating to the cloud presents a number of benefits including geographically dispersed infrastructures, agility to scale more quickly, improved automation, sufficient redundancy, and reduced initial investment costs for individual financial institutions” (Guidance, page 15).

Building innovative, secure, and highly resilient solutions on AWS, and using the high levels of visibility that you have into your environments that are running on AWS, can help you demonstrate to your regulators how you are identifying and managing your cyber resilience risks in line with the RBNZ’s Guidance.

A note on regulatory myths

In conversations with customers, I occasionally encounter “regulatory myths,” such as “certain types of workloads are prohibited in the cloud,” or “my regulator won’t allow me to use multi-region architectures.”

To date, the RBNZ has not made specific recommendations or set specific requirements regarding technology solutions. This includes, but is not limited to, choice of vendors or technology platforms, prescription of particular architectures, or the types of workload that may or may not be migrated to the cloud. Remember, the RBNZ’s Guidance is a principles-based framework, and is vendor-, technology-, and solution-neutral.

We have many examples of financial services companies all over the world successfully running critical workloads in the AWS Cloud, but regulatory myths and misunderstandings can inhibit our customers’ ability to “think big” when developing their cloud strategies. If you believe that you must implement specific technical patterns to meet regulatory expectations, we encourage you to contact the RBNZ to discuss any aspects of the Guidance that require clarification. We also encourage you to contact your AWS account team, who can arrange support from internal AWS risk and regulatory specialists, particularly if critical systems are proposed for migration to AWS.


The RBNZ’s Guidance on Cyber Resilience is an important first step for financial services regulation of cybersecurity in NZ. The Guidance can be considered cloud friendly because it acknowledges that prudent use of third parties (such as AWS) can reduce cyber risk, especially for entities that lack cyber expertise, and outlines several benefits of the cloud over traditional on-premises infrastructure, including resilience and redundancy, ability to scale, and reduced initial investment costs.

The principles-based nature of the RBNZ’s Guidance creates opportunities for you to develop innovative solutions in the AWS Cloud, because there are many different ways to meet the principles contained in the RBNZ’s Guidance. The key consideration is that you demonstrate to your regulators that you both understand the cyber risks you face in moving to the AWS Cloud, and manage them appropriately.

The launch of the AWS Asia Pacific (Auckland) Region in 2024, our wide range of products and services, and the visibility that you have into the AWS control environment (through AWS Artifact) and your own environment (through services like Amazon GuardDuty and AWS Security Hub) can all help you demonstrate to the RBNZ that you are managing cyber risk in accordance with the RBNZ’s expectations.

Next steps

Boards, executives, and technology decision-makers should familiarize themselves with the RBNZ’s Guidance, and if they aren’t already doing so, conduct a self-assessment and initiate a body of work to address identified gaps.

In view of the RBNZ’s cautious green light for prudent migration to the cloud—including for critical systems—NZ financial services customers should review their existing cloud strategies and identify areas where they can both broaden and accelerate their cloud journeys. The AWS Cloud Adoption Framework (AWS CAF) offers guidance and best practices to help organizations develop an efficient and effective plan for their cloud adoption journey. The AWS C-suite Guide to Shared Responsibility for Cloud Security and Data Safe Cloud eBook inform boards and senior management about both the benefits and risks of operating in the cloud.

Operational technology staff and risk practitioners can download our New Zealand Financial Service whitepaper from our public website and the AWS Reserve Bank of New Zealand Guidance on Cyber Resilience (RBNZ-GCR) Workbook from AWS Artifact. The RBNZ-GCR is particularly useful for operational IT staff and risk practitioners because it provides prescriptive guidance on which controls to implement on your side of the shared responsibility model and which AWS controls you inherit from the service.

Finally, contact your AWS representative to discuss how the AWS Partner Network, AWS solution architects, AWS Professional Services teams, and AWS Training and Certification can assist with your cloud adoption journey. If you don’t have an AWS representative, contact us at https://aws.amazon.com/contact-us.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.


Julian Busic

Julian is a Security Solutions Architect with a focus on regulatory engagement. He works with our customers, their regulators, and AWS teams to help customers raise the bar on secure cloud adoption and usage. Julian has over 15 years of experience working in risk and technology across the financial services industry in Australia and New Zealand.

OSPAR 2022 report now available with 142 services in scope

Post Syndicated from Joseph Goh original https://aws.amazon.com/blogs/security/ospar-2022-report-now-available-with-142-services-in-scope/

We’re excited to announce the completion of our annual Outsourced Service Provider’s Audit Report (OSPAR) audit cycle on July 1, 2022. The 2022 OSPAR certification cycle includes the addition of 15 new services in scope, bringing the total number of services in scope to 142 in the AWS Asia Pacific (Singapore) Region.

Newly added services in scope include the following:

Successful completion of the OSPAR assessment demonstrates that AWS has a system of controls in place that meet the Association of Banks in Singapore (ABS) Guidelines on Control Objectives and Procedures for Outsourced Service Providers. Our alignment with the ABS guidelines demonstrates our commitment to meet the security expectations for cloud service providers set by the financial services industry in Singapore. Customers can use the OSPAR assessment to conduct due diligence, and to help reduce the effort and costs required for compliance. An independent third-party auditor, selected from the ABS list of approved auditors, performs the OSPAR assessment.

You can download the latest OSPAR report from AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact. The list of services in scope for OSPAR is available in the report, and is also available on the AWS Services in Scope by Compliance Program webpage.

As always, we’re committed to bringing new services into the scope of our OSPAR program based on your architectural and regulatory needs. If you have questions about the OSPAR report, contact your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Joseph Goh

Joseph Goh

Joseph is the APJ ASEAN Lead at AWS based in Singapore. He leads security audits, certifications and compliance programs across the Asia Pacific region. Joseph is passionate about delivering programs that build trust with customers and provide them assurance on cloud security.

New AWS whitepaper: AWS User Guide to Financial Services Regulations and Guidelines in New Zealand

Post Syndicated from Julian Busic original https://aws.amazon.com/blogs/security/new-aws-whitepaper-aws-user-guide-to-financial-services-regulations-and-guidelines-in-new-zealand/

Amazon Web Services (AWS) has released a new whitepaper to help financial services customers in New Zealand accelerate their use of the AWS Cloud.

The new AWS User Guide to Financial Services Regulations and Guidelines in New Zealand—along with the existing AWS Workbook for the RBNZ’s Guidance on Cyber Resilience—continues our efforts to help AWS customers navigate the regulatory expectations of the Reserve Bank of New Zealand (RBNZ) in a shared responsibility environment.

This whitepaper is intended for RBNZ-regulated institutions that are looking to run material workloads in the AWS Cloud, and is particularly useful for leadership, security, risk, and compliance teams that need to understand RBNZ requirements and guidance.

The whitepaper summarizes RBNZ requirements and guidance related to outsourcing, cyber resilience, and the cloud. It also gives RBNZ-regulated institutions information they can use to commence their due diligence and assess how to implement the appropriate programs for their use of AWS cloud services.

This document joins existing guides for other jurisdictions in the Asia Pacific region, such as Australia, India, Singapore, and Hong Kong. As the regulatory environment continues to evolve, we’ll provide further updates on the AWS Security Blog and the AWS Compliance page. You can find more information on cloud-related regulatory compliance at the AWS Compliance Center. You can also reach out to your AWS account manager for help finding the resources you need.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.


Julian Busic

Julian is a Security Solutions Architect with a focus on regulatory engagement. He works with our customers, their regulators, and AWS teams to help customers raise the bar on secure cloud adoption and usage. Julian has over 15 years of experience working in risk and technology across the financial services industry in Australia and New Zealand.

Monitoring and alerting break-glass access in an AWS Organization

Post Syndicated from Haresh Nandwani original https://aws.amazon.com/blogs/architecture/monitoring-and-alerting-break-glass-access-in-an-aws-organization/

Organizations building enterprise-scale systems require the setup of a secure and governed landing zone to deploy and operate their systems. A landing zone is a starting point from which your organization can quickly launch and deploy workloads and applications with confidence in your security and infrastructure environment as described in What is a landing zone?. Nationwide Building Society (Nationwide) is the world’s largest building society. It is owned by its 16 million members and exists to serve their needs. The Society is one of the UK’s largest providers for mortgages, savings and current accounts, as well as being a major provider of ISAs, credit cards, personal loans, insurance, and investments.

For one of its business initiatives, Nationwide utilizes AWS Control Tower to build and operate their landing zone which provides a well-established pattern to set up and govern a secure, multi-account AWS environment. Nationwide operates in a highly regulated industry and our governance assurance requires adequate control of any privileged access to production line-of-business data or to resources which have access to them. We chose for this specific business initiative to deploy our landing zone using AWS Organizations, to benefit from ongoing account management and governance as aligned with AWS implementation best practices. We also utilized AWS Single Sign-On (AWS SSO) to create our workforce identities in AWS once and manage access centrally across our AWS Organization. In this blog, we describe the integrations required across AWS Control Tower and AWS SSO to implement a break-glass mechanism that makes access reporting publishable to system operators as well as to internal audit systems and processes. We will outline how we used AWS SSO for our setup as well as the three architecture options we considered, and why we went with the chosen solution.

Sourcing AWS SSO access data for near real-time monitoring

In our setup, we have multiple AWS Accounts and multiple trails on each of these accounts. Users will regularly navigate across multiple accounts as they operate our infrastructure, and their journeys are marked across these multiple trails. Typically, AWS CloudTrail would be our chosen resource to clearly and unambiguously identify account or data access.  The key challenge in this scenario was to design an efficient and cost-effective solution to scan these trails to help identify and report on break-glass user access to account and production data. To address this challenge, we developed the following two architecture design options.

Option 1: A decentralized approach that uses AWS CloudFormation StackSets, Amazon EventBridge and AWS Lambda

Our solution entailed a decentralized approach by deploying a CloudFormation StackSet to create, update, or delete stacks across multiple accounts and AWS Regions with a single operation. The Stackset created Amazon EventBridge rules and target AWS Lambda functions. These functions post to EventBridge in our audit account. Our audit account has a set of Lambda functions running off EventBridge to initiate specific events, format the event message and post to Slack, our centralized communication platform for this implementation. Figure 1 depicts the overall architecture for this option.

De-centralized logging using Amazon EventBridge and AWS Lambda

Figure 1. De-centralized logging using Amazon EventBridge and AWS Lambda

Option 2: Use an organization trail in the Organization Management account

This option uses the centralized organization trail in the Organization Management account to source audit data. Details of how to create an organization trail can be found in the AWS CloudTrail User Guide. CloudTrail was configured to send log events to CloudWatch Logs. These events are then sent via Lambda functions to Slack using webhooks. We used a public terraform module in this GitHub repository to build this Lambda Slack integration. Figure 2 depicts the overall architecture for this option.

Centralized logging pattern using Amazon CloudWatch

Figure 2. Centralized logging pattern using Amazon CloudWatch

This was our preferred option and is the one we finally implemented.

We also evaluated a third option which was to use centralized logging and auditing feature enabled by Control Tower. Users authenticate and federate to target accounts from a central location so it seemed possible to source this info from the centralized logs. These log events arrive as .gz compressed json objects, which meant having to expand these archives repeatedly for inspection. We therefore decided against this option.

A centralized, economic, extensible solution to alert of SSO break-glass

Our requirement was to identify break-glass access across any of the access mechanisms supported by AWS, including CLI and User Portal access. To ensure we have comprehensive coverage across all access mechanisms, we identified all the events initiated for each access mechanism:

  1. User Portal/AWS Console access events
    • Authenticate
    • ListApplications
    • ListApplicationProfiles
    • Federate – this event contains the role that the user is federating into
  2. CLI access events
    • CreateToken
    • ListAccounts
    • ListAccountRoles
    • GetRoleCredentials – this event contains the role that the user is federating into

EventBridge is able to initiate actions after events only when the event is trying to perform changes (when the “readOnly” attribute on the event record body equals “false”).

The AWS support team was aware of this attribute and recommended that we, change the data flow we were using to one able to initiate actions after any kind of event, regardless of the value on its readOnly attribute. The solution in our case was to send the CloudTrail logs to CloudWatch Logs. This then and initiates the Lambda function through a filter subscription that detects the desired event names on the log content.

The filter used is as follows:

{($.eventSource = sso.amazonaws.com) && ($.eventName = Federate||$.eventName = GetRoleCredentials)}

Due to the query size in the CloudWatch Log queries we had to remove the subscription filters and do the parsing of the content of the log lines inside the lambda function. In order to determine what accounts would initiate the notifications, we sent the list of accounts and roles to it as an environment variable at runtime.

Considerations with cross-account SSO access

With direct federation users get an access token. This is most obvious in AWS single sign on at the chiclet page as “Command line or programmatic access”. SSO tokens have a limited lifetime (we use the default 1-hour). A user does not have to get a new token to access a target resource until the one they are using is expired. This means that a user may repeatedly access a target account using the same token during its lifetime. Although the token is made available at the chiclet page, the GetRoleCredentials event does not occur until it is used to authenticate an API call to the target AWS account.


In this blog, we discussed how AWS Control Tower and AWS Single Sign-on enabled Nationwide to build and govern a secure, multi-account AWS environment for one of their business initiatives and centralize access management across our implementation. The integration was important for us to accurately and comprehensively identify and audit break-glass access for our implementation. As a result, we were able to satisfy our security and compliance audit requirements for privileged access to our AWS accounts.

AWS User Guide to Financial Services Regulations and Guidelines in Switzerland and FINMA workbooks publications

Post Syndicated from Margo Cronin original https://aws.amazon.com/blogs/security/aws-user-guide-to-financial-services-regulations-and-guidelines-in-switzerland-and-finma/

AWS is pleased to announce the publication of the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland whitepaper and workbooks.

This guide refers to certain rules applicable to financial institutions in Switzerland, including banks, insurance companies, stock exchanges, securities dealers, portfolio managers, trustees and other financial entities which are overseen (directly or indirectly) by the Swiss Financial Market Supervisory Authority (FINMA).

Amongst other topics, this guide covers requirements created by the following regulations and publications of interest to Swiss financial institutions:

  • Federal Laws – including Article 47 of the Swiss Banking Act (BA). Banks and Savings Banks are overseen by FINMA and governed by the BA (Bundesgesetz über die Banken und Sparkassen, Bankengesetz, BankG). Article 47 BA holds relevance in the context of outsourcing.
  • Response on Cloud Guidelines for Swiss Financial institutions produced by the Swiss Banking Union, Schweizerische Bankiervereinigung SBVg.
  • Controls outlined by FINMA, Switzerland’s independent regulator of financial markets, that may be applicable to Swiss banks and insurers in the context of outsourcing arrangements to the cloud.

In combination with the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland whitepaper, customers can use the detailed AWS FINMA workbooks and ISAE 3000 report available from AWS Artifact.

The five core FINMA circulars are intended to assist Swiss-regulated financial institutions in understanding approaches to due diligence, third-party management, and key technical and organizational controls that should be implemented in cloud outsourcing arrangements, particularly for material workloads. The AWS FINMA workbooks and ISAE 3000 report scope covers, in detail, requirements of the following FINMA circulars:

  • 2018/03 Outsourcing – banks and insurers (04.11.2020)
  • 2008/21 Operational Risks – Banks – Principle 4 Technology Infrastructure (31.10.2019)
  • 2008/21 Operational Risks – Banks – Appendix 3 Handling of electronic Client Identifying Data (31.10.2019)
  • 2013/03 Auditing – Information Technology (04.11.2020)
  • 2008/10 Self-regulation as a minimum standard – Minimum Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

Customers can use the detailed FINMA workbooks, which include detailed control mappings for each FINMA control, covering both the AWS control activities and the Customer User Entity Controls. Where applicable, under the AWS Shared Responsibility Model, these workbooks provide industry standard practices, incorporating AWS Well-Architected, to assist Swiss customers in their own preparation for FINMA circular alignment.

This whitepaper follows the issuance of the second Swiss Financial Market Supervisory Authority (FINMA) ISAE 3000 Type 2 attestation report. The latest report covers the period from October 1, 2020 to September 30, 2021, with a total of 141 AWS services and 23 global AWS Regions included in the scope. Customers can download the report from AWS Artifact. A full list of certified services and Regions is presented within the published FINMA report.

As always, AWS is committed to bringing new services into the scope of our FINMA program in the future based on customers’ architectural and regulatory needs. Please reach out to your AWS account team if you have any questions or feedback. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.


Margo Cronin

Margo is a Principal Security Specialist at Amazon Web Services based in Zurich, Switzerland. She spends her days working with customers, from startups to the largest of enterprises, helping them build new capabilities and accelerating their cloud journey. She has a strong focus on security, helping customers improve their security, risk, and compliance in the cloud.

Doing more with less: Moving from transactional to stateful batch processing

Post Syndicated from Tom Jin original https://aws.amazon.com/blogs/big-data/doing-more-with-less-moving-from-transactional-to-stateful-batch-processing/

Amazon processes hundreds of millions of financial transactions each day, including accounts receivable, accounts payable, royalties, amortizations, and remittances, from over a hundred different business entities. All of this data is sent to the eCommerce Financial Integration (eCFI) systems, where they are recorded in the subledger.

Ensuring complete financial reconciliation at this scale is critical to day-to-day accounting operations. With transaction volumes exhibiting double-digit percentage growth each year, we found that our legacy transactional-based financial reconciliation architecture proved too expensive to scale and lacked the right level of visibility for our operational needs.

In this post, we show you how we migrated to a batch processing system, built on AWS, that consumes time-bounded batches of events. This not only reduced costs by almost 90%, but also improved visibility into our end-to-end processing flow. The code used for this post is available on GitHub.

Legacy architecture

Our legacy architecture primarily utilized Amazon Elastic Compute Cloud (Amazon EC2) to group related financial events into stateful artifacts. However, a stateful artifact could refer to any persistent artifact, such as a database entry or an Amazon Simple Storage Service (Amazon S3) object.

We found this approach resulted in deficiencies in the following areas:

  • Cost – Individually storing hundreds of millions of financial events per day in Amazon S3 resulted in high I/O and Amazon EC2 compute resource costs.
  • Data completeness – Different events flowed through the system at different speeds. For instance, while a small stateful artifact for a single customer order could be recorded in a couple of seconds, the stateful artifact for a bulk shipment containing a million lines might require several hours to update fully. This made it difficult to know whether all the data had been processed for a given time range.
  • Complex retry mechanisms – Financial events were passed between legacy systems using individual network calls, wrapped in a backoff retry strategy. Still, network timeouts, throttling, or traffic spikes could result in some events erroring out. This required us to build a separate service to sideline, manage, and retry problematic events at a later date.
  • Scalability – Bottlenecks occurred when different events competed to update the same stateful artifact. This resulted in excessive retries or redundant updates, making it less cost-effective as the system grew.
  • Operational support – Using dedicated EC2 instances meant that we needed to take valuable development time to manage OS patching, handle host failures, and schedule deployments.

The following diagram illustrates our legacy architecture.

Transactional-based legacy architecture

Evolution is key

Our new architecture needed to address the deficiencies while preserving the core goal of our service: update stateful artifacts based on incoming financial events. In our case, a stateful artifact refers to a group of related financial transactions used for reconciliation. We considered the following as part of the evolution of our stack:

  • Stateless and stateful separation
  • Minimized end-to-end latency
  • Scalability

Stateless and stateful separation

In our transactional system, each ingested event results in an update to a stateful artifact. This became a problem when thousands of events came in all at once for the same stateful artifact.

However, by ingesting batches of data, we had the opportunity to create separate stateless and stateful processing components. The stateless component performs an initial reduce operation on the input batch to group together related events. This meant that the rest of our system could operate on these smaller stateless artifacts and perform fewer write operations (fewer operations means lower costs).

The stateful component would then join these stateless artifacts with existing stateful artifacts to produce an updated stateful artifact.

As an example, imagine an online retailer suddenly received thousands of purchases for a popular item. Instead of updating an item database entry thousands of times, we can first produce a single stateless artifact that summaries the latest purchases. The item entry can now be updated one time with the stateless artifact, reducing the update bottleneck. The following diagram illustrates this process.

Batch visualization

Minimized end-to-end latency

Unlike traditional extract, transform, and load (ETL) jobs, we didn’t want to perform daily or even hourly extracts. Our accountants need to be able to access the updated stateful artifacts within minutes of data arriving in our system. For instance, if they had manually sent a correction line, they wanted to be able to check within the same hour that their adjustment had the intended effect on the targeted stateful artifact instead of waiting until the next day. As such, we focused on parallelizing the incoming batches of data as much as possible by breaking down the individual tasks of the stateful component into subcomponents. Each subcomponent could run independently of each other, which allowed us to process multiple batches in an assembly line format.


Both the stateless and stateful components needed to respond to shifting traffic patterns and possible input batch backlogs. We also wanted to incorporate serverless compute to better respond to scale while reducing the overhead of maintaining an instance fleet.

This meant we couldn’t simply have a one-to-one mapping between the input batch and stateless artifact. Instead, we built flexibility into our service so the stateless component could automatically detect a backlog of input batches and group multiple input batches together in one job. Similar backlog management logic was applied to the stateful component. The following diagram illustrates this process.

Batch scalability

Current architecture

To meet our needs, we combined multiple AWS products:

  • AWS Step Functions – Orchestration of our stateless and stateful workflows
  • Amazon EMR – Apache Spark operations on our stateless and stateful artifacts
  • AWS Lambda – Stateful artifact indexing and orchestration backlog management
  • Amazon ElastiCache – Optimizing Amazon S3 request latency
  • Amazon S3 – Scalable storage of our stateless and stateful artifacts
  • Amazon DynamoDB – Stateless and stateful artifact index

The following diagram illustrates our current architecture.

Current architecture

The following diagram shows our stateless and stateful workflow.


The AWS CloudFormation template to render this architecture and corresponding Java code is available in the following GitHub repo.

Stateless workflow

We used an Apache Spark application on a long-running Amazon EMR cluster to simultaneously ingest input batch data and perform reduce operations to produce the stateless artifacts and a corresponding index file for the stateful processing to use.

We chose Amazon EMR for its proven highly available data-processing capability in a production setting and also its ability to horizontally scale when we see increased traffic loads. Most importantly, Amazon EMR had lower cost and better operational support when compared to a self-managed cluster.

Stateful workflow

Each stateful workflow performs operations to create or update millions of stateful artifacts using the stateless artifacts. Similar to the stateless workflows, all stateful artifacts are stored in Amazon S3 across a handful of Apache Spark part-files. This alone resulted in a huge cost reduction, because we significantly reduced the number of Amazon S3 writes (while using the same amount of overall storage). For instance, storing 10 million individual artifacts using the transactional legacy architecture would cost $50 in PUT requests alone, whereas 10 Apache Spark part-files would cost only $0.00005 in PUT requests (based on $0.005 per 1,000 requests).

However, we still needed a way to retrieve individual stateful artifacts, because any stateful artifact could be updated at any point in the future. To do this, we turned to DynamoDB. DynamoDB is a fully managed and scalable key-value and document database. It’s ideal for our access pattern because we wanted to index the location of each stateful artifact in the stateful output file using its unique identifier as a primary key. We used DynamoDB to index the location of each stateful artifact within the stateful output file. For instance, if our artifact represented orders, we would use the order ID (which has high cardinality) as the partition key, and store the file location, byte offset, and byte length of each order as separate attributes. By passing the byte-range in Amazon S3 GET requests, we can now fetch individual stateful artifacts as if they were stored independently. We were less concerned about optimizing the number of Amazon S3 GET requests because the GET requests are over 10 times cheaper than PUT requests.

Overall, this stateful logic was split across three serial subcomponents, which meant that three separate stateful workflows could be operating at any given time.


The following diagram illustrates our pre-fetcher subcomponent.

Prefetcher architecture

The pre-fetcher subcomponent uses the stateless index file to retrieve pre-existing stateful artifacts that should be updated. These might be previous shipments for the same customer order, or past inventory movements for the same warehouse. For this, we turn once again to Amazon EMR to perform this high-throughput fetch operation.

Each fetch required a DynamoDB lookup and an Amazon S3 GET partial byte-range request. Due to the large number of external calls, fetches were highly parallelized using a thread pool contained within an Apache Spark flatMap operation. Pre-fetched stateful artifacts were consolidated into an output file that was later used as input to the stateful processing engine.

Stateful processing engine

The following diagram illustrates the stateful processing engine.

Stateful processor architecture

The stateful processing engine subcomponent joins the pre-fetched stateful artifacts with the stateless artifacts to produce updated stateful artifacts after applying custom business logic. The updated stateful artifacts are written out across multiple Apache Spark part-files.

Because stateful artifacts could have been indexed at the same time that they were pre-fetched (also called in-flight updates), the stateful processor also joins recently processed Apache Spark part-files.

We again used Amazon EMR here to take advantage of the Apache Spark operations that are required to join the stateless and stateful artifacts.

State indexer

The following diagram illustrates the state indexer.

State Indexer architecture

This Lambda-based subcomponent records the location of each stateful artifact within the stateful part-file in DynamoDB. The state indexer also caches the stateful artifacts in an Amazon ElastiCache for Redis cluster to provide a performance boost in the Amazon S3 GET requests performed by the pre-fetcher.

However, even with a thread pool, a single Lambda function isn’t powerful enough to index millions of stateful artifacts within the 15-minute time limit. Instead, we employ a cluster of Lambda functions. The state indexer begins with a single coordinator Lambda function, which determines the number of worker functions that are needed. For instance, if 100 part-files are generated by the stateful processing engine, then the coordinator might assign five part-files for each of the 20 Lambda worker functions to work on. This method is highly scalable because we can dynamically assign more or fewer Lambda workers as required.

Each Lambda worker then performs the ElastiCache and DynamoDB writes for all the stateful artifacts within each assigned part-file in a multi-threaded manner. The coordinator function monitors the health of each Lambda worker and restarts workers as needed.

Distributed Lambda architecture


We used Step Functions to coordinate each of the stateless and stateful workflows, as shown in the following diagram.

Step Function Workflow

Every time a new workflow step ran, the step was recorded in a DynamoDB table via a Lambda function. This table not only maintained the order in which stateful batches should be run, but it also formed the basis of the backlog management system, which directed the stateless ingestion engine to group more or fewer input batches together depending on the backlog.

We chose Step Functions for its native integration with many AWS services (including triggering by an Amazon CloudWatch scheduled event rule and adding Amazon EMR steps) and its built-in support for backoff retries and complex state machine logic. For instance, we defined different backoff retry rates based on the type of error.


Our batch-based architecture helped us overcome the transactional processing limitations we originally set out to resolve:

  • Reduced cost – We have been able to scale to thousands of workflows and hundreds of million events per day using only three or four core nodes per EMR cluster. This reduced our Amazon EC2 usage by over 90% when compared with a similar transactional system. Additionally, writing out batches instead of individual transactions reduced the number of Amazon S3 PUT requests by over 99.8%.
  • Data completeness guarantees – Because each input batch is associated with a time interval, when a batch has finished processing, we know that all events in that time interval have been completed.
  • Simplified retry mechanisms – Batch processing means that failures occur at the batch level and can be retried directly through the workflow. Because there are far fewer batches than transactions, batch retries are much more manageable. For instance, in our service, a typical batch contains about two million entries. During a service outage, only a single batch needs to be retried, as opposed to two million individual entries in the legacy architecture.
  • High scalability – We’ve been impressed with how easy it is to scale our EMR clusters on the fly if we detect an increase in traffic. Using Amazon EMR instance fleets also helps us automatically choose the most cost-effective instances across different Availability Zones. We also like the performance achieved by our Lambda-based state indexer. This subcomponent not only dynamically scales with no human intervention, but has also been surprisingly cost-efficient. A large portion of our usage has fallen within the free tier.
  • Operational excellence – Replacing traditional hosts with serverless components such as Lambda allowed us to spend less time on compliance tickets and focus more on delivering features for our customers.

We are particularly excited about the investments we have made moving from a transactional-based system to a batch processing system, especially our shift from using Amazon EC2 to using serverless Lambda and big data Amazon EMR services. This experience demonstrates that even services originally built on AWS can still achieve cost reductions and improve performance by rethinking how AWS services are used.

Inspired by our progress, our team is moving to replace many other legacy services with serverless components. Likewise, we hope that other engineering teams can learn from our experience, continue to innovate, and do more with less.

Find the code used for this post in the following GitHub repository.

Special thanks to development team: Ryan Schwartz, Abhishek Sahay, Cecilia Cho, Godot Bian, Sam Lam, Jean-Christophe Libbrecht, and Nicholas Leong.

About the Authors

Tom Jin is a Senior Software Engineer for eCommerce Financial Integration (eCFI) at Amazon. His interests include building large-scale systems and applying machine learning to healthcare applications. He is based in Vancouver, Canada and is a fan of ocean conservation.

Karthik Odapally is a Senior Solutions Architect at AWS supporting our Gaming Customers. He loves presenting at external conferences like AWS Re:Invent, and helping customers learn about AWS. His passion outside of work is to bake cookies and bread for family and friends here in the PNW. In his spare time, he plays Legend of Zelda (Link’s Awakening) with his 4 yr old daughter.

Financial Crime Discovery using Amazon EKS and Graph Databases

Post Syndicated from Severin Gassauer-Fleissner original https://aws.amazon.com/blogs/architecture/financial-crime-discovery-using-amazon-eks-and-graph-databases/

Discovering and solving financial crimes has become a challenge due to an increasing amount of financial data. While storing transactional payment data in a structured table format is useful for searching, filtering, and calculations, it is not always an ideal way to represent transactional data. For example, determining if there is a suspicious financial relationship between entity A and entity B is difficult to visualize in a table. Using tables, we would have to do SQL joins for every possible transaction from entity A to every possible subsequent transaction. We would then have to iterate this process until we found a relationship to entity B. Moreover, certain queries are challenging to run on a relational database management system (RDBMS). For example, it can be quite time consuming to discover which account received a minimum amount of $10,000,000 from other accounts.

Graph databases such as Amazon Neptune can be helpful with performing queries, because they can traverse the data and perform calculations simultaneously. Graphs enable us to represent transactions and parties over a multi-connected network, and discover patterns and chains of connections. It is common to use them in anti-money laundering (AML) applications, as they can help find patterns of suspicious transactions.

We needed a solution that could scale and process millions of transactions, by effectively using high memory and CPU configurations to perform complex queries quickly. As part of our customer demonstration to show how graph databases can help discover financial crimes, we also sourced a large dataset on which to test the solution. We used a graph database, Amazon Elastic Kubernetes Service (EKS), and Amazon Neptune, to search for suspicious financial chains across large amounts of transactional data in minutes.

Overview of our conceptual financial crime discovery solution

Figure 1. Workflow for financial crime discovery

Figure 1. Workflow for financial crime discovery

We first needed to find a rules engine that could perform transaction inferencing and reasoning. It had to be able to process various rules on our data, be efficient, and able to scale. Next, we needed a straightforward way to ingest data into the solution. Once we had the data available, we needed to initiate a task to begin our inference job. Finally, we needed a location to store the result for further analysis and persistence, see Figure 1.

Using the AWS Cloud to scale up a graph database

We used multiple AWS services to create a fully automated end-to-end batch-based transaction process, shown in Figure 2. We used RDFox, which is an AWS Marketplace product, created by Oxford Semantic Technologies. RDFox is a high-performance in-memory graph database and semantic reasoner. To orchestrate RDFox, we used Amazon EKS Autoscaler to spin up a cluster to instantiate the RDFox container. Amazon EKS can spin up multiple containers for difference inference jobs and recycle the resources when the job finishes. We also used Amazon Neptune, an Amazon managed distributed graph database that can store up to 64 TiB of the results for diagnosis and long-term retention.

The data is stored in Amazon S3 buckets, which provide a streamlined way to feed a large dataset for processing.

Figure 2. Architecture diagram for financial crime discovery

Figure 2. Architecture diagram for financial crime discovery

Financial crimes rules

The power of graphs can help discover financial crimes that are reflected in relationships and monetary transactions. To demonstrate this, we will write two rules to detect two scenarios:

  1. Given two suspicious parties X and Y, find out if there is a transactional relationship between them, and if so, provide the chain that connects them. This is a common scenario that financial institutions must detect.
  2. Given a chain identifying suspicious behavior, find out if the minimum transaction amount that reached the beneficiary exceeds a threshold of Z dollars.

Generating data

To generate data for testing, we used a synthetic data generator written in Python (see GitHub repo in References). The generator built two sets of graph artifacts – parties and transactions. Every transaction is being paired with two random parties, and this iterative process creates a network of connected transactions and parties.

We created a dataset with a small percentage (0.01%) of parties tagged as “Suspicious Party,” to simulate the preceding business scenario. Note that those parties will have transactions going both in and out. In some cases, this will collide with other suspicious parties and establish a chain. This method enables us to get simulated data without engineering the suspicious chains.
The test dataset used with this solution comprises 1M transactions and thousands of parties.
For more information on generating data, see GitHub: Transaction Chains Data Generator.

Walkthrough of the financial investigator workflow

Once deployed, this solution can assist investigators as follows:

  1. An investigator places the transactions and party data (nt triples) in a subdirectory within the input bucket. Typically, subdirectories can be named as a date or range of dates. In addition, the investigator uploads the particular rules (dlog files) and queries (rq files) that must be processed on the data.
  2. Once the data is ready, the investigator uploads a job spec file (simple JSON, see References section). This contains the description of what resources the job requires (CPUs and memory), along with other configurations for the job.
  3. Once the job spec has been uploaded to the bucket’s subdirectory, the job is automatically initiated. The Kubernetes scheduler will allocate enough resources to initiate the RDFox pod. The containers in the pod will then load, process, query results, and upload them to the output bucket.
  4. Once the data reaches the output bucket, an AWS Lambda function is initiated. This invokes the Amazon Neptune Bulk Loader, which asynchronously loads the results to the Neptune cluster in a parallelized manner.
  5. Once the load completes, the investigator gets an email notification that the job has been completed, and the results are ready for view.


  • The investigator can upload multiple rules and queries, they will all be processed automatically.
  • The investigator can launch multiple jobs with different/same data, and with different rules and queries at the same time.
  • All jobs outputs are saved in a unique job ID subdirectory in the output bucket.

To create the solution in your account, follow the instruction here: GitHub repo


For this walkthrough, you should have the following prerequisites:


We create two materialization rule sets to fetch the two scenarios described.

1. detect-suspicious-parties-pair.dlog

The purpose of this first set of rules is to detect chains that might exist between two suspicious parties. The idea of these chains is to represent all the possible relationships that contain monetary transfers between a suspicious originator and the beneficiary. This will include non-suspicious parties in the chain. The rule tags these chains with the “SuspiciousChain” flag.

2. detect-chains-exceed-100-dollars.dlog

This set of rules is designed to tag the chains identified by the first set of rules. It also contains a minimum amount of $100 passed to the beneficiary. We can change the amount to check for different compliance requirements. We tag those chains as “HighValueChain.”

Run the job and check your results

Now we can run our job, with the given data, rules, and two additional queries (to extract “SuspiciousChain” and “HighValueChain” respectively). The result of the queries will be loaded to Amazon Neptune automatically for persistent storage, and is made durably available for further analysis.

Let’s look at the results. The following query can be initiated against RDFox console or Amazon Neptune.


PREFIX : <http://oxfordsemantic.tech/transactions/entities#>

PREFIX prop: <http://oxfordsemantic.tech/transactions/properties#>

PREFIX type: <http://oxfordsemantic.tech/transactions/classes#>

PREFIX tt: <http://oxfordsemantic.tech/transactions/tupletables#>


            ?S a type:SuspiciousChain .

            ?S ?P ?O .


Figure 3. Visualizing suspicious chains

Figure 3. Visualizing suspicious chains

Whoa! Figure 3 might look complicated at first, but this is because we are visualizing every pair of suspicious parties that have a relationship with another suspicious party. Let’s filter the query to look only at a single particular chain, which exceeds a minimum of $100 to the beneficiary. The following query can be executed against RDFox console or Amazon Neptune.

PREFIX : <http://oxfordsemantic.tech/transactions/entities#>
PREFIX prop: <http://oxfordsemantic.tech/transactions/properties#>
PREFIX type: <http://oxfordsemantic.tech/transactions/classes#>
PREFIX tt: <http://oxfordsemantic.tech/transactions/tupletables#>

?S ?P ?O
?S a type:HighValueChain .

} Limit 1


Figure 4. Visualizing a particular chain

Figure 4. Visualizing a particular chain

In Figure 4, we can see that Allison, the suspicious originator of the chain, has sent a transaction to Troy. Troy, who is not suspicious, sent the transaction to Karina. Karina is the suspicious beneficiary. We can also see additional information, such as the transaction amount that Karina received, and the chain length of 2 in this case.

In our testing, we were able to scale up to 500M transactions with 50M parties and process this in less than two hours! And we performed this at a significant lower cost when compared to running constant, fixed similar hardware.

Cleaning up

Follow Terraform cleanup instructions.


Graph databases are a powerful tool to apply reasoning on complex financial relationships. The combination of Amazon Web Services and the RDFox engine results in an automated, scalable, and cost-effective, thanks to the dynamic Kubernetes Cluster Autoscaler. Customers can use this solution and provide their investigators with a tool they can experiment and reason on financial transactions. This solution simplifies the process, and makes it easier to try different rules and queries on complex large data collections.

This blog post is written with Oxford Semantic.

Oxford Semantic Logo



Further reading



Comprehensive Cyber Security Framework for Primary (Urban) Cooperative Banks (UCBs)

Post Syndicated from Vikas Purohit original https://aws.amazon.com/blogs/security/comprehensive-cyber-security-framework-for-primary-urban-cooperative-banks/

We are pleased to announce a new Amazon Web Services (AWS) workbook designed to help India Primary (UCBs) customers align with the Reserve Bank of India (RBI) guidance in Comprehensive Cyber Security Framework for Primary (Urban) Cooperative Banks (UCBs) – A Graded Approach.

In addition to RBI’s basic cyber security framework for Primary (Urban) Cooperative Banks (UCBs), RBI issued guidance on its comprehensive cyber security framework, which sets the expectations for the Indian Primary UCBs regarding their cyber security frameworks. This guidance divides the framework into four levels, starting with a common level that applies to all UCBs; the remaining levels apply to specific UCBs based upon their digital depth, and interconnectedness to the payment systems landscape based on RBI-defined criteria. The guidance aims to increase the awareness among the Primary UCBs in India of the controls they should look for as they progress on their digital journey.

Security and compliance is a shared responsibility between AWS and the customer. This differentiation of responsibility is commonly referred to as the AWS Shared Responsibility Model, in which AWS is responsible for security of the cloud, and the customer is responsible for their security in the cloud.

The new AWS Comprehensive Cyber Security Framework for Primary (Urban) Cooperative Banks (UCBs) – A Graded Approach workbook helps customers align with the RBI cyber security framework by providing control mappings for the following:

The downloadable AWS RBI Comprehensive Cyber Security Framework for Primary UCBs workbook is available in AWS Artifact, a self-service portal for on-demand access to AWS Compliance Reports, and it contains two embedded formats:

  • Microsoft Excel: Coverage includes AWS responsibility control statements and Well-Architected Framework best practices
  • Dynamic HTML: Coverage is the same as in the Microsoft Excel format, with the added feature that the Well Architected Framework best practices are mapped to AWS Config managed rules and Amazon GuardDuty findings, where available or applicable.

The AWS RBI Comprehensive Cyber Security Framework for Primary UCBs and AWS RBI Basic Cyber Security Framework for Primary UCBs Workbook are available for download in AWS Artifact. Sign into AWS Artifact via the AWS Management Console, or learn more at Getting Started with AWS Artifact.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Vikas Purohit

Vikas Purohit

Vikas works as a Partner Solution Architect with AISPL, India. He helps about helping customers and partners in their cloud journeys. He is particularly passionate in Cloud Security, hybrid networking and migrations.

2021 FINMA ISAE 3000 Type 2 attestation report for Switzerland now available on AWS Artifact

Post Syndicated from Niyaz Noor original https://aws.amazon.com/blogs/security/2021-finma-isae-3000-type-2-attestation-report-for-switzerland-now-available-on-aws-artifact/

AWS is pleased to announce the issuance of a second Swiss Financial Market Supervisory Authority (FINMA) ISAE 3000 Type 2 attestation report. The latest report covers the period from October 1, 2020 to September 30, 2021, with a total of 141 AWS services and 23 global AWS Regions included in the scope.

A full list of certified services and Regions are presented within the published FINMA report; customers can download the latest report from AWS Artifact.

The FINMA ISAE 3000 Type 2 report, conducted by an independent third-party audit firm, provides Swiss financial industry customers with the assurance that the AWS control environment is appropriately designed and implemented to address key operational risks, as well as risks related to outsourcing and business continuity management.

FINMA circulars

The report covers the five core FINMA circulars applicable to Swiss banks and insurers in the context of outsourcing arrangements to the cloud. These FINMA circulars are intended to assist Swiss-regulated financial institutions in understanding approaches to due diligence, third-party management, and key technical and organizational controls that should be implemented in cloud outsourcing arrangements, particularly for material workloads.

The report’s scope covers, in detail, the requirements of the following FINMA circulars:

  • 2018/03 Outsourcing – banks, insurance companies and selected financial institutions under FinIA;
  • 2008/21 Operational Risks – Banks – Principle 4 Technology Infrastructure (31.10.2019);
  • 2008/21 Operational Risks – Banks – Appendix 3 Handling of electronic Client Identifying Data (31.10.2019);
  • 2013/03 Auditing – Information Technology (04.11.2020);
  • 2008/10 Self-regulation as a minimum standard – Minimum Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013);

Customers can continue to use the detailed FINMA workbooks that include detailed control mappings for each FINMA circular covered under this audit report; these workbooks are available on AWS Artifact. Where applicable, under the AWS shared responsibility model, these workbooks provide best practices guidance using AWS Well-Architected to assist Swiss customers in their own preparation for alignment with FINMA circulars.

As always, AWS is committed to bringing new services into the future scope of our FINMA program based on customers’ architectural and regulatory needs. Please reach out to your AWS account team if you have questions or feedback about the FINMA report.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.


Niyaz Noor

Niyaz is the Security Audit Program Manager at AWS. Niyaz leads multiple security certification programs across Europe and other regions. During his professional career, he has helped multiple cloud service providers in obtaining global and regional security certification. He is passionate about delivering programs that build customers’ trust and provide them assurance on cloud security.

New AWS workbook for New Zealand financial services customers

Post Syndicated from Julian Busic original https://aws.amazon.com/blogs/security/new-aws-workbook-for-new-zealand-financial-services-customers/

We are pleased to announce a new AWS workbook designed to help New Zealand financial services customers align with the Reserve Bank of New Zealand (RBNZ) Guidance on Cyber Resilience.

The RBNZ Guidance on Cyber Resilience sets out the RBNZ expectations for its regulated entities regarding cyber resilience, and aims to raise awareness and promote the cyber resilience of the financial sector, especially at board and senior management level. The guidance applies to all entities regulated by the RBNZ, including registered banks, licensed non-bank deposit takers, licensed insurers, and designated financial market infrastructures.

While the RBNZ describes its guidance as “a set of recommendations rather than requirements” which are not legally enforceable, it also states that it expects regulated entities to “proactively consider how their current approach to cyber risk management lines up with the recommendations in [the] guidance and look for [opportunities] for improvement as early as possible.”

Security and compliance is a shared responsibility between AWS and the customer. This differentiation of responsibility is commonly referred to as the AWS Shared Responsibility Model, in which AWS is responsible for security of the cloud, and the customer is responsible for their security in the cloud. The new AWS Reserve Bank of New Zealand Guidance on Cyber Resilience (RBNZ-GCR) Workbook helps customers align with the RBNZ Guidance on Cyber Resilience by providing control mappings for the following:

  • Security in the cloud by mapping RBNZ Guidance on Cyber Resilience practices to the five pillars of the AWS Well-Architected Framework.
  • Security of the cloud by mapping RBNZ Guidance on Cyber Resilience practices to control statements from the AWS Compliance Program.

The downloadable AWS RBNZ-GCR Workbook contains two embedded formats:

  • Microsoft Excel – Coverage includes AWS responsibility control statements and Well-Architected Framework best practices.
  • Dynamic HTML – Coverage is the same as in the Microsoft Excel format, with the added feature that the Well-Architected Framework best practices are mapped to AWS Config managed rules and Amazon GuardDuty findings, where available or applicable.

The AWS RBNZ-GCR Workbook is available for download in AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Julian Busic

Julian is a Security Solutions Architect with a focus on regulatory engagement. He works with our customers, their regulators, and AWS teams to help customers raise the bar on secure cloud adoption and usage. Julian has over 15 years of experience working in risk and technology across the financial services industry in Australia and New Zealand.

Build Your Own Game Day to Support Operational Resilience

Post Syndicated from Lewis Taylor original https://aws.amazon.com/blogs/architecture/build-your-own-game-day-to-support-operational-resilience/

Operational resilience is your firm’s ability to provide continuous service through people, processes, and technology that are aware of and adaptive to constant change. Downtime of your mission-critical applications can not only damage your reputation, but can also make you liable to multi-million-dollar financial fines.

One way to test operational resilience is to simulate life-like system failures. An effective way to do this is by running events in your organization known as game days. Game days test systems, processes, and team responses and help evaluate your readiness to react and recover from operational issues. The AWS Well-Architected Framework recommends game days as a key strategy to develop and operate highly resilient systems because they focus not only on technology resilience issues but identify people and process gaps.

This blog post will explain how you can apply game day concepts to your workloads to help achieve a highly resilient workload.

Why does operational resilience matter from a regulatory perspective?

In March 2021, the Bank of England, Prudential Regulation Authority, and Financial Conduct Authority published their Building operational resilience: Feedback to CP19/32 and final rules policy. In this policy, operational resilience refers to a firm’s ability to prevent, adapt, and respond to and return to a steady system state when a disruption occurs. Further, firms are expected to learn and implement process improvements from prior disruptions.

This policy will not apply to everyone. However, across the board if you don’t establish operational resilience strategies, you are likely operating at an increased risk. If you have a service disruption, you may incur lost revenue and reputational damage.

What does it mean to be operationally resilient?

The final policy provides guidance on how firms should achieve operational resilience, which includes but is not limited to the following:

  • Identify and prioritize services based on the potential of intolerable harm to end consumers or risk to market integrity.
  • Define appropriate maximum impact tolerance of an important business service. This is reviewed annually using metrics to measure impact tolerance and answers questions like, “How long (in hours) can a service be offline before causing intolerable harm to end consumers?”
  • Document a complete view of all the aspects required to deliver each important service. This includes people, processes, technology, facilities, and information (resources). Firms should also test their ability to remain within the impact tolerances and provide assurance of resilience along with areas that need to be addressed.

What is a game day?

The AWS Well-Architected Framework defines a game day as follows:

“A game day simulates a failure or event to test systems, processes, and team responses. The purpose is to actually perform the actions the team would perform as if an exceptional event happened. These should be conducted regularly so that your team builds “muscle memory” on how to respond. Your game days should cover the areas of operations, security, reliability, performance, and cost.

In AWS, your game days can be carried out with replicas of your production environment using AWS CloudFormation. This enables you to test in a safe environment that resembles your production environment closely.”

Running game days that simulate system failure helps your organization evaluate and build operational resilience.

How can game days help build operational resilience?

Running a game day alone is not sufficient to ensure operational resilience. However, by navigating the following process to set up and perform a game day, you will establish a best practice-based approach for operating resilient systems.

Stage 1 – Identify key services

As part of setting up a game day event, you will catalog and identify business-critical services.

Game days are performed to test services where operational failure could result in significant financial, customer, and/or reputational impact to the firm. Game days can also evaluate other key factors, like the impact of a failure on the wider market where your firm operates.

For example, a firm may identify its digital banking mobile application from which their customers can initiate payments as one of its important business services.

Stage 2 – Map people, process, and technology supporting the business service

Game days are holistic events. To get a full picture of how the different aspects of your workload operate together, you’ll generate a detailed map of people and processes as they interact and operate the technical and non-technical components of the system. This mapping also helps your end consumers understand how you will provide them reliable support during a failure.

Stage 3 – Define and perform failure scenarios

Systems fail, and failures often happen when a system is operating at scale because various services working together can introduce complexity. To ensure operational resilience, you must understand how systems react and adapt to failures. To do this, you’ll identify and perform failure scenarios so you can understand how your systems will react and adapt and build “muscle memory” for actual events.

AWS builds to guard against outages and incidents, and accounts for them in the design of AWS services—so when disruptions do occur, their impact on customers and the continuity of services is as minimal as possible. At AWS, we employ compartmentalization throughout our infrastructure and services. We have multiple constructs that provide different levels of independent, redundant components.

Stage 4 – Observe and document people, process, and technology reactions

In running a failure scenario, you’ll observe how technological and non-technological components react to and recover from failure. This helps you identify failures and fix them as they cascade through impacted components across your workload. This also helps identify technical and operational challenges that might not otherwise be obvious.

Stage 5 – Conduct lessons learned exercises

Game days generate information on people, processes, and technology and also capture data on customer impact, incident response and remediation timelines, contributing factors, and corrective actions. By incorporating these data points into the system design process, you can implement continuous resilience for critical systems.

How to run your own game day in AWS

You may have heard of AWS GameDay events. This is an AWS organized event for our customers. In this team-based event, AWS provides temporary AWS accounts running fictional systems. Failures are injected into these systems and teams work together on completing challenges and improving the system architecture.

However, the method and tooling and principles we use to conduct AWS GameDays are agnostic and can be applied to your systems using the following services:

  • AWS Fault Injection Simulator is a fully managed service that runs fault injection experiments on AWS, which makes it easier to improve an application’s performance, observability, and resiliency.
  • Amazon CloudWatch is a monitoring and observability service that provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health.
  • AWS X-Ray helps you analyze and debug production and distributed applications (such as those built using a microservices architecture). X-Ray helps you understand how your application and its underlying services are performing to identify and troubleshoot the root cause of performance issues and errors.

Please note you are not limited to the tools listed for simulating failure scenarios. For complete coverage of failure scenarios, we encourage you to explore additional tools and strategies.

Figure 1 shows a reference architecture example that demonstrates conducting a game day for an Open Banking implementation.

Game day reference architecture example

Figure 1. Game day reference architecture example

Game day operators use Fault Injection Simulator to catalog and perform failure scenarios to be included in your game day. For example, in our Open Banking use case in Figure 1, a failure scenario might be for the business API functions servicing Open Banking requests to abruptly stop working. You can also combine such simple failure scenarios into a more complex one with failures injected across multiple components of the architecture.

Game day participants use CloudWatch, X-Ray, and their own custom observability and monitoring tooling to identify failures as they cascade through systems.

As you go through the process of identifying, communicating, and fixing issues, you’ll also document impact of failures on end-users. From there, you’ll generate lessons learned to holistically improve your workload’s resilience.


In this blog, we discussed the significance of ensuring operational resilience. We demonstrated how to set up game days and how they can supplement your efforts to ensure operational resilience. We discussed how using AWS services such as Fault Injection Simulator, X-Ray, and CloudWatch can be used to facilitate and implement game day failure scenarios.

Ready to get started? For more information, check out our AWS Fault Injection Simulator User Guide.

Related information:

Disaster recovery compliance in the cloud, part 2: A structured approach

Post Syndicated from Dan MacKay original https://aws.amazon.com/blogs/security/disaster-recovery-compliance-in-the-cloud-part-2-a-structured-approach/

Compliance in the cloud is fraught with myths and misconceptions. This is particularly true when it comes to something as broad as disaster recovery (DR) compliance where the requirements are rarely prescriptive and often based on legacy risk-mitigation techniques that don’t account for the exceptional resilience of modern cloud-based architectures. For regulated entities subject to principles-based supervision such as many financial institutions (FIs), the responsibility lies with the FI to determine what’s necessary to adequately recover from a disaster event. Without clear instructions, FIs are susceptible to making incorrect assumptions regarding their compliance requirements for DR.

In Part 1 of this two-part series, I provided some examples of common misconceptions FIs have about compliance requirements for disaster recovery in the cloud. In Part 2, I outline five steps you can take to avoid these misconceptions when architecting DR-compliant workloads for deployment on Amazon Web Services (AWS).

1. Identify workloads planned for deployment

It’s common for FIs to have a portfolio of workloads they are considering deploying to the cloud and often want to know that they can be compliant across the board. But compliance isn’t a one-size-fits-all domain—it’s based on the characteristics of each workload. For example, does the workload contain personally identifiable information (PII)? Will it be used to store, process, or transmit credit card information? Compliance is dependent on the answers to questions such as these and must be assessed on a case-by-case basis. Therefore, the first step in architecting for compliance is to identify the specific workloads you plan to deploy to the cloud. This way, you can assess the requirements of these specific workloads and not be distracted by aspects of compliance that might not be relevant.

2. Define the workload’s resiliency requirements

Resiliency is the ability of a workload to recover from infrastructure or service disruptions. DR is an important part of your resiliency strategy and concerns how your workload responds to a disaster event. DR strategies on AWS range from simple, low cost options such as backup and restore, to more complex options such as multi-site active-active, as shown in Figure 1.

For more information, I encourage you to read Seth Eliot’s blog series on DR Architecture on AWS as well as the AWS whitepaper Disaster Recovery of Workloads on AWS: Recovery in the Cloud.

The DR strategy you choose for a particular workload is dependent on your organization’s requirements for avoiding loss of data—known as the recovery point objective (RPO)—and reducing downtime where the workload isn’t available —known as the recovery time objective (RTO). RPO and RTO are key factors for determining the minimum architectural specifications necessary to meet the workload’s resiliency requirements. For example, can the workload’s RPO and RTO be achieved using a multi-AZ architecture in a single AWS Region, or do the resiliency requirements necessitate deploying the workload across multiple AWS Regions? Even if your workload is not subject to explicit compliance requirements for resiliency, understanding these requirements is necessary for assessing other aspects of DR compliance, including data residency and geodiversity.

3. Confirm the workload’s data residency requirements

As I mentioned in Part 1, data residency requirements might restrict which AWS Region or Regions you can deploy your workload to. Therefore, you need to confirm whether the workload is subject to any data residency requirements within applicable laws and regulations, corporate policies, or contractual obligations.

In order to properly assess these requirements, you must review the explicit language of the requirements so as to understand the specific constraints they impose. You should also consult legal, privacy, and compliance subject-matter specialists to help you interpret these requirements based on the characteristics of the workload. For example, do the requirements specifically state that the data cannot leave the country, or can the requirement be met so long as the data can be accessed from that country? Does the requirement restrict you from storing a copy of the data in another country—for example, for backup and recovery purposes? What if the data is encrypted and can only be read using decryption keys kept within the home country? Consulting subject-matter specialists to help interpret these requirements can help you avoid making overly restrictive assumptions and imposing unnecessary constraints on the workload’s architecture.

4. Confirm the workload’s geodiversity requirements

A single Region, multiple-AZ architecture is often sufficient to meet a workload’s resiliency requirements. However, if the workload is subject to geodiversity requirements, the distance between the AZs in an AWS Region might not conform to the minimum distance between individual data centers specified by the requirements. Therefore, it’s critical to confirm whether any geodiversity requirements apply to the workload.

Like data residency, it’s important to assess the explicit language of geodiversity requirements. Are they written down in a regulation or corporate policy, or are they just a recommended practice? Can the requirements be met if the workload is deployed across three or more AZs even if the minimum distance between those AZs is less than the specified minimum distance between the primary and backup data centers? If it’s a corporate policy, does it allow for exceptions if an alternative method provides equal or greater resiliency than asynchronous replication between two geographically distant data centers? Or perhaps the corporate policy is outdated and should be revised to reflect modern risk mitigation techniques. Understanding these parameters can help you avoid unnecessary constraints as you assess architectural options for your workloads.

5. Assess architectural options to meet the workload’s requirements

Now that you understand the workload’s requirements for resiliency, data residency, and geodiversity, you can assess the architectural options that meet these requirements in the cloud.

As per AWS Well-Architected best practices, you should strive for the simplest architecture necessary to meet your requirements. This includes assessing whether the workload can be accommodated within a single AWS Region. If the workload is constrained by explicit geographic diversity requirements or has resiliency requirements that cannot be accommodated by a single AWS Region, then you might need to architect the workload for deployment across multiple AWS Regions. If the workload is also constrained by explicit data residency requirements, then it might not be possible to deploy to multiple AWS Regions. In cases such as these, you can work with our AWS Solution Architects to assess hybrid options that might meet your compliance requirements, such as using AWS Outposts, Amazon Elastic Container Service (Amazon ECS) Anywhere, or Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere. Another option may be to consider a DR solution in which your on-premises infrastructure is used as a backup for a workload running on AWS. In some cases, this might be a long-term solution. In others, it might be an interim solution until certain constraints can be removed—for example, a change to corporate policy or the introduction of additional AWS Regions in a particular country.


Let’s recap by summarizing some guiding principles for architecting compliant DR workloads as outlined in this two-part series:

  • Avoid assumptions; confirm the facts. If it’s not written down, it’s unlikely to be considered a mandatory compliance requirement.
  • Consult the experts. Legal, privacy, and compliance, as well as AWS Solution Architects, AWS security and compliance specialists, and other subject-matter specialists.
  • Avoid generalities; focus on the specifics. There is no one-size-fits-all approach.
  • Strive for simplicity, not zero risk. Don’t use multiple AWS Regions when one will suffice.
  • Don’t get distracted by exceptions. Focus on your current requirements, not workloads you’re not yet prepared to deploy to the cloud.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Dan MacKay

Dan is the Financial Services Compliance Specialist for AWS Canada. As a member of the Worldwide Financial Services Security & Compliance team, Dan advises financial services customers on best practices and practical solutions for cloud-related governance, risk, and compliance. He specializes in helping AWS customers navigate financial services and privacy regulations applicable to the use of cloud technology in Canada.

Disaster recovery compliance in the cloud, part 1: Common misconceptions

Post Syndicated from Dan MacKay original https://aws.amazon.com/blogs/security/disaster-recovery-compliance-in-the-cloud-part-1-common-misconceptions/

Compliance in the cloud can seem challenging, especially for organizations in heavily regulated sectors such as financial services. Regulated financial institutions (FIs) must comply with laws and regulations (often in multiple jurisdictions), global security standards, their own corporate policies, and even contractual obligations with their customers and counterparties. These various compliance requirements may impose constraints on how their workloads can be architected for the cloud, and may require interpretation on what FIs must do in order to be compliant. It’s common for FIs to make assumptions regarding their compliance requirements, which can result in unnecessary costs and increased complexity, and might not align with their strategic objectives. A modern, rationalized approach to compliance can help FIs avoid imposing unnecessary constraints while meeting their mandatory requirements.

In my role as an Amazon Web Services (AWS) Compliance Specialist, I work with our financial services customers to identify, assess, and determine solutions to address their compliance requirements as they move to the cloud. One of the most common challenges customers ask me about is how to comply with disaster recovery (DR) requirements for workloads they plan to run in the cloud. In this blog post, I share some of the typical misconceptions FIs have about DR compliance in the cloud. In Part 2, I outline a structured approach to designing compliant architectures for your DR workloads. As my primary market is Canada, the examples in this blog post largely pertain to FIs operating in Canada, but the principles and best practices are relevant to regulated organizations in any country.

“Why isn’t there a checklist for compliance in the cloud?”

Compliance requirements are sometimes prescriptive: “if X, then you must do Y.” When requirements are prescriptive, it’s usually clear what you must do in order to be compliant. For example, the Payment Card Industry Data Security Standard (PCI DSS) requirement 8.2.4 obliges companies that process, store, or transmit credit card information to “change user passwords/passphrases at least once every 90 days.” But in the financial services sector, compliance requirements for managing operational risks can be subjective. When regulators take what is known as a principles-based approach to setting regulatory expectations, each FI is required to assess their specific risks and determine the mitigating controls necessary to conform with the organization’s tolerance for operational risk. Because the rules aren’t prescriptive, there is no “checklist for achieving compliance.” Instead, principles-based requirements are guidelines that FIs are expected to consider as they design and implement technology solutions. They are, by definition, subject to interpretation and can be prone to myths and misconceptions among FIs and their service providers. To illustrate this, let’s look at two aspects of DR that are frequently misunderstood within the Canadian financial services industry: data residency and geodiversity.

“My data has to stay in country X”

Data residency or data localization is a requirement for specific data-sets processed and stored in an IT system to remain within a specific jurisdiction (for example, a country). As discussed in our Policy Perspectives whitepaper, contrary to historical perspectives, data residency doesn’t provide better security. Most cyber-attacks are perpetrated remotely and attackers aren’t deterred by the physical location of their victims. In fact, data residency can run counter to an organization’s objectives for security and resilience. For example, data residency requirements can limit the options our customers have when choosing the AWS Region or Regions in which to run their production workloads. This is especially challenging for customers who want to use multiple Regions for backup and recovery purposes.

It’s common for FIs operating in Canada to assume that they’re required to keep their data—particularly customer data—in Canada. In reality, there’s very little from a statutory perspective that imposes such a constraint. None of the private sector privacy laws include data residency requirements, nor do any of the financial services regulatory guidelines. There are some place of records requirements in Canadian federal financial services legislation such as The Bank Act and The Insurance Companies Act, but these are relatively narrow in scope and apply primarily to corporate records. For most Canadian FIs, their requirements are more often a result of their own corporate policies or contractual obligations, not externally imposed by public policies or regulations.

“My data centers have to be X kilometers apart”

Geodiversity—short for geographic diversity—is the concept of maintaining a minimum distance between primary and backup data processing sites. Geodiversity is based on the principle that requiring a certain distance between data centers mitigates the risk of location-based disruptions such as natural disasters. The principle is still relevant in a cloud computing context, but is not the only consideration when it comes to planning for DR. The cloud allows FIs to define operational resilience requirements instead of limiting themselves to antiquated business continuity planning and DR concepts like physical data center implementation requirements. Legacy disaster recovery solutions and architectures, and lifting and shifting such DR strategies into the cloud, can diminish the potential benefits of using the cloud to improve operational resilience. Modernizing your information technology also means modernizing your organization’s approach to DR.

In the cloud, vast physical distance separation is an anti-pattern—it’s an arbitrary metric that does little to help organizations achieve availability and recovery objectives. At AWS, we design our global infrastructure so that there’s a meaningful distance between the Availability Zones (AZs) within an AWS Region to support high availability, but close enough to facilitate synchronous replication across those AZs (an AZ being a cluster of data centers). Figure 1 shows the relationship between Regions, AZs, and data centers.

Synchronous replication across multiple AZs enables you to minimize data loss (defined as the recovery point objective or RPO) and reduce the amount of time that workloads are unavailable (defined as the recovery time objective or RTO). However, the low latency required for synchronous replication becomes less achievable as the distance between data centers increases. Therefore, a geodiversity requirement that mandates a minimum distance between data centers that’s too far for synchronous replication might prohibit you from taking advantage of AWS’s multiple-AZ architecture. A multiple-AZ architecture can achieve RTOs and RPOs that aren’t possible with a simple geodiversity mitigation strategy. For more information, refer to the AWS whitepaper Disaster Recovery of Workloads on AWS: Recovery in the Cloud.

Again, it’s a common perception among Canadian FIs that the disaster recovery architecture for their production workloads must comply with specific geodiversity requirements. However, there are no statutory requirements applicable to FIs operating in Canada that mandate a minimum distance between data centers. Some FIs might have corporate policies or contractual obligations that impose geodiversity requirements, but for most FIs I’ve worked with, geodiversity is usually a recommended practice rather than a formal policy. Informal corporate guidelines can have some value, but they aren’t absolute rules and shouldn’t be treated the same as mandatory compliance requirements. Otherwise, you might be unintentionally restricting yourself from taking advantage of more effective risk management techniques.

“But if it is a compliance requirement, doesn’t that mean I have no choice?”

Both of the previous examples illustrate the importance of not only confirming your compliance requirements, but also recognizing the source of those requirements. It might be infeasible to obtain an exception to an externally-imposed obligation such as a regulatory requirement, but exceptions or even revisions to corporate policies aren’t out of the question if you can demonstrate that modern approaches provide equal or greater protection against a particular risk—for example, the high availability and rapid recoverability supported by a multiple-AZ architecture. Consider whether your compliance requirements provide for some level of flexibility in their application.

Also, because many of these requirements are principles-based, they might be subject to interpretation. You have to consider the specific language of the requirement in the context of the workload. For example, a data residency requirement might not explicitly prohibit you from storing a copy of the content in another country for backup and recovery purposes. For this reason, I recommend that you consult applicable specialists from your legal, privacy, and compliance teams to aid in the interpretation of compliance requirements. Once you understand the legal boundaries of your compliance requirements, AWS Solutions Architects and other financial services industry specialists such as myself can help you assess viable options to meet your needs.


In this first part of a two-part series, I provided some examples of common misconceptions FIs have about compliance requirements for disaster recovery in the cloud. The key is to avoid making assumptions that might impose greater constraints on your architecture than are necessary. In Part 2, I show you a structured approach for architecting compliant DR workloads that can help you to avoid these preventable missteps.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Dan MacKay

Dan is the Financial Services Compliance Specialist for AWS Canada. As a member of the Worldwide Financial Services Security & Compliance team, Dan advises financial services customers on best practices and practical solutions for cloud-related governance, risk, and compliance. He specializes in helping AWS customers navigate financial services and privacy regulations applicable to the use of cloud technology in Canada.