Tag Archives: Financial Services

New AWS whitepaper: AWS User Guide for Federally Regulated Financial Institutions in Canada

2024-02-29 Dan MacKay

Post Syndicated from Dan MacKay original https://aws.amazon.com/blogs/security/new-aws-whitepaper-aws-user-guide-for-federally-regulated-financial-institutions-in-canada/

Amazon Web Services (AWS) has released a new whitepaper to help financial services customers in Canada accelerate their use of the AWS Cloud.

The new AWS User Guide for Federally Regulated Financial Institutions in Canada helps AWS customers navigate the regulatory expectations of the Office of the Superintendent of Financial Institutions (OSFI) in a shared responsibility environment. It is intended for OSFI-regulated institutions that are looking to run material workloads in the AWS Cloud, and is particularly useful for leadership, security, risk, and compliance teams that need to understand OSFI requirements and guidance applicable to the use of AWS services.

This whitepaper summarizes OSFI’s expectations with respect to Technology and Cyber Risk Management (OSFI Guideline B-13). It also gives OSFI-regulated institutions information that they can use to commence their due diligence and assess how to implement the appropriate programs for their use of AWS Cloud services. In subsequent versions of the whitepaper, we will provide considerations for other OSFI guidelines as applicable.

In addition to this whitepaper, AWS provides updates on the evolving Canadian regulatory landscape on the AWS Security Blog and the AWS Compliance page. Customers looking for more information on cloud-related regulatory compliance in different countries around the world can refer to the AWS Compliance Center. For additional resources or support, reach out to your AWS account manager or contact us here.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

AWS Payment Cryptography is PCI PIN and P2PE certified

2024-02-27 Tim Winston

Post Syndicated from Tim Winston original https://aws.amazon.com/blogs/security/aws-payment-cryptography-is-pci-pin-and-p2pe-certified/

Amazon Web Services (AWS) is pleased to announce that AWS Payment Cryptography is certified for Payment Card Industry Personal Identification Number (PCI PIN) version 3.1 and as a PCI Point-to-Point Encryption (P2PE) version 3.1 Decryption Component.

With Payment Cryptography, your payment processing applications can use payment hardware security modules (HSMs) that are PCI PIN Transaction Security (PTS) HSM certified and fully managed by AWS, with PCI PIN and P2PE-compliant key management. These attestations give you the flexibility to deploy your regulated workloads with reduced compliance overhead.

The PCI P2PE Decryption Component enables PCI P2PE Solutions to use AWS to decrypt credit card transactions from payment terminals, and PCI PIN attestation is required for applications that process PIN-based debit transactions. According to PCI, “Use of a PCI P2PE Solution can also allow merchants to reduce where and how the PCI DSS applies within their retail environment, increasing security of customer data while simplifying compliance with the PCI DSS”.

Coalfire, a third-party Qualified PIN Assessor (QPA) and Qualified Security Assessor (P2PE), evaluated Payment Cryptography. Customers can access the PCI PIN Attestation of Compliance (AOC) report, the PCI PIN Shared Responsibility Summary, and the PCI P2PE Attestation of Validation through AWS Artifact.

To learn more about our PCI program and other compliance and security programs, see the AWS Compliance Programs page. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Empowering data-driven excellence: How the Bluestone Data Platform embraced data mesh for success

2024-02-27 Toney Thomas

Post Syndicated from Toney Thomas original https://aws.amazon.com/blogs/big-data/empowering-data-driven-excellence-how-the-bluestone-data-platform-embraced-data-mesh-for-success/

This post is co-written with Toney Thomas and Ben Vengerovsky from Bluestone.

In the ever-evolving world of finance and lending, the need for real-time, reliable, and centralized data has become paramount. Bluestone, a leading financial institution, embarked on a transformative journey to modernize its data infrastructure and transition to a data-driven organization. In this post, we explore how Bluestone uses AWS services, notably the cloud data warehousing service Amazon Redshift, to implement a cutting-edge data mesh architecture, revolutionizing the way they manage, access, and utilize their data assets.

The challenge: Legacy to modernization

Bluestone was operating with a legacy SQL-based lending platform, as illustrated in the following diagram. To stay competitive and responsive to changing market dynamics, they decided to modernize their infrastructure. This modernization involved transitioning to a software as a service (SaaS) based loan origination and core lending platforms. Because these new systems produced vast amounts of data, the challenge of ensuring a single source of truth for all data consumers emerged.

Birth of the Bluestone Data Platform

To address the need for centralized, scalable, and governable data, Bluestone introduced the Bluestone Data Platform. This platform became the hub for all data-related activities across the organization. AWS played a pivotal role in bringing this vision to life.

The following are the key components of the Bluestone Data Platform:

Data mesh architecture – Bluestone adopted a data mesh architecture, a paradigm that distributes data ownership across different business units. Each data producer within the organization has its own data lake in Apache Hudi format, ensuring data sovereignty and autonomy.
Four-layered data lake and data warehouse architecture – The architecture comprises four layers, including the analytical layer, which houses purpose-built facts and dimension datasets that are hosted in Amazon Redshift. These datasets are pivotal for reporting and analytics use cases, powered by services like Amazon Redshift and tools like Power BI.
Machine learning analytics – Various business units, such as Servicing, Lending, Sales & Marketing, Finance, and Credit Risk, use machine learning analytics, which run on top of the dimensional model within the data lake and data warehouse. This enables data-driven decision-making across the organization.
Governance and self-service – The Bluestone Data Platform provides a governed, curated, and self-service avenue for all data use cases. AWS services like AWS Lake Formation in conjunction with Atlan help govern data access and policies.
Data quality framework – To ensure data reliability, they implemented a data quality framework. It continuously assesses data quality and syncs quality scores to the Atlan governance tool, instilling confidence in the data assets within the platform.

The following diagram illustrates the architecture of their updated data platform.

AWS and third-party services

AWS played a pivotal and multifaceted role in empowering Bluestone’s Data Platform to thrive. The following AWS and third-party services were instrumental in shaping Bluestone’s journey toward becoming a data-driven organization:

Amazon Redshift – Bluestone harnessed the power of Amazon Redshift and its features like data sharing to create a centralized repository of data assets. This strategic move facilitated seamless data sharing and collaboration across diverse business units, paving the way for more informed and data-driven decision-making.
Lake Formation – Lake Formation emerged as a cornerstone in Bluestone’s data governance strategy. It played a critical role in enforcing data access controls and implementing data policies. With Lake Formation, Bluestone achieved protection of sensitive data and compliance with regulatory requirements.
Data quality monitoring – To maintain data reliability and accuracy, Bluestone deployed a robust data quality framework. AWS services were essential in this endeavor, because they complemented open source tools to establish an in-house data quality monitoring system. This system continuously assesses data quality, providing confidence in the reliability of the organization’s data assets.
Data governance tooling – Bluestone chose Atlan, available through AWS Marketplace, to implement comprehensive data governance tooling. This SaaS service played a pivotal role in onboarding multiple business teams and fostering a data-centric culture within Bluestone. It empowered teams to efficiently manage and govern data assets.
Orchestration using Amazon MWAA – Bluestone heavily relied on Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to manage workflow orchestrations efficiently. This orchestration framework seamlessly integrated with various data quality rules, which were evaluated using Great Expectations operators within the Airflow environment.
AWS DMS – Bluestone used AWS Database Migration Service (AWS DMS) to streamline the consolidation of legacy data into the data platform. This service facilitated the smooth transfer of data from legacy SQL Server warehouses to the data lake and data warehouse, providing data continuity and accessibility.
AWS Glue – Bluestone used the AWS Glue PySpark environment for implementing data extract, transform, and load (ETL) processes. It played a pivotal role in processing data originating from various source systems, providing data consistency and suitability for analytical use.
AWS Glue Data Catalog – Bluestone centralized their data management using the AWS Glue Data Catalog. This catalog served as the backbone for managing data assets within the Bluestone data estate, enhancing data discoverability and accessibility.
AWS CloudTrail – Bluestone implemented AWS CloudTrail to monitor and audit platform activities rigorously. This security-focused service provided essential visibility into platform actions, providing compliance and security in data operations.

AWS’s comprehensive suite of services has been integral in propelling the Bluestone Data Platform towards data-driven success. These services have not only enabled efficient data governance, quality assurance, and orchestration, but have also fostered a culture of data centricity within the organization, ultimately leading to better decision-making and competitive advantage. Bluestone’s journey showcases the power of AWS in transforming organizations into data-driven leaders in their respective industries.

Bluestone data architecture

Bluestone’s data architecture has undergone a dynamic transformation, transitioning from a lake house framework to a data mesh architecture. This evolution was driven by the organization’s need for data products with distributed ownership and the necessity for a centralized mechanism to govern and access these data products across various business units.

The following diagram illustrates the solution architecture and its use of AWS and third-party services.

Let’s delve deeper into how this architecture shift has unfolded and what it entails:

The need for change – The catalyst for this transformation was the growing demand for discrete data products tailored to the unique requirements of each business unit within Bluestone. Because these business units generated their own data assets in their respective domains, the challenge lay in efficiently managing, governing, and accessing these diverse data stores. Bluestone recognized the need for a more structured and scalable approach.
Data products with distributed ownership – In response to this demand, Bluestone adopted a data mesh architecture, which allowed for the creation of distinct data products aligned with each business unit’s needs. Each of these data products exists independently, generating and curating data assets specific to its domain. These data products serve as individual data hubs, ensuring data autonomy and specialization.
Centralized catalog integration – To streamline the discovery and accessibility of the data assets that are dispersed across these data products, Bluestone introduced a centralized catalog. This catalog acts as a unified repository where all data products register their respective data assets. It serves as a critical component for data discovery and management.
Data governance tool integration – Ensuring data governance and lineage tracking across the organization was another pivotal consideration. Bluestone implemented a robust data governance tool that connects to the centralized catalog. This integration makes sure that the overarching lineage of data assets is comprehensively mapped and captured. Data governance processes are thereby enforced consistently, guaranteeing data quality and compliance.
Amazon Redshift data sharing for control and access – To facilitate controlled and secure access to data assets residing within individual data product Redshift instances, Bluestone used Amazon Redshift data sharing. This capability allows data assets to be exposed and shared selectively, providing granular control over access while maintaining data security and integrity.

In essence, Bluestone’s journey from a lake house to a data mesh architecture represents a strategic shift in data management and governance. This transformation empowers different business units to operate autonomously within their data domains while ensuring centralized control, governance, and accessibility. The integration of a centralized catalog and data governance tooling, coupled with the flexibility of Amazon Redshift data sharing, creates a harmonious ecosystem where data-driven decision-making thrives, ultimately contributing to Bluestone’s success in the ever-evolving financial landscape.

Conclusion

Bluestone’s journey from a legacy SQL-based system to a modern data mesh architecture on AWS has improved the way the organization interacts with data and positioned them as a data-driven powerhouse in the financial industry. By embracing AWS services, Bluestone has successfully achieved a centralized, scalable, and governable data platform that empowers its teams to make informed decisions, drive innovation, and stay ahead in the competitive landscape. This transformation serves as compelling proof that Amazon Redshift and AWS Cloud data sharing capabilities are a great pathway for organizations looking to embark on their own data-driven journeys with AWS.

About the Authors

Toney Thomas is a Data Architect and Data Engineering Lead at Bluestone, renowned for his role in envisioning and coining the company’s pioneering data strategy. With a strategic focus on harnessing the power of advanced technology to tackle intricate business challenges, Toney leads a dynamic team of Data Engineers, Reporting Engineers, Quality Assurance specialists, and Business Analysts at Bluestone. His leadership extends to driving the implementation of robust data governance frameworks across diverse organizational units. Under his guidance, Bluestone has achieved remarkable success, including the deployment of innovative platforms such as a fully governed data mesh business data system with embedded data quality mechanisms, aligning seamlessly with the organization’s commitment to data democratization and excellence.

Ben Vengerovsky is a Data Platform Product Manager at Bluestone. He is passionate about using cloud technology to revolutionize the company’s data infrastructure. With a background in mortgage lending and a deep understanding of AWS services, Ben specializes in designing scalable and efficient data solutions that drive business growth and enhance customer experiences. He thrives on collaborating with cross-functional teams to translate business requirements into innovative technical solutions that empower data-driven decision-making.

Rada Stanic is a Chief Technologist at Amazon Web Services, where she helps ANZ customers across different segments solve their business problems using AWS Cloud technologies. Her special areas of interest are data analytics, machine learning/AI, and application modernization.

Simplify data streaming ingestion for analytics using Amazon MSK and Amazon Redshift

2024-02-21 Sebastian Vlad

Post Syndicated from Sebastian Vlad original https://aws.amazon.com/blogs/big-data/simplify-data-streaming-ingestion-for-analytics-using-amazon-msk-and-amazon-redshift/

Towards the end of 2022, AWS announced the general availability of real-time streaming ingestion to Amazon Redshift for Amazon Kinesis Data Streams and Amazon Managed Streaming for Apache Kafka (Amazon MSK), eliminating the need to stage streaming data in Amazon Simple Storage Service (Amazon S3) before ingesting it into Amazon Redshift.

Streaming ingestion from Amazon MSK into Amazon Redshift, represents a cutting-edge approach to real-time data processing and analysis. Amazon MSK serves as a highly scalable, and fully managed service for Apache Kafka, allowing for seamless collection and processing of vast streams of data. Integrating streaming data into Amazon Redshift brings immense value by enabling organizations to harness the potential of real-time analytics and data-driven decision-making.

This integration enables you to achieve low latency, measured in seconds, while ingesting hundreds of megabytes of streaming data per second into Amazon Redshift. At the same time, this integration helps make sure that the most up-to-date information is readily available for analysis. Because the integration doesn’t require staging data in Amazon S3, Amazon Redshift can ingest streaming data at a lower latency and without intermediary storage cost.

You can configure Amazon Redshift streaming ingestion on a Redshift cluster using SQL statements to authenticate and connect to an MSK topic. This solution is an excellent option for data engineers that are looking to simplify data pipelines and reduce the operational cost.

In this post, we provide a complete overview on how to configure Amazon Redshift streaming ingestion from Amazon MSK.

Solution overview

The following architecture diagram describes the AWS services and features you will be using.

architecture diagram describing the AWS services and features you will be using

The workflow includes the following steps:

You start with configuring an Amazon MSK Connect source connector, to create an MSK topic, generate mock data, and write it to the MSK topic. For this post, we work with mock customer data.
The next step is to connect to a Redshift cluster using the Query Editor v2.
Finally, you configure an external schema and create a materialized view in Amazon Redshift, to consume the data from the MSK topic. This solution does not rely on an MSK Connect sink connector to export the data from Amazon MSK to Amazon Redshift.

The following solution architecture diagram describes in more detail the configuration and integration of the AWS services you will be using.

The workflow includes the following steps:

You deploy an MSK Connect source connector, an MSK cluster, and a Redshift cluster within the private subnets on a VPC.
The MSK Connect source connector uses granular permissions defined in an AWS Identity and Access Management (IAM) in-line policy attached to an IAM role, which allows the source connector to perform actions on the MSK cluster.
The MSK Connect source connector logs are captured and sent to an Amazon CloudWatch log group.
The MSK cluster uses a custom MSK cluster configuration, allowing the MSK Connect connector to create topics on the MSK cluster.
The MSK cluster logs are captured and sent to an Amazon CloudWatch log group.
The Redshift cluster uses granular permissions defined in an IAM in-line policy attached to an IAM role, which allows the Redshift cluster to perform actions on the MSK cluster.
You can use the Query Editor v2 to connect to the Redshift cluster.

Prerequisites

To simplify the provisioning and configuration of the prerequisite resources, you can use the following AWS CloudFormation template:

Complete the following steps when launching the stack:

For Stack name, enter a meaningful name for the stack, for example, prerequisites.
Choose Next.
Choose Next.
Select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
Choose Submit.

The CloudFormation stack creates the following resources:

A VPC custom-vpc, created across three Availability Zones, with three public subnets and three private subnets:
- The public subnets are associated with a public route table, and outbound traffic is directed to an internet gateway.
- The private subnets are associated with a private route table, and outbound traffic is sent to a NAT gateway.
An internet gateway attached to the Amazon VPC.
A NAT gateway that is associated with an elastic IP and is deployed in one of the public subnets.
Three security groups:
- msk-connect-sg, which will be later associated with the MSK Connect connector.
- redshift-sg, which will be later associated with the Redshift cluster.
- msk-cluster-sg, which will be later associated with the MSK cluster. It allows inbound traffic from msk-connect-sg, and redshift-sg.
Two CloudWatch log groups:
- msk-connect-logs, to be used for the MSK Connect logs.
- msk-cluster-logs, to be used for the MSK cluster logs.
Two IAM Roles:
- msk-connect-role, which includes granular IAM permissions for MSK Connect.
- redshift-role, which includes granular IAM permissions for Amazon Redshift.
A custom MSK cluster configuration, allowing the MSK Connect connector to create topics on the MSK cluster.
An MSK cluster, with three brokers deployed across the three private subnets of custom-vpc. The msk-cluster-sg security group and the custom-msk-cluster-configuration configuration are applied to the MSK cluster. The broker logs are delivered to the msk-cluster-logs CloudWatch log group.
A Redshift cluster subnet group, which is using the three private subnets of custom-vpc.
A Redshift cluster, with one single node deployed in a private subnet within the Redshift cluster subnet group. The redshift-sg security group and redshift-role IAM role are applied to the Redshift cluster.

Create an MSK Connect custom plugin

For this post, we use an Amazon MSK data generator deployed in MSK Connect, to generate mock customer data, and write it to an MSK topic.

Complete the following steps:

Download the Amazon MSK data generator JAR file with dependencies from GitHub.
Upload the JAR file into an S3 bucket in your AWS account.
On the Amazon MSK console, choose Custom plugins under MSK Connect in the navigation pane.
Choose Create custom plugin.
Choose Browse S3, search for the Amazon MSK data generator JAR file you uploaded to Amazon S3, then choose Choose.
For Custom plugin name, enter msk-datagen-plugin.
Choose Create custom plugin.

When the custom plugin is created, you will see that its status is Active, and you can move to the next step.
amazon msk console showing the msk connect custom plugin being successfully created

Create an MSK Connect connector

Complete the following steps to create your connector:

On the Amazon MSK console, choose Connectors under MSK Connect in the navigation pane.
Choose Create connector.
For Custom plugin type, choose Use existing plugin.
Select msk-datagen-plugin, then choose Next.
For Connector name, enter msk-datagen-connector.
For Cluster type, choose Self-managed Apache Kafka cluster.
For VPC, choose custom-vpc.
For Subnet 1, choose the private subnet within your first Availability Zone.

For the custom-vpc created by the CloudFormation template, we are using odd CIDR ranges for public subnets, and even CIDR ranges for the private subnets:

- The CIDRs for the public subnets are 10.10.1.0/24, 10.10.3.0/24, and 10.10.5.0/24
- The CIDRs for the private subnets are 10.10.2.0/24, 10.10.4.0/24, and 10.10.6.0/24

For Subnet 2, select the private subnet within your second Availability Zone.
For Subnet 3, select the private subnet within your third Availability Zone.
For Bootstrap servers, enter the list of bootstrap servers for TLS authentication of your MSK cluster.

To retrieve the bootstrap servers for your MSK cluster, navigate to the Amazon MSK console, choose Clusters, choose msk-cluster, then choose View client information. Copy the TLS values for the bootstrap servers.

For Security groups, choose Use specific security groups with access to this cluster, and choose msk-connect-sg.
For Connector configuration, replace the default settings with the following:

connector.class=com.amazonaws.mskdatagen.GeneratorSourceConnector
tasks.max=2
genkp.customer.with=#{Code.isbn10}
genv.customer.name.with=#{Name.full_name}
genv.customer.gender.with=#{Demographic.sex}
genv.customer.favorite_beer.with=#{Beer.name}
genv.customer.state.with=#{Address.state}
genkp.order.with=#{Code.isbn10}
genv.order.product_id.with=#{number.number_between '101','109'}
genv.order.quantity.with=#{number.number_between '1','5'}
genv.order.customer_id.matching=customer.key
global.throttle.ms=2000
global.history.records.max=1000
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

For Connector capacity, choose Provisioned.
For MCU count per worker, choose 1.
For Number of workers, choose 1.
For Worker configuration, choose Use the MSK default configuration.
For Access permissions, choose msk-connect-role.
Choose Next.
For Encryption, select TLS encrypted traffic.
Choose Next.
For Log delivery, choose Deliver to Amazon CloudWatch Logs.
Choose Browse, select msk-connect-logs, and choose Choose.
Choose Next.
Review and choose Create connector.

After the custom connector is created, you will see that its status is Running, and you can move to the next step.
amazon msk console showing the msk connect connector being successfully created

Configure Amazon Redshift streaming ingestion for Amazon MSK

Complete the following steps to set up streaming ingestion:

Connect to your Redshift cluster using Query Editor v2, and authenticate with the database user name awsuser, and password Awsuser123.
Create an external schema from Amazon MSK using the following SQL statement.

In the following code, enter the values for the redshift-role IAM role, and the msk-cluster cluster ARN.

CREATE EXTERNAL SCHEMA msk_external_schema
FROM MSK
IAM_ROLE '<insert your redshift-role arn>'
AUTHENTICATION iam
CLUSTER_ARN '<insert your msk-cluster arn>';

Choose Run to run the SQL statement.

redshift query editor v2 showing the SQL statement used to create an external schema from amazon msk

Create a materialized view using the following SQL statement:

CREATE MATERIALIZED VIEW msk_mview AUTO REFRESH YES AS
SELECT
    "kafka_partition",
    "kafka_offset",
    "kafka_timestamp_type",
    "kafka_timestamp",
    "kafka_key",
    JSON_PARSE(kafka_value) as Data,
    "kafka_headers"
FROM
    "dev"."msk_external_schema"."customer"

Choose Run to run the SQL statement.

redshift query editor v2 showing the SQL statement used to create a materialized view

You can now query the materialized view using the following SQL statement:

select * from msk_mview LIMIT 100;

Choose Run to run the SQL statement.

redshift query editor v2 showing the SQL statement used to query the materialized view

To monitor the progress of records loaded via streaming ingestion, you can take advantage of the SYS_STREAM_SCAN_STATES monitoring view using the following SQL statement:

select * from SYS_STREAM_SCAN_STATES;

Choose Run to run the SQL statement.

redshift query editor v2 showing the SQL statement used to query the sys stream scan states monitoring view

To monitor errors encountered on records loaded via streaming ingestion, you can take advantage of the SYS_STREAM_SCAN_ERRORS monitoring view using the following SQL statement:

select * from SYS_STREAM_SCAN_ERRORS;

Choose Run to run the SQL statement.

Clean up

After following along, if you no longer need the resources you created, delete them in the following order to prevent incurring additional charges:

Delete the MSK Connect connector msk-datagen-connector.
Delete the MSK Connect plugin msk-datagen-plugin.
Delete the Amazon MSK data generator JAR file you downloaded, and delete the S3 bucket you created.
After you delete your MSK Connect connector, you can delete the CloudFormation template. All the resources created by the CloudFormation template will be automatically deleted from your AWS account.

Conclusion

In this post, we demonstrated how to configure Amazon Redshift streaming ingestion from Amazon MSK, with a focus on privacy and security.

The combination of the ability of Amazon MSK to handle high throughput data streams with the robust analytical capabilities of Amazon Redshift empowers business to derive actionable insights promptly. This real-time data integration enhances the agility and responsiveness of organizations in understanding changing data trends, customer behaviors, and operational patterns. It allows for timely and informed decision-making, thereby gaining a competitive edge in today’s dynamic business landscape.

This solution is also applicable for customers that are looking to use Amazon MSK Serverless and Amazon Redshift Serverless.

We hope this post was a good opportunity to learn more about AWS service integration and configuration. Let us know your feedback in the comments section.

About the authors

Sebastian Vlad is a Senior Partner Solutions Architect with Amazon Web Services, with a passion for data and analytics solutions and customer success. Sebastian works with enterprise customers to help them design and build modern, secure, and scalable solutions to achieve their business outcomes.

Sharad Pai is a Lead Technical Consultant at AWS. He specializes in streaming analytics and helps customers build scalable solutions using Amazon MSK and Amazon Kinesis. He has over 16 years of industry experience and is currently working with media customers who are hosting live streaming platforms on AWS, managing peak concurrency of over 50 million. Prior to joining AWS, Sharad’s career as a lead software developer included 9 years of coding, working with open source technologies like JavaScript, Python, and PHP.

Combine transactional, streaming, and third-party data on Amazon Redshift for financial services

2024-02-01 Satesh Sonti

Post Syndicated from Satesh Sonti original https://aws.amazon.com/blogs/big-data/combine-transactional-streaming-and-third-party-data-on-amazon-redshift-for-financial-services/

Financial services customers are using data from different sources that originate at different frequencies, which includes real time, batch, and archived datasets. Additionally, they need streaming architectures to handle growing trade volumes, market volatility, and regulatory demands. The following are some of the key business use cases that highlight this need:

Trade reporting – Since the global financial crisis of 2007–2008, regulators have increased their demands and scrutiny on regulatory reporting. Regulators have placed an increased focus to both protect the consumer through transaction reporting (typically T+1, meaning 1 business day after the trade date) and increase transparency into markets via near-real-time trade reporting requirements.
Risk management – As capital markets become more complex and regulators launch new risk frameworks, such as Fundamental Review of the Trading Book (FRTB) and Basel III, financial institutions are looking to increase the frequency of calculations for overall market risk, liquidity risk, counter-party risk, and other risk measurements, and want to get as close to real-time calculations as possible.
Trade quality and optimization – In order to monitor and optimize trade quality, you need to continually evaluate market characteristics such as volume, direction, market depth, fill rate, and other benchmarks related to the completion of trades. Trade quality is not only related to broker performance, but is also a requirement from regulators, starting with MIFID II.

The challenge is to come up with a solution that can handle these disparate sources, varied frequencies, and low-latency consumption requirements. The solution should be scalable, cost-efficient, and straightforward to adopt and operate. Amazon Redshift features like streaming ingestion, Amazon Aurora zero-ETL integration, and data sharing with AWS Data Exchange enable near-real-time processing for trade reporting, risk management, and trade optimization.

In this post, we provide a solution architecture that describes how you can process data from three different types of sources—streaming, transactional, and third-party reference data—and aggregate them in Amazon Redshift for business intelligence (BI) reporting.

Solution overview

This solution architecture is created prioritizing a low-code/no-code approach with the following guiding principles:

Ease of use – It should be less complex to implement and operate with intuitive user interfaces
Scalable – You should be able to seamlessly increase and decrease capacity on demand
Native integration – Components should integrate without additional connectors or software
Cost-efficient – It should deliver balanced price/performance
Low maintenance – It should require less management and operational overhead

The following diagram illustrates the solution architecture and how these guiding principles were applied to the ingestion, aggregation, and reporting components.

Deploy the solution

You can use the following AWS CloudFormation template to deploy the solution.

This stack creates the following resources and necessary permissions to integrate the services:

Data stream – With Amazon Kinesis Data Streams, you can send data from your streaming source to a data stream to ingest the data into a Redshift data warehouse. Refer to Real-time analytics with Amazon Redshift streaming ingestion for information about configuring streaming ingestion.
Database cluster – For this solution, we use an Amazon Aurora MySQL-Compatible Edition 8.0 version cluster. This will be your OLTP data store for transactional data. You will set up zero-ETL integration for ingesting transaction data to the Redshift data warehouse. Refer to Getting started guide for near-real time operational analytics using Amazon Aurora zero-ETL integration with Amazon Redshift for instructions on creating the integration. The required parameter groups for source and target are already created as part of the CloudFormation stack.
Redshift cluster – This stack creates an Amazon Redshift Serverless workgroup and associated namespace. It also deploys a provisioned Redshift cluster. If you would like to work with Redshift Serverless, you can remove the provisioned cluster from the template and vice versa.
IAM role – The stack creates an AWS Identity and Access Management (IAM) role with required policies and trust relationships.
Network components – This includes a VPC, subnets, route table, and associations. You can customize these resources as per your organization’s rules.

Ingestion

To ingest data, you use Amazon Redshift Streaming Ingestion to load streaming data from the Kinesis data stream. For transactional data, you use the Redshift zero-ETL integration with Amazon Aurora MySQL. For third-party reference data, you take advantage of AWS Data Exchange data shares. These capabilities allow you to quickly build scalable data pipelines because you can increase the capacity of Kinesis Data Streams shards, compute for zero-ETL sources and targets, and Redshift compute for data shares when your data grows. Redshift streaming ingestion and zero-ETL integration are low-code/no-code solutions that you can build with simple SQLs without investing significant time and money into developing complex custom code.

For the data used to create this solution, we partnered with FactSet, a leading financial data, analytics, and open technology provider. FactSet has several datasets available in the AWS Data Exchange marketplace, which we used for reference data. We also used FactSet’s market data solutions for historical and streaming market quotes and trades.

Processing

Data is processed in Amazon Redshift adhering to an extract, load, and transform (ELT) methodology. With virtually unlimited scale and workload isolation, ELT is more suited for cloud data warehouse solutions.

You use Redshift streaming ingestion for real-time ingestion of streaming quotes (bid/ask) from the Kinesis data stream directly into a streaming materialized view and process the data in the next step using PartiQL for parsing the data stream inputs. Note that streaming materialized views differs from regular materialized views in terms of how auto refresh works and the data management SQL commands used. Refer to Streaming ingestion considerations for details.

You use the zero-ETL Aurora integration for ingesting transactional data (trades) from OLTP sources. Refer to Working with zero-ETL integrations for currently supported sources. You can combine data from all these sources using views, and use stored procedures to implement business transformation rules like calculating weighted averages across sectors and exchanges.

Historical trade and quote data volumes are huge and often not queried frequently. You can use Amazon Redshift Spectrum to access this data in place without loading it into Amazon Redshift. You create external tables pointing to data in Amazon Simple Storage Service (Amazon S3) and query similarly to how you query any other local table in Amazon Redshift. Multiple Redshift data warehouses can concurrently query the same datasets in Amazon S3 without the need to make copies of the data for each data warehouse. This feature simplifies accessing external data without writing complex ETL processes and enhances the ease of use of the overall solution.

Let’s review a few sample queries used for analyzing quotes and trades. We use the following tables in the sample queries:

dt_hist_quote – Historical quotes data containing bid price and volume, ask price and volume, and exchanges and sectors. You should use relevant datasets in your organization that contain these data attributes.
dt_hist_trades – Historical trades data containing traded price, volume, sector, and exchange details. You should use relevant datasets in your organization that contain these data attributes.
factset_sector_map – Mapping between sectors and exchanges. You can obtain this from the FactSet Fundamentals ADX dataset.

Sample query for analyzing historical quotes

You can use the following query to find weighted average spreads on quotes:

select
date_dt :: date,
case
when exchange_name like 'Cboe%' then 'CBOE'
when (exchange_name) like 'NYSE%' then 'NYSE'
when (exchange_name) like 'New York Stock Exchange' then 'NYSE'
when (exchange_name) like 'Nasdaq%' then 'NASDAQ'
end as parent_exchange_name,
sector_name,
sum(spread * weight)/sum(weight) :: decimal (30,5) as weighted_average_spread
from
(
select date_dt,exchange_name,
factset_sector_desc sector_name,
((bid_price*bid_volume) + (ask_price*ask_volume))as weight,
((ask_price - bid_price)/ask_price) as spread
from
dt_hist_quotes a
join
fds_adx_fundamentals_db.ref_v2.factset_sector_map b
on(a.sector_code = b.factset_sector_code)
where ask_price <> 0 and bid_price <> 0
)
group by 1,2,3

Sample query for analyzing historical trades

You can use the following query to find $-volume on trades by detailed exchange, by sector, and by major exchange (NYSE and Nasdaq):

select
cast(date_dt as date) as date_dt,
case
when exchange_name like 'Cboe%' then 'CBOE'
when (exchange_name) like 'NYSE%' then 'NYSE'
when (exchange_name) like 'New York Stock Exchange' then 'NYSE'
when (exchange_name) like 'Nasdaq%' then 'NASDAQ'
end as parent_exchange_name,
factset_sector_desc sector_name,
sum((price * volume):: decimal(30,4)) total_transaction_amt
from
dt_hist_trades a
join
fds_adx_fundamentals_db.ref_v2.factset_sector_map b
on(a.sector_code = b.factset_sector_code)
group by 1,2,3

Reporting

You can use Amazon QuickSight and Amazon Managed Grafana for BI and real-time reporting, respectively. These services natively integrate with Amazon Redshift without the need to use additional connectors or software in between.

You can run a direct query from QuickSight for BI reporting and dashboards. With QuickSight, you can also locally store data in the SPICE cache with auto refresh for low latency. Refer to Authorizing connections from Amazon QuickSight to Amazon Redshift clusters for comprehensive details on how to integrate QuickSight with Amazon Redshift.

You can use Amazon Managed Grafana for near-real-time trade dashboards that are refreshed every few seconds. The real-time dashboards for monitoring the trade ingestion latencies are created using Grafana and the data is sourced from system views in Amazon Redshift. Refer to Using the Amazon Redshift data source to learn about how to configure Amazon Redshift as a data source for Grafana.

The users who interact with regulatory reporting systems include analysts, risk managers, operators, and other personas that support business and technology operations. Apart from generating regulatory reports, these teams require visibility into the health of the reporting systems.

Historical quotes analysis

In this section, we explore some examples of historical quotes analysis from the Amazon QuickSight dashboard.

Weighted average spread by sectors

The following chart shows the daily aggregation by sector of the weighted average bid-ask spreads of all the individual trades on NASDAQ and NYSE for 3 months. To calculate the average daily spread, each spread is weighted by the sum of the bid and the ask dollar volume. The query to generate this chart processes 103 billion of data points in total, joins each trade with the sector reference table, and runs in less than 10 seconds.

Weighted average spread by exchanges

The following chart shows the daily aggregation of the weighted average bid-ask spreads of all the individual trades on NASDAQ and NYSE for 3 months. The calculation methodology and query performance metrics are similar to those of the preceding chart.

Historical trades analysis

In this section, we explore some examples of historical trades analysis from the Amazon QuickSight dashboard.

Trade volumes by sector

The following chart shows the daily aggregation by sector of all the individual trades on NASDAQ and NYSE for 3 months. The query to generate this chart processes 3.6 billion of trades in total, joins each trade with the sector reference table, and runs in under 5 seconds.

Trade volumes for major exchanges

The following chart shows the daily aggregation by exchange group of all the individual trades for 3 months. The query to generate this chart has similar performance metrics as the preceding chart.

Real-time dashboards

Monitoring and observability is an important requirement for any critical business application such as trade reporting, risk management, and trade management systems. Apart from system-level metrics, it’s also important to monitor key performance indicators in real time so that operators can be alerted and respond as soon as possible to business-impacting events. For this demonstration, we have built dashboards in Grafana that monitor the delay of quote and trade data from the Kinesis data stream and Aurora, respectively.

The quote ingestion delay dashboard shows the amount of time it takes for each quote record to be ingested from the data stream and be available for querying in Amazon Redshift.

The trade ingestion delay dashboard shows the amount of time it takes for a transaction in Aurora to become available in Amazon Redshift for querying.

Clean up

To clean up your resources, delete the stack you deployed using AWS CloudFormation. For instructions, refer to Deleting a stack on the AWS CloudFormation console.

Conclusion

Increasing volumes of trading activity, more complex risk management, and enhanced regulatory requirements are leading capital markets firms to embrace real-time and near-real-time data processing, even in mid- and back-office platforms where end of day and overnight processing was the standard. In this post, we demonstrated how you can use Amazon Redshift capabilities for ease of use, low maintenance, and cost-efficiency. We also discussed cross-service integrations to ingest streaming market data, process updates from OLTP databases, and use third-party reference data without having to perform complex and expensive ETL or ELT processing before making the data available for analysis and reporting.

Please reach out to us if you need any guidance in implementing this solution. Refer to Real-time analytics with Amazon Redshift streaming ingestion, Getting started guide for near-real time operational analytics using Amazon Aurora zero-ETL integration with Amazon Redshift, and Working with AWS Data Exchange data shares as a producer for more information.

About the Authors

Satesh Sonti is a Sr. Analytics Specialist Solutions Architect based out of Atlanta, specialized in building enterprise data platforms, data warehousing, and analytics solutions. He has over 18 years of experience in building data assets and leading complex data platform programs for banking and insurance clients across the globe.

Alket Memushaj works as a Principal Architect in the Financial Services Market Development team at AWS. Alket is responsible for technical strategy for capital markets, working with partners and customers to deploy applications across the trade lifecycle to the AWS Cloud, including market connectivity, trading systems, and pre- and post-trade analytics and research platforms.

Ruben Falk is a Capital Markets Specialist focused on AI and data & analytics. Ruben consults with capital markets participants on modern data architecture and systematic investment processes. He joined AWS from S&P Global Market Intelligence where he was Global Head of Investment Management Solutions.

Jeff Wilson is a World-wide Go-to-market Specialist with 15 years of experience working with analytic platforms. His current focus is sharing the benefits of using Amazon Redshift, Amazon’s native cloud data warehouse. Jeff is based in Florida and has been with AWS since 2019.

Mastering market dynamics: Transforming transaction cost analytics with ultra-precise Tick History – PCAP and Amazon Athena for Apache Spark

2024-01-31 Pramod Nayak

Post Syndicated from Pramod Nayak original https://aws.amazon.com/blogs/big-data/mastering-market-dynamics-transforming-transaction-cost-analytics-with-ultra-precise-tick-history-pcap-and-amazon-athena-for-apache-spark/

This post is cowritten with Pramod Nayak, LakshmiKanth Mannem and Vivek Aggarwal from the Low Latency Group of LSEG.

Transaction cost analysis (TCA) is widely used by traders, portfolio managers, and brokers for pre-trade and post-trade analysis, and helps them measure and optimize transaction costs and the effectiveness of their trading strategies. In this post, we analyze options bid-ask spreads from the LSEG Tick History – PCAP dataset using Amazon Athena for Apache Spark. We show you how to access data, define custom functions to apply on data, query and filter the dataset, and visualize the results of the analysis, all without having to worry about setting up infrastructure or configuring Spark, even for large datasets.

Background

Options Price Reporting Authority (OPRA) serves as a crucial securities information processor, collecting, consolidating, and disseminating last sale reports, quotes, and pertinent information for US Options. With 18 active US Options exchanges and over 1.5 million eligible contracts, OPRA plays a pivotal role in providing comprehensive market data.

On February 5, 2024, the Securities Industry Automation Corporation (SIAC) is set to upgrade the OPRA feed from 48 to 96 multicast channels. This enhancement aims to optimize symbol distribution and line capacity utilization in response to escalating trading activity and volatility in the US options market. SIAC has recommended that firms prepare for peak data rates of up to 37.3 GBits per second.

Despite the upgrade not immediately altering the total volume of published data, it enables OPRA to disseminate data at a significantly faster rate. This transition is crucial for addressing the demands of the dynamic options market.

OPRA stands out as one the most voluminous feeds, with a peak of 150.4 billion messages in a single day in Q3 2023 and a capacity headroom requirement of 400 billion messages over a single day. Capturing every single message is critical for transaction cost analytics, market liquidity monitoring, trading strategy evaluation, and market research.

About the data

LSEG Tick History – PCAP is a cloud-based repository, exceeding 30 PB, housing ultra-high-quality global market data. This data is meticulously captured directly within the exchange data centers, employing redundant capture processes strategically positioned in major primary and backup exchange data centers worldwide. LSEG’s capture technology ensures lossless data capture and uses a GPS time-source for nanosecond timestamp precision. Additionally, sophisticated data arbitrage techniques are employed to seamlessly fill any data gaps. Subsequent to capture, the data undergoes meticulous processing and arbitration, and is then normalized into Parquet format using LSEG’s Real Time Ultra Direct (RTUD) feed handlers.

The normalization process, which is integral to preparing the data for analysis, generates up to 6 TB of compressed Parquet files per day. The massive volume of data is attributed to the encompassing nature of OPRA, spanning multiple exchanges, and featuring numerous options contracts characterized by diverse attributes. Increased market volatility and market making activity on the options exchanges further contribute to the volume of data published on OPRA.

The attributes of Tick History – PCAP enable firms to conduct various analyses, including the following:

Pre-trade analysis – Evaluate potential trade impact and explore different execution strategies based on historical data
Post-trade evaluation – Measure actual execution costs against benchmarks to assess the performance of execution strategies
Optimized execution – Fine-tune execution strategies based on historical market patterns to minimize market impact and reduce overall trading costs
Risk management – Identify slippage patterns, identify outliers, and proactively manage risks associated with trading activities
Performance attribution – Separate the impact of trading decisions from investment decisions when analyzing portfolio performance

The LSEG Tick History – PCAP dataset is available in AWS Data Exchange and can be accessed on AWS Marketplace. With AWS Data Exchange for Amazon S3, you can access PCAP data directly from LSEG’s Amazon Simple Storage Service (Amazon S3) buckets, eliminating the need for firms to store their own copy of the data. This approach streamlines data management and storage, providing clients immediate access to high-quality PCAP or normalized data with ease of use, integration, and substantial data storage savings.

Athena for Apache Spark

For analytical endeavors, Athena for Apache Spark offers a simplified notebook experience accessible through the Athena console or Athena APIs, allowing you to build interactive Apache Spark applications. With an optimized Spark runtime, Athena helps the analysis of petabytes of data by dynamically scaling the number of Spark engines is less than a second. Moreover, common Python libraries such as pandas and NumPy are seamlessly integrated, allowing for the creation of intricate application logic. The flexibility extends to the importation of custom libraries for use in notebooks. Athena for Spark accommodates most open-data formats and is seamlessly integrated with the AWS Glue Data Catalog.

Dataset

For this analysis, we used the LSEG Tick History – PCAP OPRA dataset from May 17, 2023. This dataset comprises the following components:

Best bid and offer (BBO) – Reports the highest bid and lowest ask for a security at a given exchange
National best bid and offer (NBBO) – Reports the highest bid and lowest ask for a security across all exchanges
Trades – Records completed trades across all exchanges

The dataset involves the following data volumes:

Trades – 160 MB distributed across approximately 60 compressed Parquet files
BBO – 2.4 TB distributed across approximately 300 compressed Parquet files
NBBO – 2.8 TB distributed across approximately 200 compressed Parquet files

Analysis overview

Analyzing OPRA Tick History data for Transaction Cost Analysis (TCA) involves scrutinizing market quotes and trades around a specific trade event. We use the following metrics as part of this study:

Quoted spread (QS) – Calculated as the difference between the BBO ask and the BBO bid
Effective spread (ES) – Calculated as the difference between the trade price and the midpoint of the BBO (BBO bid + (BBO ask – BBO bid)/2)
Effective/quoted spread (EQF) – Calculated as (ES / QS) * 100

We calculate these spreads before the trade and additionally at four intervals after the trade (just after, 1 second, 10 seconds, and 60 seconds after the trade).

Configure Athena for Apache Spark

To configure Athena for Apache Spark, complete the following steps:

On the Athena console, under Get started, select Analyze your data using PySpark and Spark SQL.
If this is your first time using Athena Spark, choose Create workgroup.
For Workgroup name¸ enter a name for the workgroup, such as tca-analysis.
In the Analytics engine section, select Apache Spark.
In the Additional configurations section, you can choose Use defaults or provide a custom AWS Identity and Access Management (IAM) role and Amazon S3 location for calculation results.
Choose Create workgroup.
After you create the workgroup, navigate to the Notebooks tab and choose Create notebook.
Enter a name for your notebook, such as tca-analysis-with-tick-history.
Choose Create to create your notebook.

Launch your notebook

If you have already created a Spark workgroup, select Launch notebook editor under Get started.

After your notebook is created, you will be redirected to the interactive notebook editor.

Now we can add and run the following code to our notebook.

Create an analysis

Complete the following steps to create an analysis:

Import common libraries:

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

Create our data frames for BBO, NBBO, and trades:

bbo_quote = spark.read.parquet(f"s3://<bucket>/mt=bbo_quote/f=opra/dt=2023-05-17/*")
bbo_quote.createOrReplaceTempView("bbo_quote")
nbbo_quote = spark.read.parquet(f"s3://<bucket>/mt=nbbo_quote/f=opra/dt=2023-05-17/*")
nbbo_quote.createOrReplaceTempView("nbbo_quote")
trades = spark.read.parquet(f"s3://<bucket>/mt=trade/f=opra/dt=2023-05-17/29_1.parquet")
trades.createOrReplaceTempView("trades")

Now we can identify a trade to use for transaction cost analysis:

filtered_trades = spark.sql("select Product, Price,Quantity, ReceiptTimestamp, MarketParticipant from trades")

We get the following output:

+---------------------+---------------------+---------------------+-------------------+-----------------+ 
|Product |Price |Quantity |ReceiptTimestamp |MarketParticipant| 
+---------------------+---------------------+---------------------+-------------------+-----------------+ 
|QQQ 230518C00329000|1.1700000000000000000|10.0000000000000000000|1684338565538021907,NYSEArca|
|QQQ 230518C00329000|1.1700000000000000000|20.0000000000000000000|1684338576071397557,NASDAQOMXPHLX|
|QQQ 230518C00329000|1.1600000000000000000|1.0000000000000000000|1684338579104713924,ISE|
|QQQ 230518C00329000|1.1400000000000000000|1.0000000000000000000|1684338580263307057,NASDAQOMXBX_Options|
|QQQ 230518C00329000|1.1200000000000000000|1.0000000000000000000|1684338581025332599,ISE|
+---------------------+---------------------+---------------------+-------------------+-----------------+

We use the highlighted trade information going forward for the trade product (tp), trade price (tpr), and trade time (tt).

Here we create a number of helper functions for our analysis

def calculate_es_qs_eqf(df, trade_price):
    df['BidPrice'] = df['BidPrice'].astype('double')
    df['AskPrice'] = df['AskPrice'].astype('double')
    df["ES"] = ((df["AskPrice"]-df["BidPrice"])/2) - trade_price
    df["QS"] = df["AskPrice"]-df["BidPrice"]
    df["EQF"] = (df["ES"]/df["QS"])*100
    return df

def get_trade_before_n_seconds(trade_time, df, seconds=0, groupby_col = None):
    nseconds=seconds*1000000000
    nseconds += trade_time
    ret_df = df[df['ReceiptTimestamp'] < nseconds].groupby(groupby_col).last()
    ret_df['BidPrice'] = ret_df['BidPrice'].astype('double')
    ret_df['AskPrice'] = ret_df['AskPrice'].astype('double')
    ret_df = ret_df.reset_index()
    return ret_df

def get_trade_after_n_seconds(trade_time, df, seconds=0, groupby_col = None):
    nseconds=seconds*1000000000
    nseconds += trade_time
    ret_df = df[df['ReceiptTimestamp'] > nseconds].groupby(groupby_col).first()
    ret_df['BidPrice'] = ret_df['BidPrice'].astype('double')
    ret_df['AskPrice'] = ret_df['AskPrice'].astype('double')
    ret_df = ret_df.reset_index()
    return ret_df

def get_nbbo_trade_before_n_seconds(trade_time, df, seconds=0):
    nseconds=seconds*1000000000
    nseconds += trade_time
    ret_df = df[df['ReceiptTimestamp'] < nseconds].iloc[-1:]
    ret_df['BidPrice'] = ret_df['BidPrice'].astype('double')
    ret_df['AskPrice'] = ret_df['AskPrice'].astype('double')
    return ret_df

def get_nbbo_trade_after_n_seconds(trade_time, df, seconds=0):
    nseconds=seconds*1000000000
    nseconds += trade_time
    ret_df = df[df['ReceiptTimestamp'] > nseconds].iloc[:1]
    ret_df['BidPrice'] = ret_df['BidPrice'].astype('double')
    ret_df['AskPrice'] = ret_df['AskPrice'].astype('double')
    return ret_df

In the following function, we create the dataset that contains all the quotes before and after the trade. Athena Spark automatically determines how many DPUs to launch for processing our dataset.

def get_tca_analysis_via_df_single_query(trade_product, trade_price, trade_time):
    # BBO quotes
    bbos = spark.sql(f"SELECT Product, ReceiptTimestamp, AskPrice, BidPrice, MarketParticipant FROM bbo_quote where Product = '{trade_product}';")
    bbos = bbos.toPandas()

    bbo_just_before = get_trade_before_n_seconds(trade_time, bbos, seconds=0, groupby_col='MarketParticipant')
    bbo_just_after = get_trade_after_n_seconds(trade_time, bbos, seconds=0, groupby_col='MarketParticipant')
    bbo_1s_after = get_trade_after_n_seconds(trade_time, bbos, seconds=1, groupby_col='MarketParticipant')
    bbo_10s_after = get_trade_after_n_seconds(trade_time, bbos, seconds=10, groupby_col='MarketParticipant')
    bbo_60s_after = get_trade_after_n_seconds(trade_time, bbos, seconds=60, groupby_col='MarketParticipant')
    
    all_bbos = pd.concat([bbo_just_before, bbo_just_after, bbo_1s_after, bbo_10s_after, bbo_60s_after], ignore_index=True, sort=False)
    bbos_calculated = calculate_es_qs_eqf(all_bbos, trade_price)

    #NBBO quotes
    nbbos = spark.sql(f"SELECT Product, ReceiptTimestamp, AskPrice, BidPrice, BestBidParticipant, BestAskParticipant FROM nbbo_quote where Product = '{trade_product}';")
    nbbos = nbbos.toPandas()

    nbbo_just_before = get_nbbo_trade_before_n_seconds(trade_time,nbbos, seconds=0)
    nbbo_just_after = get_nbbo_trade_after_n_seconds(trade_time, nbbos, seconds=0)
    nbbo_1s_after = get_nbbo_trade_after_n_seconds(trade_time, nbbos, seconds=1)
    nbbo_10s_after = get_nbbo_trade_after_n_seconds(trade_time, nbbos, seconds=10)
    nbbo_60s_after = get_nbbo_trade_after_n_seconds(trade_time, nbbos, seconds=60)

    all_nbbos = pd.concat([nbbo_just_before, nbbo_just_after, nbbo_1s_after, nbbo_10s_after, nbbo_60s_after], ignore_index=True, sort=False)
    nbbos_calculated = calculate_es_qs_eqf(all_nbbos, trade_price)

    calc = pd.concat([bbos_calculated, nbbos_calculated], ignore_index=True, sort=False)
    
    return calc

Now let’s call the TCA analysis function with the information from our selected trade:

tp = "QQQ 230518C00329000"
tpr = 1.16
tt = 1684338579104713924
c = get_tca_analysis_via_df_single_query(tp, tpr, tt)

Visualize the analysis results

Now let’s create the data frames we use for our visualization. Each data frame contains quotes for one of the five time intervals for each data feed (BBO, NBBO):

bbo = c[c['MarketParticipant'].isin(['BBO'])]
bbo_bef = bbo[bbo['ReceiptTimestamp'] < tt]
bbo_aft_0 = bbo[bbo['ReceiptTimestamp'].between(tt,tt+1000000000)]
bbo_aft_1 = bbo[bbo['ReceiptTimestamp'].between(tt+1000000000,tt+10000000000)]
bbo_aft_10 = bbo[bbo['ReceiptTimestamp'].between(tt+10000000000,tt+60000000000)]
bbo_aft_60 = bbo[bbo['ReceiptTimestamp'] > (tt+60000000000)]

nbbo = c[~c['MarketParticipant'].isin(['BBO'])]
nbbo_bef = nbbo[nbbo['ReceiptTimestamp'] < tt]
nbbo_aft_0 = nbbo[nbbo['ReceiptTimestamp'].between(tt,tt+1000000000)]
nbbo_aft_1 = nbbo[nbbo['ReceiptTimestamp'].between(tt+1000000000,tt+10000000000)]
nbbo_aft_10 = nbbo[nbbo['ReceiptTimestamp'].between(tt+10000000000,tt+60000000000)]
nbbo_aft_60 = nbbo[nbbo['ReceiptTimestamp'] > (tt+60000000000)]

In the following sections, we provide example code to create different visualizations.

Plot QS and NBBO before the trade

Use the following code to plot the quoted spread and NBBO before the trade:

fig = px.bar(title="Quoted Spread Before The Trade",
    x=bbo_bef.MarketParticipant,
    y=bbo_bef['QS'],
    labels={'x': 'Market', 'y':'Quoted Spread'})
fig.add_hline(y=nbbo_bef.iloc[0]['QS'],
    line_width=1, line_dash="dash", line_color="red",
    annotation_text="NBBO", annotation_font_color="red")
%plotly fig

Plot QS for each market and NBBO after the trade

Use the following code to plot the quoted spread for each market and NBBO immediately after the trade:

fig = px.bar(title="Quoted Spread After The Trade",
    x=bbo_aft_0.MarketParticipant,
    y=bbo_aft_0['QS'],
    labels={'x': 'Market', 'y':'Quoted Spread'})
fig.add_hline(
    y=nbbo_aft_0.iloc[0]['QS'],
    line_width=1, line_dash="dash", line_color="red",
    annotation_text="NBBO", annotation_font_color="red")
%plotly fig

Plot QS for each time interval and each market for BBO

Use the following code to plot the quoted spread for each time interval and each market for BBO:

fig = go.Figure(data=[
    go.Bar(name="before trade", x=bbo_bef.MarketParticipant.unique(), y=bbo_bef['QS']),
    go.Bar(name="0s after trade", x=bbo_aft_0.MarketParticipant.unique(), y=bbo_aft_0['QS']),
    go.Bar(name="1s after trade", x=bbo_aft_1.MarketParticipant.unique(), y=bbo_aft_1['QS']),
    go.Bar(name="10s after trade", x=bbo_aft_10.MarketParticipant.unique(), y=bbo_aft_10['QS']),
    go.Bar(name="60s after trade", x=bbo_aft_60.MarketParticipant.unique(), y=bbo_aft_60['QS'])])
fig.update_layout(barmode='group',title="BBO Quoted Spread Per Market/TimeFrame",
    xaxis={'title':'Market'},
    yaxis={'title':'Quoted Spread'})
%plotly fig

Plot ES for each time interval and market for BBO

Use the following code to plot the effective spread for each time interval and market for BBO:

fig = go.Figure(data=[
    go.Bar(name="before trade", x=bbo_bef.MarketParticipant.unique(), y=bbo_bef['ES']),
    go.Bar(name="0s after trade", x=bbo_aft_0.MarketParticipant.unique(), y=bbo_aft_0['ES']),
    go.Bar(name="1s after trade", x=bbo_aft_1.MarketParticipant.unique(), y=bbo_aft_1['ES']),
    go.Bar(name="10s after trade", x=bbo_aft_10.MarketParticipant.unique(), y=bbo_aft_10['ES']),
    go.Bar(name="60s after trade", x=bbo_aft_60.MarketParticipant.unique(), y=bbo_aft_60['ES'])])
fig.update_layout(barmode='group',title="BBO Effective Spread Per Market/TimeFrame",
    xaxis={'title':'Market'}, 
    yaxis={'title':'Effective Spread'})
%plotly fig

Plot EQF for each time interval and market for BBO

Use the following code to plot the effective/quoted spread for each time interval and market for BBO:

fig = go.Figure(data=[
    go.Bar(name="before trade", x=bbo_bef.MarketParticipant.unique(), y=bbo_bef['EQF']),
    go.Bar(name="0s after trade", x=bbo_aft_0.MarketParticipant.unique(), y=bbo_aft_0['EQF']),
    go.Bar(name="1s after trade", x=bbo_aft_1.MarketParticipant.unique(), y=bbo_aft_1['EQF']),
    go.Bar(name="10s after trade", x=bbo_aft_10.MarketParticipant.unique(), y=bbo_aft_10['EQF']),
    go.Bar(name="60s after trade", x=bbo_aft_60.MarketParticipant.unique(), y=bbo_aft_60['EQF'])])
fig.update_layout(barmode='group',title="BBO Effective/Quoted Spread Per Market/TimeFrame",
    xaxis={'title':'Market'}, 
    yaxis={'title':'Effective/Quoted Spread'})
%plotly fig

Athena Spark calculation performance

When you run a code block, Athena Spark automatically determines how many DPUs it requires to complete the calculation. In the last code block, where we call the tca_analysis function, we are actually instructing Spark to process the data, and we then convert the resulting Spark dataframes into Pandas dataframes. This constitutes the most intensive processing part of the analysis, and when Athena Spark runs this block, it shows the progress bar, elapsed time, and how many DPUs are processing data currently. For example, in the following calculation, Athena Spark is utilizing 18 DPUs.

When you configure your Athena Spark notebook, you have the option of setting the maximum number of DPUs that it can use. The default is 20 DPUs, but we tested this calculation with 10, 20, and 40 DPUs to demonstrate how Athena Spark automatically scales to run our analysis. We observed that Athena Spark scales linearly, taking 15 minutes and 21 seconds when the notebook was configured with a maximum of 10 DPUs, 8 minutes and 23 seconds when the notebook was configured with 20 DPUs, and 4 minutes and 44 seconds when the notebook was configured with 40 DPUs. Because Athena Spark charges based on DPU usage, at a per-second granularity, the cost of these calculations is similar, but if you set a higher maximum DPU value, Athena Spark can return the result of the analysis much faster. For more details on Athena Spark pricing please click here.

Conclusion

In this post, we demonstrated how you can use high-fidelity OPRA data from LSEG’s Tick History-PCAP to perform transaction cost analytics using Athena Spark. The availability of OPRA data in a timely manner, complemented with accessibility innovations of AWS Data Exchange for Amazon S3, strategically reduces the time to analytics for firms looking to create actionable insights for critical trading decisions. OPRA generates about 7 TB of normalized Parquet data each day, and managing the infrastructure to provide analytics based on OPRA data is challenging.

Athena’s scalability in handling large-scale data processing for Tick History – PCAP for OPRA data makes it a compelling choice for organizations seeking swift and scalable analytics solutions in AWS. This post shows the seamless interaction between the AWS ecosystem and Tick History-PCAP data and how financial institutions can take advantage of this synergy to drive data-driven decision-making for critical trading and investment strategies.

About the Authors

Pramod Nayak is the Director of Product Management of the Low Latency Group at LSEG. Pramod has over 10 years of experience in the financial technology industry, focusing on software development, analytics, and data management. Pramod is a former software engineer and passionate about market data and quantitative trading.

LakshmiKanth Mannem is a Product Manager in the Low Latency Group of LSEG. He focuses on data and platform products for the low-latency market data industry. LakshmiKanth helps customers build the most optimal solutions for their market data needs.

Vivek Aggarwal is a Senior Data Engineer in the Low Latency Group of LSEG. Vivek works on developing and maintaining data pipelines for processing and delivery of captured market data feeds and reference data feeds.

Alket Memushaj is a Principal Architect in the Financial Services Market Development team at AWS. Alket is responsible for technical strategy, working with partners and customers to deploy even the most demanding capital markets workloads to the AWS Cloud.

AWS Clean Rooms ML helps customers and partners apply ML models without sharing raw data (preview)

2023-11-29 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-ml-helps-customers-and-partners-apply-ml-models-without-sharing-raw-data-preview/

Today, we’re introducing AWS Clean Rooms ML (preview), a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. With this new capability, you can generate predictive insights using ML models while continuing to protect your sensitive data.

During this preview, AWS Clean Rooms ML introduces its ﬁrst model specialized to help companies create lookalike segments for marketing use cases. With AWS Clean Rooms ML lookalike, you can train your own custom model, and you can invite partners to bring a small sample of their records to collaborate and generate an expanded set of similar records while protecting everyone’s underlying data.

In the coming months, AWS Clean Rooms ML will release a healthcare model. This will be the first of many models that AWS Clean Rooms ML will support next year.

AWS Clean Rooms ML helps you to unlock various opportunities for you to generate insights. For example:

Airlines can take signals about loyal customers, collaborate with online booking services, and offer promotions to users with similar characteristics.
Auto lenders and car insurers can identify prospective auto insurance customers who share characteristics with a set of existing lease owners.
Brands and publishers can model lookalike segments of in-market customers and deliver highly relevant advertising experiences.
Research institutions and hospital networks can find candidates similar to existing clinical trial participants to accelerate clinical studies (coming soon).

AWS Clean Rooms ML lookalike modeling helps you apply an AWS managed, ready-to-use model that is trained in each collaboration to generate lookalike datasets in a few clicks, saving months of development work to build, train, tune, and deploy your own model.

How to use AWS Clean Rooms ML to generate predictive insights
Today I will show you how to use lookalike modeling in AWS Clean Rooms ML and assume you have already set up a data collaboration with your partner. If you want to learn how to do that, check out the AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data post.

With your collective data in the AWS Clean Rooms collaboration, you can work with your partners to apply ML lookalike modeling to generate a lookalike segment. It works by taking a small sample of representative records from your data, creating a machine learning (ML) model, then applying the particular model to identify an expanded set of similar records from your business partner’s data.

The following screenshot shows the overall workflow for using AWS Clean Rooms ML.

By using AWS Clean Rooms ML, you don’t need to build complex and time-consuming ML models on your own. AWS Clean Rooms ML trains a custom, private ML model, which saves months of your time while still protecting your data.

Eliminating the need to share data
As ML models are natively built within the service, AWS Clean Rooms ML helps you protect your dataset and customer’s information because you don’t need to share your data to build your ML model.

You can specify the training dataset using the AWS Glue Data Catalog table, which contains user-item interactions.

Under Additional columns to train, you can define numerical and categorical data. This is useful if you need to add more features to your dataset, such as the number of seconds spent watching a video, the topic of an article, or the product category of an e-commerce item.

Applying custom-trained AWS-built models
Once you have defined your training dataset, you can now create a lookalike model. A lookalike model is a machine learning model used to find similar profiles in your partner’s dataset without either party having to share their underlying data with each other.

When creating a lookalike model, you need to specify the training dataset. From a single training dataset, you can create many lookalike models. You also have the flexibility to define the date window in your training dataset using Relative range or Absolute range. This is useful when you have data that is constantly updated within AWS Glue, such as articles read by users.

Easy-to-tune ML models
After you create a lookalike model, you need to configure it to use in AWS Clean Rooms collaboration. AWS Clean Rooms ML provides flexible controls that enable you and your partners to tune the results of the applied ML model to garner predictive insights.

On the Configure lookalike model page, you can choose which Lookalike model you want to use and define the Minimum matching seed size you need. This seed size deﬁnes the minimum number of profiles in your seed data that overlap with profiles in the training data.

You also have the flexibility to choose whether the partner in your collaboration receives metrics in Metrics to share with other members.

With your lookalike models properly configured, you can now make the ML models available for your partners by associating the configured lookalike model with a collaboration.

Creating lookalike segments
Once the lookalike models have been associated, your partners can now start generating insights by selecting Create lookalike segment and choosing the associated lookalike model for your collaboration.

Here on the Create lookalike segment page, your partners need to provide the Seed profiles. Examples of seed profiles include your top customers or all customers who purchased a specific product. The resulting lookalike segment will contain profiles from the training data that are most similar to the profiles from the seed.

Lastly, your partner will get the Relevance metrics as the result of the lookalike segment using the ML models. At this stage, you can use the Score to make a decision.

Export data and use programmatic API
You also have the option to export the lookalike segment data. Once it’s exported, the data is available in JSON format and you can process this output by integrating with AWS Clean Rooms API and your applications.

Join the preview
AWS Clean Rooms ML is now in preview and available via AWS Clean Rooms in US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London). Support for additional models is in the works.

Learn how to apply machine learning with your partners without sharing underlying data on the AWS Clean Rooms ML page.

Happy collaborating!
— Donnie

Prepare your AWS workloads for the “Operational risks and resilience – banks” FINMA Circular

2023-10-31 Margo Cronin

Post Syndicated from Margo Cronin original https://aws.amazon.com/blogs/security/prepare-your-aws-workloads-for-the-operational-risks-and-resilience-banks-finma-circular/

In December 2022, FINMA, the Swiss Financial Market Supervisory Authority, announced a fully revised circular called Operational risks and resilience – banks that will take effect on January 1, 2024. The circular will replace the Swiss Bankers Association’s Recommendations for Business Continuity Management (BCM), which is currently recognized as a minimum standard. The new circular also adopts the revised principles for managing operational risks, and the new principles on operational resilience, that the Basel Committee on Banking Supervision published in March 2021.

In this blog post, we share key considerations for AWS customers and regulated financial institutions to help them prepare for, and align to, the new circular.

AWS previously announced the publication of the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland. The guide refers to certain rules applicable to financial institutions in Switzerland, including banks, insurance companies, stock exchanges, securities dealers, portfolio managers, trustees, and other financial entities that FINMA oversees (directly or indirectly).

FINMA has previously issued the following circulars to help regulated financial institutions understand approaches to due diligence, third party management, and key technical and organizational controls to be implemented in cloud outsourcing arrangements, particularly for material workloads:

2018/03 FINMA Circular Outsourcing – banks and insurers (31.10.2019)
2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Principal 4 Technology Infrastructure
2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Appendix 3 Handling of electronic Client Identifying Data
2013/03 Auditing (04.11.2020) – Information Technology (21.04.2020)
BCM minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

Operational risk management: Critical data

The circular defines critical data as follows:

“Critical data are data that, in view of the institution’s size, complexity, structure, risk profile and business model, are of such crucial significance that they require increased security measures. These are data that are crucial for the successful and sustainable provision of the institution’s services or for regulatory purposes. When assessing and determining the criticality of data, the confidentiality as well as the integrity and availability must be taken into account. Each of these three aspects can determine whether data is classified as critical.”

This definition is consistent with the AWS approach to privacy and security. We believe that for AWS to realize its full potential, customers must have control over their data. This includes the following commitments:

Control over the location of your data
Verifiable control over data access
Ability to encrypt everything everywhere
Resilience of AWS

These commitments further demonstrate our dedication to securing your data: it’s our highest priority. We implement rigorous contractual, technical, and organizational measures to help protect the confidentiality, integrity, and availability of your content regardless of which AWS Region you select. You have complete control over your content through powerful AWS services and tools that you can use to determine where to store your data, how to secure it, and who can access it.

You also have control over the location of your content on AWS. For example, in Europe, at the time of publication of this blog post, customers can deploy their data into any of eight Regions (for an up-to-date list of Regions, see AWS Global Infrastructure). One of these Regions is the Europe (Zurich) Region, also known by its API name ‘eu-central-2’, which customers can use to store data in Switzerland. Additionally, Swiss customers can rely on the terms of the AWS Swiss Addendum to the AWS Data Processing Addendum (DPA), which applies automatically when Swiss customers use AWS services to process personal data under the new Federal Act on Data Protection (nFADP).

AWS continually monitors the evolving privacy, regulatory, and legislative landscape to help identify changes and determine what tools our customers might need to meet their compliance requirements. Maintaining customer trust is an ongoing commitment. We strive to inform you of the privacy and security policies, practices, and technologies that we’ve put in place. Our commitments, as described in the Data Privacy FAQ, include the following:

Access – As a customer, you maintain full control of your content that you upload to the AWS services under your AWS account, and responsibility for configuring access to AWS services and resources. We provide an advanced set of access, encryption, and logging features to help you do this effectively (for example, AWS Identity and Access Management, AWS Organizations, and AWS CloudTrail). We provide APIs that you can use to configure access control permissions for the services that you develop or deploy in an AWS environment. We never use your content or derive information from it for marketing or advertising purposes.
Storage – You choose the AWS Regions in which your content is stored. You can replicate and back up your content in more than one Region. We will not move or replicate your content outside of your chosen AWS Regions except as agreed with you.
Security – You choose how your content is secured. We offer you industry-leading encryption features to protect your content in transit and at rest, and we provide you with the option to manage your own encryption keys. These data protection features include:
- Data encryption capabilities available in over 100 AWS services.
- Flexible key management options using AWS Key Management Service (AWS KMS), so you can choose whether to have AWS manage your encryption keys or to keep complete control over your keys.
Disclosure of customer content – We will not disclose customer content unless we’re required to do so to comply with the law or a binding order of a government body. If a governmental body sends AWS a demand for your customer content, we will attempt to redirect the governmental body to request that data directly from you. If compelled to disclose your customer content to a governmental body, we will give you reasonable notice of the demand to allow the customer to seek a protective order or other appropriate remedy, unless AWS is legally prohibited from doing so.
Security assurance – We have developed a security assurance program that uses current recommendations for global privacy and data protection to help you operate securely on AWS, and to make the best use of our security control environment. These security protections and control processes are independently validated by multiple third-party independent assessments, including the FINMA International Standard on Assurance Engagements (ISAE) 3000 Type II attestation report.

Additionally, FINMA guidelines lay out requirements for the written agreement between a Swiss financial institution and its service provider, including access and audit rights. For Swiss financial institutions that run regulated workloads on AWS, we offer the Swiss Financial Services Addendum to address the contractual and audit requirements of the FINMA guidelines. We also provide these institutions the ability to comply with the audit requirements in the FINMA guidelines through the AWS Security & Audit Series, including participation in an Audit Symposium, to facilitate customer audits. To help align with regulatory requirements and expectations, our FINMA addendum and audit program incorporate feedback that we’ve received from a variety of financial supervisory authorities across EU member states. To learn more about the Swiss Financial Services addendum or about the audit engagements offered by AWS, reach out to your AWS account team.

Resilience

Customers need control over their workloads and high availability to help prepare for events such as supply chain disruptions, network interruptions, and natural disasters. Each AWS Region is composed of multiple Availability Zones (AZs). An Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. To better isolate issues and achieve high availability, you can partition applications across multiple AZs in the same Region. If you are running workloads on premises or in intermittently connected or remote use cases, you can use our services that provide specific capabilities for offline data and remote compute and storage. We will continue to enhance our range of sovereign and resilient options, to help you sustain operations through disruption or disconnection.

FINMA incorporates the principles of operational resilience in the newest circular 2023/01. In line with the efforts of the European Commission’s proposal for the Digital Operational Resilience Act (DORA), FINMA outlines requirements for regulated institutions to identify critical functions and their tolerance for disruption. Continuity of service, especially for critical economic functions, is a key prerequisite for financial stability. AWS recognizes that financial institutions need to comply with sector-specific regulatory obligations and requirements regarding operational resilience. AWS has published the whitepaper Amazon Web Services’ Approach to Operational Resilience in the Financial Sector and Beyond, in which we discuss how AWS and customers build for resiliency on the AWS Cloud. AWS provides resilient infrastructure and services, which financial institution customers can rely on as they design their applications to align with FINMA regulatory and compliance obligations.

AWS previously announced the third issuance of the FINMA ISAE 3000 Type II attestation report. Customers can access the entire report in AWS Artifact. To learn more about the list of certified services and Regions, see the FINMA ISAE 3000 Type 2 Report and AWS Services in Scope for FINMA.

AWS is committed to adding new services into our future FINMA program scope based on your architectural and regulatory needs. If you have questions about the FINMA report, or how your workloads on AWS align to the FINMA obligations, contact your AWS account team. We will also help support customers as they look for new ways to experiment, remain competitive, meet consumer expectations, and develop new products and services on AWS that align with the new regulatory framework.

To learn more about our compliance, security programs and common privacy and data protection considerations, see AWS Compliance Programs and the dedicated AWS Compliance Center for Switzerland. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security, Identity, & Compliance re:Post or contact AWS Support.

New – Move Payment Processing to the Cloud with AWS Payment Cryptography

2023-06-12 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-move-payment-processing-to-the-cloud-with-aws-payment-cryptography/

Cryptography is everywhere in our daily lives. If you’re reading this blog, you’re using HTTPS, an extension of HTTP that uses encryption to secure communications. On AWS, multiple services and capabilities help you manage keys and encryption, such as:

AWS Key Management Service (AWS KMS), which you can use to create and protect keys to encrypt or digitally sign your data.
AWS CloudHSM, which you can use to manage single-tenant hardware security modules (HSMs).

HSMs are physical devices that securely protect cryptographic operations and the keys used by these operations. HSMs can help you meet your corporate, contractual, and regulatory compliance requirements. With CloudHSM, you have access to general-purpose HSMs. When payments are involved, there are specific payment HSMs that offer capabilities such as generating and validating the personal identification number (PIN) and the security code of a credit or debit card.

Today, I am happy to share the availability of AWS Payment Cryptography, an elastic service that manages payment HSMs and keys for payment processing applications in the cloud.

Applications using payments HSMs have challenging requirements because payment processing is complex, time sensitive, and highly regulated and requires the interaction of multiple financial service providers and payment networks. Every time you make a payment, data is exchanged between two or more financial service providers and must be decrypted, transformed, and encrypted again with a unique key at each step.

This process requires highly performant cryptography capabilities and key management procedures between each payment service provider. These providers might have thousands of keys to protect, manage, rotate, and audit, making the overall process expensive and difficult to scale. To add to that, payment HSMs historically employ complex and error-prone processes, such as exchanging keys in a secure room using multiple hand-carried paper forms, each with separate key components printed on them.

Introducing AWS Payment Cryptography
AWS Payment Cryptography simplifies your implementation of cryptographic functions and key management used to secure data in payment processing in accordance with various payment card industry (PCI) standards.

With AWS Payment Cryptography, you can eliminate the need to provision and manage on-premises payment HSMs and use the provided tools to avoid error-prone key exchange processes. For example, with AWS Payment Cryptography, payment and financial service providers can begin development within minutes and plan to exchange keys electronically, eliminating manual processes.

To provide its elastic cryptographic capabilities in a compliant manner, AWS Payment Cryptography uses HSMs with PCI PTS HSM device approval. These capabilities include encryption and decryption of card data, key creation, and pin translation. AWS Payment Cryptography is also designed in accordance with PCI security standards such as PCI DSS, PCI PIN, and PCI P2PE, and it provides evidence and reporting to help meet your compliance needs.

You can import and export symmetric keys between AWS Payment Cryptography and on-premises HSMs under key encryption key (KEKs) using the ANSI X9 TR-31 protocol. You can also import and export symmetric KEKs with other systems and devices using the ANSI X9 TR-34 protocol, which allows the service to exchange symmetric keys using asymmetric techniques.

To simplify moving consumer payment processing to the cloud, existing card payment applications can use AWS Payment Cryptography through the AWS SDKs. In this way, you can use your favorite programming language, such as Java or Python, instead of vendor-specific ASCII interfaces over TCP sockets, as is common with payment HSMs.

Access can be authorized using AWS Identity and Access Management (IAM) identity-based policies, where you can specify which actions and resources are allowed or denied and under which conditions.

Monitoring is important to maintain the reliability, availability, and performance needed by payment processing. With AWS Payment Cryptography, you can use Amazon CloudWatch, AWS CloudTrail, and Amazon EventBridge to understand what is happening, report when something is wrong, and take automatic actions when appropriate.

Let’s see how this works in practice.

Using AWS Payment Cryptography
Using the AWS Command Line Interface (AWS CLI), I create a double-length 3DES key to be used as a card verification key (CVK). A CVK is a key used for generating and verifying card security codes such as CVV, CVV2, and similar values.

Note that there are two commands for the CLI (and similarly two endpoints for API and SDKs):

payment-cryptography for control plane operation such as listing and creating keys and aliases.
payment-cryptography-data for cryptographic operations that use keys, for example, to generate PIN or card validation data.

Creating a key is a control plane operation:

aws payment-cryptography create-key \
    --no-exportable \
    --key-attributes KeyAlgorithm=TDES_2KEY,
                     KeyUsage=TR31_C0_CARD_VERIFICATION_KEY,
                     KeyClass=SYMMETRIC_KEY,
                     KeyModesOfUse='{Generate=true,Verify=true}'

{
    "Key": {
        "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
        "KeyAttributes": {
            "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
            "KeyClass": "SYMMETRIC_KEY",
            "KeyAlgorithm": "TDES_2KEY",
            "KeyModesOfUse": {
                "Encrypt": false,
                "Decrypt": false,
                "Wrap": false,
                "Unwrap": false,
                "Generate": true,
                "Sign": false,
                "Verify": true,
                "DeriveKey": false,
                "NoRestrictions": false
            }
        },
        "KeyCheckValue": "B2DD4E",
        "KeyCheckValueAlgorithm": "ANSI_X9_24",
        "Enabled": true,
        "Exportable": false,
        "KeyState": "CREATE_COMPLETE",
        "KeyOrigin": "AWS_PAYMENT_CRYPTOGRAPHY",
        "CreateTimestamp": "2023-05-26T14:25:48.240000+01:00",
        "UsageStartTimestamp": "2023-05-26T14:25:48.220000+01:00"
    }
}

To reference this key in the next steps, I can use the Amazon Resource Name (ARN) as found in the KeyARN property, or I can create an alias. An alias is a friendly name that lets me refer to a key without having to use the full ARN. I can update an alias to refer to a different key. When I need to replace a key, I can just update the alias without having to change the configuration or the code of your applications. To be recognized easily, alias names start with alias/. For example, the following command creates the alias alias/my-key for the key I just created:

aws payment-cryptography create-alias --alias-name alias/my-key \
    --key-arn arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h

{
    "Alias": {
        "AliasName": "alias/my-key",
        "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h"
    }
}

Before I start using the new key, I list all my keys to check their status:

aws payment-cryptography list-keys

{
    "Keys": [
        {
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123421341234:key/42cdc4ocf45mg54h",
            "KeyAttributes": {
                "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
                "KeyClass": "SYMMETRIC_KEY",
                "KeyAlgorithm": "TDES_2KEY",
                "KeyModesOfUse": {
                    "Encrypt": false,
                    "Decrypt": false,
                    "Wrap": false,
                    "Unwrap": false,
                    "Generate": true,
                    "Sign": false,
                    "Verify": true,
                    "DeriveKey": false,
                    "NoRestrictions": false
                }
            },
            "KeyCheckValue": "B2DD4E",
            "Enabled": true,
            "Exportable": false,
            "KeyState": "CREATE_COMPLETE"
        },
        {
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/ok4oliaxyxbjuibp",
            "KeyAttributes": {
                "KeyUsage": "TR31_C0_CARD_VERIFICATION_KEY",
                "KeyClass": "SYMMETRIC_KEY",
                "KeyAlgorithm": "TDES_2KEY",
                "KeyModesOfUse": {
                    "Encrypt": false,
                    "Decrypt": false,
                    "Wrap": false,
                    "Unwrap": false,
                    "Generate": true,
                    "Sign": false,
                    "Verify": true,
                    "DeriveKey": false,
                    "NoRestrictions": false
                }
            },
            "KeyCheckValue": "905848",
            "Enabled": true,
            "Exportable": false,
            "KeyState": "DELETE_PENDING"
        }
    ]
}

As you can see, there is another key I created before, which has since been deleted. When a key is deleted, it is marked for deletion (DELETE_PENDING). The actual deletion happens after a configurable period (by default, 7 days). This is a safety mechanism to prevent the accidental or malicious deletion of a key. Keys marked for deletion are not available for use but can be restored.

In a similar way, I list all my aliases to see to which keys they are they referring:

aws payment-cryptography list-aliases

{
    "Aliases": [
        {
            "AliasName": "alias/my-key",
            "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h"
        }
    ]
}

Now, I use the key to generate a card security code with the CVV2 authentication system. You might be familiar with CVV2 numbers that are usually written on the back of a credit card. This is the way they are computed. I provide as input the primary account number of the credit card, the card expiration date, and the key from the previous step. To specify the key, I use its alias. This is a data plane operation:

aws payment-cryptography-data generate-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --generation-attributes CardVerificationValue2={CardExpiryDate=0124}

{
    "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
    "KeyCheckValue": "B2DD4E",
    "ValidationData": "343"
}

I take note of the three digits in the ValidationData property. When processing a payment, I can verify that the card data value is correct:

aws payment-cryptography-data verify-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --verification-attributes CardVerificationValue2={CardExpiryDate=0124} \
    --validation-data 343

{
    "KeyArn": "arn:aws:payment-cryptography:us-west-2:123412341234:key/42cdc4ocf45mg54h",
    "KeyCheckValue": "B2DD4E"
}

The verification is successful, and in return I get back the same KeyCheckValue as when I generated the validation data.

As you might expect, if I use the wrong validation data, the verification is not successful, and I get back an error:

aws payment-cryptography-data verify-card-validation-data \
    --key-identifier alias/my-key \
    --primary-account-number=171234567890123 \
    --verification-attributes CardVerificationValue2={CardExpiryDate=0124} \
    --validation-data 999

An error occurred (com.amazonaws.paymentcryptography.exception#VerificationFailedException)
when calling the VerifyCardValidationData operation:
Card validation data verification failed

In the AWS Payment Cryptography console, I choose View Keys to see the list of keys.

Optionally, I can enable more columns, for example, to see the key type (symmetric/asymmetric) and the algorithm used.

I choose the key I used in the previous example to get more details. Here, I see the cryptographic configuration, the tags assigned to the key, and the aliases that refer to this key.

AWS Payment Cryptography supports many more operations than the ones I showed here. For this walkthrough, I used the AWS CLI. In your applications, you can use AWS Payment Cryptography through any of the AWS SDKs.

Availability and Pricing
AWS Payment Cryptography is available today in the following AWS Regions: US East (N. Virginia) and US West (Oregon).

With AWS Payment Cryptography, you only pay for what you use based on the number of active keys and API calls with no up-front commitment or minimum fee. For more information, see AWS Payment Cryptography pricing.

AWS Payment Cryptography removes your dependencies on dedicated payment HSMs and legacy key management systems, simplifying your integration with AWS native APIs. In addition, by operating the entire payment application in the cloud, you can minimize round-trip communications and latency.

Move your payment processing applications to the cloud with AWS Payment Cryptography.

— Danilo

Patterns for enterprise data sharing at scale

2023-02-27 Venkata Sistla

Post Syndicated from Venkata Sistla original https://aws.amazon.com/blogs/big-data/patterns-for-enterprise-data-sharing-at-scale/

Data sharing is becoming an important element of an enterprise data strategy. AWS services like AWS Data Exchange provide an avenue for companies to share or monetize their value-added data with other companies. Some organizations would like to have a data sharing platform where they can establish a collaborative and strategic approach to exchange data with a restricted group of companies in a closed, secure, and exclusive environment. For example, financial services companies and their auditors, or manufacturing companies and their supply chain partners. This fosters development of new products and services and helps improve their operational efficiency.

Data sharing is a team effort, it’s important to note that in addition to establishing the right infrastructure, successful data sharing also requires organizations to ensure that business owners sponsor data sharing initiatives. They also need to ensure that data is of high quality. Data platform owners and security teams should encourage proper data use and fix any privacy and confidentiality issues.

This blog discusses various data sharing options and common architecture patterns that organizations can adopt to set up their data sharing infrastructure based on AWS service availability and data compliance.

Data sharing options and data classification types

Organizations operate across a spectrum of security compliance constraints. For some organizations, it’s possible to use AWS services like AWS Data Exchange. However, organizations working in heavily regulated industries like federal agencies or financial services might be limited by the allow listed AWS service options. For example, if an organization is required to operate in a Fedramp Medium or Fedramp High environment, their options to share data may be limited by the AWS services that are available and have been allow listed. Service availability is based on platform certification by AWS, and allow listing is based on the organizations defining their security compliance architecture and guidelines.

The kind of data that the organization wants to share with its partners may also have an impact on the method used for data sharing. Complying with data classification rules may further limit their choice of data sharing options they may choose.

The following are some general data classification types:

Public data – Important information, though often freely available for people to read, research, review and store. It typically has the lowest level of data classification and security.
Private data – Information you might want to keep private like email inboxes, cell phone content, employee identification numbers, or employee addresses. If private data were shared, destroyed, or altered, it might pose a slight risk to an individual or the organization.
Confidential or restricted data – A limited group of individuals or parties can access sensitive information often requiring special clearance or special authorization. Confidential or restricted data access might involve aspects of identity and authorization management. Examples of confidential data include Social Security numbers and vehicle identification numbers.

The following is a sample decision tree that you can refer to when choosing your data sharing option based on service availability, classification type, and data format (structured or unstructured). Other factors like usability, multi-partner accessibility, data size, consumption patterns like bulk load/API access, and more may also affect the choice of data sharing pattern.

In the following sections, we discuss each pattern in more detail.

Pattern 1: Using AWS Data Exchange

AWS Data Exchange makes exchanging data easier, helping organizations lower costs, become more agile, and innovate faster. Organizations can choose to share data privately using AWS Data Exchange with their external partners. AWS Data Exchange offers perimeter controls that are applied at identity and resource levels. These controls decide which external identities have access to specific data resources. AWS Data Exchange provides multiple different patterns for external parties to access data, such as the following:

AWS Data Exchange for Amazon Redshift
AWS Data Exchange for AWS Lake Formation (currently in preview)
AWS Data Exchange for Data APIs
AWS Data Exchange for data files
AWS Data Exchange for Amazon S3 (currently in preview)

The following diagram illustrates an example architecture.

With AWS Data Exchange, once the dataset to share (or sell) is configured, AWS Data Exchange automatically manages entitlements (and billing) between the producer and the consumer. The producer doesn’t have to manage policies, set up new access points, or create new Amazon Redshift data shares for each consumer, and access is automatically revoked if the subscription ends. This can significantly reduce the operational overhead in sharing data.

Pattern 2: Using AWS Lake Formation for centralized access management

You can use this pattern in cases where both the producer and consumer are on the AWS platform with an AWS account that is enabled to use AWS Lake Formation. This pattern provides a no-code approach to data sharing. The following diagram illustrates an example architecture.

In this pattern, the central governance account has Lake Formation configured for managing access across the producer’s org accounts. Resource links from the production account Amazon Simple Storage Service (Amazon S3) bucket are created in Lake Formation. The producer grants Lake Formation permissions on an AWS Glue Data Catalog resource to an external account, or directly to an AWS Identity and Access Management (IAM) principal in another account. Lake Formation uses AWS Resource Access Manager (AWS RAM) to share the resource. If the grantee account is in the same organization as the grantor account, the shared resource is available immediately to the grantee. If the grantee account is not in the same organization, AWS RAM sends an invitation to the grantee account to accept or reject the resource grant. To make the shared resource available, the consumer administrator in the grantee account must use the AWS RAM console or AWS Command Line Interface (AWS CLI) to accept the invitation.

Authorized principals can share resources explicitly with an IAM principal in an external account. This feature is useful when the producer wants to have control over who in the external account can access the resources. The permissions the IAM principal receives are a union of direct grants and the account-level grants that are cascaded down to the principals. The data lake administrator of the recipient account can view the direct cross-account grants, but can’t revoke permissions.

Pattern 3: Using AWS Lake Formation from the producer external sharing account

The producer may have stringent security requirements where no external consumer should access their production account or their centralized governance account. They may also not have Lake Formation enabled on their production platform. In such cases, as shown in the following diagram, the producer production account (Account A) is dedicated to its internal organization users. The producer creates another account, the producer external sharing account (Account B), which is dedicated for external sharing. This gives the producer more latitude to create specific policies for specific organizations.

The following architecture diagram shows an overview of the pattern.

The producer implements a process to create an asynchronous copy of data in Account B. The bucket can be configured for Same Region Replication (SRR) or Cross Region Replication (CRR) for objects that need to be shared. This facilitates automated refresh of data to the external account to the “External Published Datasets” S3 bucket without having to write any code.

Creating a copy of the data allows the producer to add another degree of separation between the external consumer and its production data. It also helps meet any compliance or data sovereignty requirements.

Lake Formation is set up on Account B, and the administrator creates resources links for the “External Published Datasets” S3 bucket in its account to grant access. The administrator follows the same process to grant access as described earlier.

Pattern 4: Using Amazon Redshift data sharing

This pattern is ideally suited for a producer who has most of their published data products on Amazon Redshift. This pattern also requires the producer’s external sharing account (Account B) and the consumer account (Account C) to have an encrypted Amazon Redshift cluster or Amazon Redshift Serverless endpoint that meets the prerequisites for Amazon Redshift data sharing.

The following architecture diagram shows an overview of the pattern.

Two options are possible depending on the producer’s compliance constraints:

Option A – The producer enables data sharing directly on the production Amazon Redshift cluster.
Option B – The producer may have constraints with respect to sharing the production cluster. The producer creates a simple AWS Glue job that copies data from the Amazon Redshift cluster in the production Account A to the Amazon Redshift cluster in the external Account B. This AWS Glue job can be scheduled to refresh data as needed by the consumer. When the data is available in Account B, the producer can create multiple views and multiple data shares as needed.

In both options, the producer maintains complete control over what data is being shared, and the consumer admin maintains full control over who can access the data within their organization.

After both the producer and consumer admins approve the data sharing request, the consumer user can access this data as if it were part of their own account without have to write any additional code.

Pattern 5: Sharing data securely and privately using APIs

You can adopt this pattern when the external partner doesn’t have a presence on AWS. You can also use this pattern when published data products are spread across various services like Amazon S3, Amazon Redshift, Amazon DynamoDB, and Amazon OpenSearch Service but the producer would like to maintain a single data sharing interface.

Here’s an example use case: Company A would like to share some of its log data in near-real time with its partner Company B, who uses this data to generate predictive insights for Company A. Company A stores this data in Amazon Redshift. The company wants to share this transactional information with its partner after masking the personally identifiable information (PII) in a cost-effective and secure way to generate insights. Company B doesn’t use the AWS platform.

Company A establishes a microbatch process using an AWS Lambda function or AWS Glue that queries Amazon Redshift to get incremental log data, applies the rules to redact the PII, and loads this data to the “Published Datasets” S3 bucket. This instantiates an SRR/CRR process that refreshes this data in the “External Sharing” S3 bucket.

The following diagram shows how the consumer can then use an API-based approach to access this data.

The workflow contains the following steps:

An HTTPS API request is sent from the API consumer to the API proxy layer.
The HTTPS API request is forwarded from the API proxy to Amazon API Gateway in the external sharing AWS account.
Amazon API Gateway calls the request receiver Lambda function.
The request receiver function writes the status to a DynamoDB control table.
A second Lambda function, the poller, checks the status of the results in the DynamoDB table.
The poller function fetches results from Amazon S3.
The poller function sends a presigned URL to download the file from the S3 bucket to the requestor via Amazon Simple Email Service (Amazon SES).
The requestor downloads the file using the URL.
The network perimeter AWS account only allows egress internet connection.
The API proxy layer enforces both the egress security controls and perimeter firewall before the traffic leaves the producer’s network perimeter.
The AWS Transit Gateway security egress VPC routing table only allows connectivity from the required producer’s subnet, while preventing internet access.

Pattern 6: Using Amazon S3 access points

Data scientists may need to work collaboratively on image, videos, and text documents. Legal and audit groups may want to share reports and statements with the auditing agencies. This pattern discusses an approach to sharing such documents. The pattern assumes that the external partners are also on AWS. Amazon S3 access points allow the producer to share access with their consumer by setting up cross-account access without having to edit bucket policies.

Access points are named network endpoints that are attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject. Each access point has distinct permissions and network controls that Amazon S3 applies for any request that is made through that access point. Each access point enforces a customized access point policy that works in conjunction with the bucket policy attached to the underlying bucket.

The following architecture diagram shows an overview of the pattern.

The producer creates an S3 bucket and enables the use of access points. As part of the configuration, the producer specifies the consumer account, IAM role, and privileges for the consumer IAM role.

The consumer users with the IAM role in the consumer account can access the S3 bucket via the internet or restricted to an Amazon VPC via VPC endpoints and AWS PrivateLink.

Conclusion

Each organization has its unique set of constraints and requirements that it needs to fulfill to set up an efficient data sharing solution. In this post, we demonstrated various options and best practices available to organizations. The data platform owner and security team should work together to assess what works best for your specific situation. Your AWS account team is also available to help.

Related resources

For more information on related topics, refer to the following:

About the Authors

Venkata Sistla is a Cloud Architect – Data & Analytics at AWS. He specializes in building data processing capabilities and helping customers remove constraints that prevent them from leveraging their data to develop business insights.

Santosh Chiplunkar is a Principal Resident Architect at AWS. He has over 20 years of experience helping customers solve their data challenges. He helps customers develop their data and analytics strategy and provides them with guidance on how to make it a reality.

2022 FINMA ISAE 3000 Type II attestation report now available with 154 services in scope

2022-12-23 Daniel Fuertes

Post Syndicated from Daniel Fuertes original https://aws.amazon.com/blogs/security/2022-finma-isae-3000-type-ii-attestation-report-now-available-with-154-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the third issuance of the Swiss Financial Market Supervisory Authority (FINMA) International Standard on Assurance Engagements (ISAE) 3000 Type II attestation report. The scope of the report covers a total of 154 services and 24 global AWS Regions.

The latest FINMA ISAE 3000 Type II report covers the period from October 1, 2021, to September 30, 2022. AWS continues to assure Swiss financial industry customers that our control environment is capable of effectively addressing key operational, outsourcing, and business continuity management risks.

FINMA circulars

The report covers the five core FINMA circulars regarding outsourcing arrangements to the cloud. FINMA circulars help Swiss-regulated financial institutions to understand the approaches FINMA takes when implementing due diligence, third-party management, and key technical and organizational controls for cloud outsourcing arrangements, particularly for material workloads.

The scope of the report covers the following requirements of the FINMA circulars:

2018/03 Outsourcing – Banks, insurance companies and selected financial institutions under FinIA
2008/21 Operational Risks – Banks – Principle 4 Technology Infrastructure (31.10.2019)
2008/21 Operational Risks – Banks – Appendix 3 Handling of Electronic Client Identifying Data (31.10.2019)
2013/03 Auditing – Information Technology (04.11.2020)
2008/10 Self-regulation as a minimum standard – Minimum Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

It is our pleasure to announce the addition of 16 services and two Regions to the FINMA ISAE 3000 Type II attestation scope. The following are a few examples of the additional security services in scope:

AWS CloudShell – A browser-based shell that makes it simple to manage, explore, and interact with your AWS resources. With CloudShell, you can quickly run scripts with the AWS Command Line Interface (AWS CLI), experiment with AWS service APIs by using the AWS SDKs, or use a range of other tools to be productive.
Amazon HealthLake – A HIPAA-eligible service that offers healthcare and life sciences companies a chronological view of individual or patient population health data for query and analytics at scale.
AWS IoT SiteWise – A managed service that simplifies collecting, organizing, and analyzing industrial equipment data.
Amazon DevOps Guru – A service that uses machine learning to detect abnormal operating patterns to help you identify operational issues before they impact your customers.

Customers can continue to reference the FINMA workbooks, which include detailed control mappings for each FINMA circular covered under this audit report, through AWS Artifact. Customers can also find the entire FINMA report on AWS Artifact. To learn more about the list of certified services and Regions, see AWS Compliance Programs and AWS Services in Scope for FINMA.

As always, AWS is committed to adding new services into our future FINMA program scope based on your architectural and regulatory needs. If you have questions about the FINMA report, contact your AWS account team.

If you have feedback about this post, please submit them in the Comments section below.
Want more AWS Security news? Follow us on Twitter.

How AWS Data Lab helped BMW Financial Services design and build a multi-account modern data architecture

2022-09-27 Rahul Shaurya

Post Syndicated from Rahul Shaurya original https://aws.amazon.com/blogs/big-data/how-aws-data-lab-helped-bmw-financial-services-design-and-build-a-multi-account-modern-data-architecture/

This post is co-written by Martin Zoellner, Thomas Ehrlich and Veronika Bogusch from BMW Group.

BMW Group and AWS announced a comprehensive strategic collaboration in 2020. The goal of the collaboration is to further accelerate BMW Group’s pace of innovation by placing data and analytics at the center of its decision-making. A key element of the collaboration is the further development of the Cloud Data Hub (CDH) of BMW Group. This is the central platform for managing company-wide data and data solutions in the cloud. At the AWS re:Invent 2019 session, BMW and AWS demonstrated the new Cloud Data Hub platform by outlining different archetypes of data platforms and then walking through the journey of building BMW Group’s Cloud Data Hub. To learn more about the Cloud Data Hub, refer to BMW Cloud Data Hub: A reference implementation of the modern data architecture on AWS.

As part of this collaboration, BMW Group is migrating hundreds of data sources across several data domains to the Cloud Data Hub. Several of these sources pertain to BMW Financial Services.

In this post, we talk about how the AWS Data Lab is helping BMW Financial Services build a regulatory reporting application for one of the European BMW market using the Cloud Data Hub on AWS.

Solution overview

In the context of regulatory reporting, BMW Financial Services works with critical financial services data that contains personally identifiable information (PII). We need to provide monthly insights on our financial data to one of the European National Regulator, and we also need to be compliant with the Schrems II and GDPR regulations as we process PII data. This requires the PII to be pseudonymized when it’s loaded into the Cloud Data Hub, and it has to be processed further in pseudonymized form. For an overview of pseudonymization process, check out Build a pseudonymization service on AWS to protect sensitive data .

To address these requirements in a precise and efficient way, BMW Financial Services decided to engage with the AWS Data Lab. The AWS Data Lab has two offerings: the Design Lab and the Build Lab.

Design Lab

The Design Lab is a 1-to-2-day engagement for customers who need a real-world architecture recommendation based on AWS expertise, but aren’t ready to build. In the case of BMW Financial Services, before beginning the build phase, it was key to get all the stakeholders in the same room and record all the functional and non-functional requirements introduced by all the different parties that might influence the data platform—from owners of the various data sources to end-users that would use the platform to run analytics and get business insights. Within the scope of the Design Lab, we discussed three use cases:

Regulatory reporting – The top priority for BMW Financial Services was the regulatory reporting use case, which involves collecting and calculating data and reports that will be declared to the National Regulator.
Local data warehouse – For this use case, we need to calculate and store all key performance indicators (KPIs) and key value indicators (KVIs) that will be defined during the project. The historical data needs to be stored, but we need to apply a pseudonymization process to respect GDPR directives. Moreover, historical data has to be accessed on a daily basis through a tableau visualization tool. Regarding the structure, it would be valuable to define two levels (at minimum): one at the contract level to justify the calculation of all KPIs, and another at an aggregated level to optimize restitutions. Personal data is limited in the application, but a reidentification process must be possible for authorized consumption patterns.
Accounting details – This use case is based on the BMW accounting tool IFT, which provides the accounting balance at the contract level from all local market applications. It must run at least once a month. However, if some issues are identified on IFT during closing, we must be able to restart it and erase the previous run. When the month-end closing is complete, this use case has to keep the last accounting balance version generated during the month and store it. In parallel, all accounting balance versions have to be accessible by other applications for queries and be able to retrieve the information for 24 months.

Design Lab Solution Architecture

Based on these requirements, we developed the following architecture during the Design Lab.

This solution contains the following components:

The main data source that hydrates our three use cases is the already available in the Cloud Data Hub. The Cloud Data Hub uses AWS Lake Formation resource links to grant access to the dataset to the consumer accounts.
For standard, periodic ETL (extract, transform, and load) jobs that involve operations such as converting data types, or creating labels based on numerical values or Boolean flags based on a label, we used AWS Glue ETL jobs.
For historical ETL jobs or more complex calculations such as in the account details use case, which may involve huge joins with custom configurations and tuning, we recommended to use Amazon EMR. This gives you the opportunity to control cluster configurations at a fine-grained level.
To store job metadata that enables features such as reprocessing inputs or rerunning failed jobs, we recommended building a data registry. The goal of the data registry is to create a centralized inventory for any data being ingested in the data lake. A schedule-based AWS Lambda function could be triggered to register data landing on the semantic layer of the Cloud Data Hub in a centralized metadata store. We recommended using Amazon DynamoDB for the data registry.
Amazon Simple Storage Service (Amazon S3) serves as the storage mechanism that powers the regulatory reporting use case using the data management framework Apache Hudi. Apache Hudi is useful for our use cases because we need to develop data pipelines where record-level insert, update, upsert, and delete capabilities are desirable. Hudi tables are supported by both Amazon EMR and AWS Glue jobs via the Hudi connector, along with query engines such as Amazon Athena and Amazon Redshift Spectrum.
As part of the data storing process in the regulatory reporting S3 bucket, we can populate the AWS Glue Data Catalog with the required metadata.
Athena provides an ad hoc query environment for interactive analysis of data stored in Amazon S3 using standard SQL. It has an out-of-the-box integration with the AWS Glue Data Catalog.
For the data warehousing use case, we need to first de-normalize data to create a dimensional model that enables optimized analytical queries. For that conversion, we use AWS Glue ETL jobs.
Dimensional data marts in Amazon Redshift enable our dashboard and self-service reporting needs. Data in Amazon Redshift is organized into several subject areas that are aligned with the business needs, and a dimensional model allows for cross-subject area analysis.
As a by-product of creating an Amazon Redshift cluster, we can use Redshift Spectrum to access data in the regulatory reporting bucket of the architecture. It acts as a front to access more granular data without actually loading it in the Amazon Redshift cluster.
The data provided to the Cloud Data Hub contains personal data that is pseudonymized. However, we need our pseudonymized columns to be re-personalized when visualizing them on Tableau or when generating CSV reports. Both Athena and Amazon Redshift support Lambda UDFs, which can be used to access Cloud Data Hub PII APIs to re-personalize the pseudonymized columns before presenting them to end-users.
Both Athena and Amazon Redshift can be accessed via JDBC (Java Database Connectivity) to provide access to data consumers.
We can use a Python shell job in AWS Glue to run a query against either of our analytics solutions, convert the results to the required CSV format, and store them to a BMW secured folder.
Any business intelligence (BI) tool deployed on premises can connect to both Athena and Amazon Redshift and use their query engines to perform any heavy computation before it receives the final data to fuel its dashboards.
For the data pipeline orchestration, we recommended using AWS Step Functions because of its low-code development experience and its full integration with all the other components discussed.

With the preceding architecture as our long-term target state, we concluded the Design Lab and decided to return for a Build Lab to accelerate solution development.

Preparing for Build Lab

The typical preparation of a Build Lab that follows a Design Lab involves identifying a few examples of common use case patterns, typically the more complex ones. To maximize the success in the Build Lab, we reduce the long-term target architecture to a subset of components that addresses those examples and can be implemented within a 3-to-5-day intense sprint.

For a successful Build Lab, we also need to identify and resolve any external dependencies, such as network connectivity to data sources and targets. If that isn’t feasible, then we find meaningful ways to mock them. For instance, to make the prototype closer to what the production environment would look like, we decided to use separate AWS accounts for each use case, based on the existing team structure of BMW, and use a consumer S3 bucket instead of BMW network-attached storage (NAS).

Build Lab

The BMW team set aside 4 days for their Build Lab. During that time, their dedicated Data Lab Architect worked alongside the team, helping them to build the following prototype architecture.

Build Lab Solution

This solution includes the following components:

The first step was to synchronize the AWS Glue Data Catalog of the Cloud Data Hub and regulatory reporting accounts.
AWS Glue jobs running on the regulatory reporting account had access to the data in the Cloud Data Hub resource accounts. During the Build Lab, the BMW team implemented ETL jobs for six tables, addressing insert, update, and delete record requirements using Hudi.
The result of the ETL jobs is stored in the data lake layer stored in the regulatory reporting S3 bucket as Hudi tables that are catalogued in the AWS Glue Data Catalog and can be consumed by multiple AWS services. The bucket is encrypted using AWS Key Management Service (AWS KMS).
Athena is used to run exploratory queries on the data lake.
To demonstrate the cross-account consumption pattern, we created an Amazon Redshift cluster on it, created external tables from the Data Catalog, and used Redshift Spectrum to query the data. To enable cross-account connectivity between the subnet group of the Data Catalog of the regulatory reporting account and the subnet group of the Amazon Redshift cluster on the local data warehouse account, we had to enable VPC peering. To accelerate and optimize the implementation of these configurations during the Build Lab, we received support from an AWS networking subject matter expert, who ran a valuable session, during which the BMW team understood the networking details of the architecture.
For data consumption, the BMW team implemented an AWS Glue Python shell job that connected to Amazon Redshift or Athena using a JDBC connection, ran a query, and stored the results in the reporting bucket as a CSV file, which would later be accessible by the end-users.
End-users can also directly connect to both Athena and Amazon Redshift using a JDBC connection.
We decided to orchestrate the AWS Glue ETL jobs using AWS Glue Workflows. We used the resulting workflow for the end-of-lab demo.

With that, we completed all the goals we had set up and concluded the 4-day Build Lab.

Conclusion

In this post, we walked you through the journey the BMW Financial Services team took with the AWS Data Lab team to participate in a Design Lab to identify a best-fit architecture for their use cases, and the subsequent Build Lab to implement prototypes for regulatory reporting in one of the European BMW market.

To learn more about how AWS Data Lab can help you turn your ideas into solutions, visit AWS Data Lab.

Special thanks to everyone who contributed to the success of the Design and Build Lab: Lionel Mbenda, Mario Robert Tutunea, Marius Abalarus, Maria Dejoie.

About the authors

Martin Zoellner is an IT Specialist at BMW Group. His role in the project is Subject Matter Expert for DevOps and ETL/SW Architecture.

Thomas Ehrlich is the functional maintenance manager of Regulatory Reporting application in one of the European BMW market.

Veronika Bogusch is an IT Specialist at BMW. She initiated the rebuild of the Financial Services Batch Integration Layer via the Cloud Data Hub. The ingested data assets are the base for the Regulatory Reporting use case described in this article.

George Komninos is a solutions architect for the Amazon Web Services (AWS) Data Lab. He helps customers convert their ideas to a production-ready data product. Before AWS, he spent three years at Alexa Information domain as a data engineer. Outside of work, George is a football fan and supports the greatest team in the world, Olympiacos Piraeus.

Rahul Shaurya is a Senior Big Data Architect with AWS Professional Services. He helps and works closely with customers building data platforms and analytical applications on AWS. Outside of work, Rahul loves taking long walks with his dog Barney.

How Epos Now modernized their data platform by building an end-to-end data lake with the AWS Data Lab

2022-08-01 Debadatta Mohapatra

Post Syndicated from Debadatta Mohapatra original https://aws.amazon.com/blogs/big-data/how-epos-now-modernized-their-data-platform-by-building-an-end-to-end-data-lake-with-the-aws-data-lab/

Epos Now provides point of sale and payment solutions to over 40,000 hospitality and retailers across 71 countries. Their mission is to help businesses of all sizes reach their full potential through the power of cloud technology, with solutions that are affordable, efficient, and accessible. Their solutions allow businesses to leverage actionable insights, manage their business from anywhere, and reach customers both in-store and online.

Epos Now currently provides real-time and near-real-time reports and dashboards to their merchants on top of their operational database (Microsoft SQL Server). With a growing customer base and new data needs, the team started to see some issues in the current platform.

First, they observed performance degradation for serving the reporting requirements from the same OLTP database with the current data model. A few metrics that needed to be delivered in real time (seconds after a transaction was complete) and a few metrics that needed to be reflected in the dashboard in near-real-time (minutes) took several attempts to load in the dashboard.

This started to cause operational issues for their merchants. The end consumers of reports couldn’t access the dashboard in a timely manner.

Cost and scalability also became a major problem because one single database instance was trying to serve many different use cases.

Epos Now needed a strategic solution to address these issues. Additionally, they didn’t have a dedicated data platform for doing machine learning and advanced analytics use cases, so they decided on two parallel strategies to resolve their data problems and better serve merchants:

The first was to rearchitect the near-real-time reporting feature by moving it to a dedicated Amazon Aurora PostgreSQL-Compatible Edition database, with a specific reporting data model to serve to end consumers. This will improve performance, uptime, and cost.
The second was to build out a new data platform for reporting, dashboards, and advanced analytics. This will enable use cases for internal data analysts and data scientists to experiment and create multiple data products, ultimately exposing these insights to end customers.

In this post, we discuss how Epos Now designed the overall solution with support from the AWS Data Lab. Having developed a strong strategic relationship with AWS over the last 3 years, Epos Now opted to take advantage of the AWS Data lab program to speed up the process of building a reliable, performant, and cost-effective data platform. The AWS Data Lab program offers accelerated, joint-engineering engagements between customers and AWS technical resources to create tangible deliverables that accelerate data and analytics modernization initiatives.

Working with an AWS Data Lab Architect, Epos Now commenced weekly cadence calls to come up with a high-level architecture. After the objective, success criteria, and stretch goals were clearly defined, the final step was to draft a detailed task list for the upcoming 3-day build phase.

Overview of solution

As part of the 3-day build exercise, Epos Now built the following solution with the ongoing support of their AWS Data Lab Architect.

The platform consists of an end-to-end data pipeline with three main components:

Data lake – As a central source of truth
Data warehouse – For analytics and reporting needs
Fast access layer – To serve near-real-time reports to merchants

We chose three different storage solutions:

Amazon Simple Storage Service (Amazon S3) for raw data landing and a curated data layer to build the foundation of the data lake
Amazon Redshift to create a federated data warehouse with conformed dimensions and star schemas for consumption by Microsoft Power BI, running on AWS
Aurora PostgreSQL to store all the data for near-real-time reporting as a fast access layer

In the following sections, we go into each component and supporting services in more detail.

Data lake

The first component of the data pipeline involved ingesting the data from an Amazon Managed Streaming for Apache Kafka (Amazon MSK) topic using Amazon MSK Connect to land the data into an S3 bucket (landing zone). The Epos Now team used the Confluent Amazon S3 sink connector to sink the data to Amazon S3. To make the sink process more resilient, Epos Now added the required configuration for dead-letter queues to redirect the bad messages to another topic. The following code is a sample configuration for a dead-letter queue in Amazon MSK Connect:

Because Epos Now was ingesting from multiple data sources, they used Airbyte to transfer the data to a landing zone in batches. A subsequent AWS Glue job reads the data from the landing bucket , performs data transformation, and moves the data to a curated zone of Amazon S3 in optimal format and layout. This curated layer then became the source of truth for all other use cases. Then Epos Now used an AWS Glue crawler to update the AWS Glue Data Catalog. This was augmented by the use of Amazon Athena for doing data analysis. To optimize for cost, Epos Now defined an optimal data retention policy on different layers of the data lake to save money as well as keep the dataset relevant.

Data warehouse

After the data lake foundation was established, Epos Now used a subsequent AWS Glue job to load the data from the S3 curated layer to Amazon Redshift. We used Amazon Redshift to make the data queryable in both Amazon Redshift (internal tables) and Amazon Redshift Spectrum. The team then used dbt as an extract, load, and transform (ELT) engine to create the target data model and store it in target tables and views for internal business intelligence reporting. The Epos Now team wanted to use their SQL knowledge to do all ELT operations in Amazon Redshift, so they chose dbt to perform all the joins, aggregations, and other transformations after the data was loaded into the staging tables in Amazon Redshift. Epos Now is currently using Power BI for reporting, which was migrated to the AWS Cloud and connected to Amazon Redshift clusters running inside Epos Now’s VPC.

Fast access layer

To build the fast access layer to deliver the metrics to Epos Now’s retail and hospitality merchants in near-real time, we decided to create a separate pipeline. This required developing a microservice running a Kafka consumer job to subscribe to the same Kafka topic in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. The microservice received the messages, conducted the transformations, and wrote the data to a target data model hosted on Aurora PostgreSQL. This data was delivered to the UI layer through an API also hosted on Amazon EKS, exposed through Amazon API Gateway.

Outcome

The Epos Now team is currently building both the fast access layer and a centralized lakehouse architecture-based data platform on Amazon S3 and Amazon Redshift for advanced analytics use cases. The new data platform is best positioned to address scalability issues and support new use cases. The Epos Now team has also started offloading some of the real-time reporting requirements to the new target data model hosted in Aurora. The team has a clear strategy around the choice of different storage solutions for the right access patterns: Amazon S3 stores all the raw data, and Aurora hosts all the metrics to serve real-time and near-real-time reporting requirements. The Epos Now team will also enhance the overall solution by applying data retention policies in different layers of the data platform. This will address the platform cost without losing any historical datasets. The data model and structure (data partitioning, columnar file format) we designed greatly improved query performance and overall platform stability.

Conclusion

Epos Now revolutionized their data analytics capabilities, taking advantage of the breadth and depth of the AWS Cloud. They’re now able to serve insights to internal business users, and scale their data platform in a reliable, performant, and cost-effective manner.

The AWS Data Lab engagement enabled Epos Now to move from idea to proof of concept in 3 days using several previously unfamiliar AWS analytics services, including AWS Glue, Amazon MSK, Amazon Redshift, and Amazon API Gateway.

Epos Now is currently in the process of implementing the full data lake architecture, with a rollout to customers planned for late 2022. Once live, they will deliver on their strategic goal to provide real-time transactional data and put insights directly in the hands of their merchants.

About the Authors

Jason Downing is VP of Data and Insights at Epos Now. He is responsible for the Epos Now data platform and product direction. He specializes in product management across a range of industries, including POS systems, mobile money, payments, and eWallets.

Debadatta Mohapatra is an AWS Data Lab Architect. He has extensive experience across big data, data science, and IoT, across consulting and industrials. He is an advocate of cloud-native data platforms and the value they can drive for customers across industries.

A pathway to the cloud: Analysis of the Reserve Bank of New Zealand’s Guidance on Cyber Resilience

2022-07-18 Julian Busic

Post Syndicated from Julian Busic original https://aws.amazon.com/blogs/security/a-pathway-to-the-cloud-analysis-of-the-reserve-bank-of-new-zealands-guidance-on-cyber-resilience/

The Reserve Bank of New Zealand’s (RBNZ’s) Guidance on Cyber Resilience (referred to as “Guidance” in this post) acknowledges the benefits of RBNZ-regulated financial services companies in New Zealand (NZ) moving to the cloud, as long as this transition is managed prudently—in other words, as long as entities understand the risks involved and manage them appropriately. In this blog post, I analyze the RBNZ’s thinking as it developed the Guidance, and how the Guidance creates opportunities for NZ financial services customers to accelerate migration of workloads—including critical systems—to the Amazon Web Services (AWS) Cloud.

On page 14 of its Guidance, the RBNZ writes that “[i]f used prudently, third-party services may reduce an entity’s cyber risk, especially for those entities that lack cyber expertise.” This open regulatory stance towards the cloud enables our NZ financial services customers to consider a cloud first strategy for both new and existing systems, including critical workloads. Customers must, however, manage the transition to the cloud prudently, working closely with both their cloud service provider and their regulators.

This blog post is aimed at boards, management, and technology decision-makers, for whom understanding regulatory thinking is a useful input when developing an enterprise cloud strategy.

Operational technology staff and risk practitioners seeking detailed guidance on how AWS helps you align with the RBNZ’s Guidance can download our New Zealand Financial Services whitepaper from our public website and the AWS Reserve Bank of New Zealand Guidance on Cyber Resilience (RBNZ-GCR) Workbook from AWS Artifact, a self-service portal for you to access AWS compliance reports.

Overview and applicability

The RBNZ’s Guidance sets out the RBNZ’s expectations for management of cyber resilience. It’s aimed at all registered banks, licensed non-bank deposit takers, licensed insurers, and designated financial market infrastructures that are regulated by the RBNZ. The Guidance makes a series of non-binding recommendations across four domains—Governance, Capability Building, Information Sharing, and Third-Party Management.

Each section of the Guidance has a short preamble, summarizing the RBNZ’s expectations for effective risk management in each domain and providing insights into why the RBNZ is making specific recommendations.

The Guidance can be tailored to an entity’s individual needs, technology choices, and risk appetite. Boards, management, and technology decision-makers should familiarize themselves with the RBNZ’s Guidance, ascertain how closely their own organization aligns to it, and work to remediate any identified gaps.

Why non-binding guidance and not an enforceable standard?

The RBNZ gives several reasons (see RBNZ Summary of submissions, paragraphs 9-16) for choosing to publish non-binding recommendations rather than legally binding requirements. The RBNZ declares an intent to monitor adoption of its recommendations by industry, and indicates that future policy settings might include developing legally binding standards for cyber resilience. In this respect, the RBNZ’s approach is similar to that of the Australian Prudential Regulation Authority (APRA), which first issued non-binding guidance on management of IT security risk in 2013, before moving to a legally binding standard in 2019.

The RBNZ gives the following reasons for choosing guidance over a standard:

The RBNZ’s policy stance of being moderately active in respect to cyber resilience
A previous light-touch approach regarding cyber resilience
Providing sufficient time for industry to adjust to new policy settings, given the wide range of maturity within financial services organizations in New Zealand
The gap between New Zealand’s and other jurisdictions’ cyber readiness
The RBNZ’s current ability to effectively monitor and ensure compliance

The RBNZ indicates that it will “work together with the industry to operationalise the finalised Guidance” (RBNZ Summary of submissions, paragraph 10) and that it is “looking to strengthen [its] cyber resilience expertise in [its] financial stability function” although this will “take time to achieve” (RBNZ Summary of submissions, paragraph 9).

RBNZ-regulated entities should already be self-assessing against the Guidance and working to address gaps as a matter of priority. This is not just because the Guidance could become a legally binding standard in the next 3–5 years, but because the RBNZ has created a practical and flexible framework for the management of cyber risk, which will greatly enhance the NZ financial sector’s resilience to cyber incidents. Non–RBNZ-regulated entities looking for a benchmark to measure themselves against can also use the RBNZ’s Guidance to assess and improve the effectiveness of their own control environments.

Comparing rules-based frameworks and principles-based frameworks

There are two main ways that regulators communicate their risk management expectations to their regulated entities. These are a rules-based approach (sometimes called a compliance-based approach) and a principles-based approach. The RBNZ’s Guidance takes a principles-based approach towards the management of cyber risk.

With a rules-based approach, the regulator takes responsibility for identifying risks and lays out explicit and granular controls that regulated entities are required to implement. A rules-based approach is highly prescriptive, meaning that regulated entities can adopt a checklist approach in meeting their regulators’ requirements. This approach, although it gives certainty to regulated entities regarding the controls they are expected to adopt, can have disadvantages for regulators:

Creating and maintaining detailed technical rules can be challenging, given the pace at which technology and the threat environment evolve.
Regulators have a diverse population of regulated entities, so a rules-based approach can be inflexible or have blind spots.
A rules-based approach doesn’t encourage entities to actively identify and manage their own unique set of risks.

By contrast, a principles-based approach describes a set of desired regulatory or risk-management outcomes, but it isn’t prescriptive in how regulated entities achieve these goals. Regulators act in a vendor- and technology-neutral manner, and regulated entities are expected to interpret regulatory requirements or guidance in the context of their individual business models, technology choices, threat environments, and risk appetites.

Under a principles-based approach, an entity must be able to demonstrate to its regulators’ satisfaction that it both understands the current and emerging risks it faces, and that it is managing these risks appropriately. For example, the principle that entities “[…] should develop and maintain a programme for continuing cyber resilience training for staff at all levels” (Guidance, section A3.3 page 6) gives clear direction, but leaves it up to the entity to decide on the approach to take, and how the entity will demonstrate to the RBNZ that this principle is being met.

A principles-based approach avoids the issues with the rules-based approach that I outlined previously—this approach is significantly longer-lived than a rules-based approach, it moves responsibility for effective risk identification and management from the regulator to the entity (which better understands its own risk profile and appetite), and the framework can be applied to a regulated entity population that varies in size, nature, and complexity.

Freedom to innovate under a principles-based approach

The RBNZ says that its Guidance should be employed in a manner “[…] proportionate to the size, structure and operational environment of an entity, as well as the nature, scope, complexity and risk profile of its products and services” (Guidance, page 2).

You can therefore meet the RBNZ’s Guidance in many different ways, as long as you can demonstrate to the RBNZ that your organization understands the risks it is facing and is managing them appropriately. A principles-based approach creates opportunities for innovation, because there are many different ways to meet a set of regulatory principles.

If you are an NZ financial services customer who also operates in Australia, you might note that the RBNZ’s approach aligns to that of the principal financial services regulator in Australia—the Australian Prudential Regulation Authority (APRA). APRA also takes a principles-based approach to its prudential framework, “avoiding excessive prescription where possible to allow for the diversity of practice according to the size, business activity, and sophistication of the institutions being supervised” (APRA’s objectives, Chapter 1).

A cautious green light to the cloud for New Zealand financial services

“If used prudently, third-party services may reduce an entity’s cyber risk, especially for those entities that lack cyber expertise” (Guidance, page 14).

In my view, this statement represents a (cautious) green light for financial services customers in NZ who wish to migrate systems to the AWS Cloud, although as the RBNZ makes clear, you “should be fully aware of the cyber risk associated with third parties and act appropriately to mitigate that risk” (Guidance, page 14). The RBNZ also requests that for critical functions, entities “[…] should inform the Reserve Bank about their outsourcing of critical functions to cloud service providers early in their decision-making process” (Guidance, Section D8.1, page 17).

The RBNZ defines a critical function as “[a]ny activity, function, process, or service, the loss of which (for even a short period of time) would materially affect the continued operation of an entity, the market it serves and the broader financial system, and/or materially affect the data integrity, reputation of an entity and confidence in the financial system” (Guidance, page 19).

Although the RBNZ doesn’t elaborate further on why it requests early notification about outsourcing of critical functions to the cloud, it’s likely that early engagement is requested so that the RBNZ has the opportunity to provide early feedback on any areas of potential concern, before the initiative is significantly progressed and a large amount of resources are committed.

Migration of higher-risk workloads to the cloud will naturally attract higher levels of regulatory scrutiny, but this doesn’t change the RBNZ’s open regulatory stance on cloud security. This stance is further emphasized by the RBNZ’s comment that “If managed prudently, migrating to the cloud presents a number of benefits including geographically dispersed infrastructures, agility to scale more quickly, improved automation, sufficient redundancy, and reduced initial investment costs for individual financial institutions” (Guidance, page 15).

Building innovative, secure, and highly resilient solutions on AWS, and using the high levels of visibility that you have into your environments that are running on AWS, can help you demonstrate to your regulators how you are identifying and managing your cyber resilience risks in line with the RBNZ’s Guidance.

A note on regulatory myths

In conversations with customers, I occasionally encounter “regulatory myths,” such as “certain types of workloads are prohibited in the cloud,” or “my regulator won’t allow me to use multi-region architectures.”

To date, the RBNZ has not made specific recommendations or set specific requirements regarding technology solutions. This includes, but is not limited to, choice of vendors or technology platforms, prescription of particular architectures, or the types of workload that may or may not be migrated to the cloud. Remember, the RBNZ’s Guidance is a principles-based framework, and is vendor-, technology-, and solution-neutral.

We have many examples of financial services companies all over the world successfully running critical workloads in the AWS Cloud, but regulatory myths and misunderstandings can inhibit our customers’ ability to “think big” when developing their cloud strategies. If you believe that you must implement specific technical patterns to meet regulatory expectations, we encourage you to contact the RBNZ to discuss any aspects of the Guidance that require clarification. We also encourage you to contact your AWS account team, who can arrange support from internal AWS risk and regulatory specialists, particularly if critical systems are proposed for migration to AWS.

Conclusion

The RBNZ’s Guidance on Cyber Resilience is an important first step for financial services regulation of cybersecurity in NZ. The Guidance can be considered cloud friendly because it acknowledges that prudent use of third parties (such as AWS) can reduce cyber risk, especially for entities that lack cyber expertise, and outlines several benefits of the cloud over traditional on-premises infrastructure, including resilience and redundancy, ability to scale, and reduced initial investment costs.

The principles-based nature of the RBNZ’s Guidance creates opportunities for you to develop innovative solutions in the AWS Cloud, because there are many different ways to meet the principles contained in the RBNZ’s Guidance. The key consideration is that you demonstrate to your regulators that you both understand the cyber risks you face in moving to the AWS Cloud, and manage them appropriately.

The launch of the AWS Asia Pacific (Auckland) Region in 2024, our wide range of products and services, and the visibility that you have into the AWS control environment (through AWS Artifact) and your own environment (through services like Amazon GuardDuty and AWS Security Hub) can all help you demonstrate to the RBNZ that you are managing cyber risk in accordance with the RBNZ’s expectations.

Next steps

Boards, executives, and technology decision-makers should familiarize themselves with the RBNZ’s Guidance, and if they aren’t already doing so, conduct a self-assessment and initiate a body of work to address identified gaps.

In view of the RBNZ’s cautious green light for prudent migration to the cloud—including for critical systems—NZ financial services customers should review their existing cloud strategies and identify areas where they can both broaden and accelerate their cloud journeys. The AWS Cloud Adoption Framework (AWS CAF) offers guidance and best practices to help organizations develop an efficient and effective plan for their cloud adoption journey. The AWS C-suite Guide to Shared Responsibility for Cloud Security and Data Safe Cloud eBook inform boards and senior management about both the benefits and risks of operating in the cloud.

Operational technology staff and risk practitioners can download our New Zealand Financial Service whitepaper from our public website and the AWS Reserve Bank of New Zealand Guidance on Cyber Resilience (RBNZ-GCR) Workbook from AWS Artifact. The RBNZ-GCR is particularly useful for operational IT staff and risk practitioners because it provides prescriptive guidance on which controls to implement on your side of the shared responsibility model and which AWS controls you inherit from the service.

Finally, contact your AWS representative to discuss how the AWS Partner Network, AWS solution architects, AWS Professional Services teams, and AWS Training and Certification can assist with your cloud adoption journey. If you don’t have an AWS representative, contact us at https://aws.amazon.com/contact-us.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

OSPAR 2022 report now available with 142 services in scope

2022-07-07 Joseph Goh

Post Syndicated from Joseph Goh original https://aws.amazon.com/blogs/security/ospar-2022-report-now-available-with-142-services-in-scope/

We’re excited to announce the completion of our annual Outsourced Service Provider’s Audit Report (OSPAR) audit cycle on July 1, 2022. The 2022 OSPAR certification cycle includes the addition of 15 new services in scope, bringing the total number of services in scope to 142 in the AWS Asia Pacific (Singapore) Region.

Newly added services in scope include the following:

Successful completion of the OSPAR assessment demonstrates that AWS has a system of controls in place that meet the Association of Banks in Singapore (ABS) Guidelines on Control Objectives and Procedures for Outsourced Service Providers. Our alignment with the ABS guidelines demonstrates our commitment to meet the security expectations for cloud service providers set by the financial services industry in Singapore. Customers can use the OSPAR assessment to conduct due diligence, and to help reduce the effort and costs required for compliance. An independent third-party auditor, selected from the ABS list of approved auditors, performs the OSPAR assessment.

You can download the latest OSPAR report from AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact. The list of services in scope for OSPAR is available in the report, and is also available on the AWS Services in Scope by Compliance Program webpage.

As always, we’re committed to bringing new services into the scope of our OSPAR program based on your architectural and regulatory needs. If you have questions about the OSPAR report, contact your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

New AWS whitepaper: AWS User Guide to Financial Services Regulations and Guidelines in New Zealand

2022-06-21 Julian Busic

Post Syndicated from Julian Busic original https://aws.amazon.com/blogs/security/new-aws-whitepaper-aws-user-guide-to-financial-services-regulations-and-guidelines-in-new-zealand/

Amazon Web Services (AWS) has released a new whitepaper to help financial services customers in New Zealand accelerate their use of the AWS Cloud.

The new AWS User Guide to Financial Services Regulations and Guidelines in New Zealand—along with the existing AWS Workbook for the RBNZ’s Guidance on Cyber Resilience—continues our efforts to help AWS customers navigate the regulatory expectations of the Reserve Bank of New Zealand (RBNZ) in a shared responsibility environment.

This whitepaper is intended for RBNZ-regulated institutions that are looking to run material workloads in the AWS Cloud, and is particularly useful for leadership, security, risk, and compliance teams that need to understand RBNZ requirements and guidance.

The whitepaper summarizes RBNZ requirements and guidance related to outsourcing, cyber resilience, and the cloud. It also gives RBNZ-regulated institutions information they can use to commence their due diligence and assess how to implement the appropriate programs for their use of AWS cloud services.

This document joins existing guides for other jurisdictions in the Asia Pacific region, such as Australia, India, Singapore, and Hong Kong. As the regulatory environment continues to evolve, we’ll provide further updates on the AWS Security Blog and the AWS Compliance page. You can find more information on cloud-related regulatory compliance at the AWS Compliance Center. You can also reach out to your AWS account manager for help finding the resources you need.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Monitoring and alerting break-glass access in an AWS Organization

2022-05-27 Haresh Nandwani

Post Syndicated from Haresh Nandwani original https://aws.amazon.com/blogs/architecture/monitoring-and-alerting-break-glass-access-in-an-aws-organization/

Organizations building enterprise-scale systems require the setup of a secure and governed landing zone to deploy and operate their systems. A landing zone is a starting point from which your organization can quickly launch and deploy workloads and applications with confidence in your security and infrastructure environment as described in What is a landing zone?. Nationwide Building Society (Nationwide) is the world’s largest building society. It is owned by its 16 million members and exists to serve their needs. The Society is one of the UK’s largest providers for mortgages, savings and current accounts, as well as being a major provider of ISAs, credit cards, personal loans, insurance, and investments.

For one of its business initiatives, Nationwide utilizes AWS Control Tower to build and operate their landing zone which provides a well-established pattern to set up and govern a secure, multi-account AWS environment. Nationwide operates in a highly regulated industry and our governance assurance requires adequate control of any privileged access to production line-of-business data or to resources which have access to them. We chose for this specific business initiative to deploy our landing zone using AWS Organizations, to benefit from ongoing account management and governance as aligned with AWS implementation best practices. We also utilized AWS Single Sign-On (AWS SSO) to create our workforce identities in AWS once and manage access centrally across our AWS Organization. In this blog, we describe the integrations required across AWS Control Tower and AWS SSO to implement a break-glass mechanism that makes access reporting publishable to system operators as well as to internal audit systems and processes. We will outline how we used AWS SSO for our setup as well as the three architecture options we considered, and why we went with the chosen solution.

Sourcing AWS SSO access data for near real-time monitoring

In our setup, we have multiple AWS Accounts and multiple trails on each of these accounts. Users will regularly navigate across multiple accounts as they operate our infrastructure, and their journeys are marked across these multiple trails. Typically, AWS CloudTrail would be our chosen resource to clearly and unambiguously identify account or data access. The key challenge in this scenario was to design an efficient and cost-effective solution to scan these trails to help identify and report on break-glass user access to account and production data. To address this challenge, we developed the following two architecture design options.

Option 1: A decentralized approach that uses AWS CloudFormation StackSets, Amazon EventBridge and AWS Lambda

Our solution entailed a decentralized approach by deploying a CloudFormation StackSet to create, update, or delete stacks across multiple accounts and AWS Regions with a single operation. The Stackset created Amazon EventBridge rules and target AWS Lambda functions. These functions post to EventBridge in our audit account. Our audit account has a set of Lambda functions running off EventBridge to initiate specific events, format the event message and post to Slack, our centralized communication platform for this implementation. Figure 1 depicts the overall architecture for this option.

Figure 1. De-centralized logging using Amazon EventBridge and AWS Lambda

Option 2: Use an organization trail in the Organization Management account

This option uses the centralized organization trail in the Organization Management account to source audit data. Details of how to create an organization trail can be found in the AWS CloudTrail User Guide. CloudTrail was configured to send log events to CloudWatch Logs. These events are then sent via Lambda functions to Slack using webhooks. We used a public terraform module in this GitHub repository to build this Lambda Slack integration. Figure 2 depicts the overall architecture for this option.

Figure 2. Centralized logging pattern using Amazon CloudWatch

This was our preferred option and is the one we finally implemented.

We also evaluated a third option which was to use centralized logging and auditing feature enabled by Control Tower. Users authenticate and federate to target accounts from a central location so it seemed possible to source this info from the centralized logs. These log events arrive as .gz compressed json objects, which meant having to expand these archives repeatedly for inspection. We therefore decided against this option.

A centralized, economic, extensible solution to alert of SSO break-glass

Our requirement was to identify break-glass access across any of the access mechanisms supported by AWS, including CLI and User Portal access. To ensure we have comprehensive coverage across all access mechanisms, we identified all the events initiated for each access mechanism:

User Portal/AWS Console access events
- Authenticate
- ListApplications
- ListApplicationProfiles
- Federate – this event contains the role that the user is federating into
CLI access events
- CreateToken
- ListAccounts
- ListAccountRoles
- GetRoleCredentials – this event contains the role that the user is federating into

EventBridge is able to initiate actions after events only when the event is trying to perform changes (when the “readOnly” attribute on the event record body equals “false”).

The AWS support team was aware of this attribute and recommended that we, change the data flow we were using to one able to initiate actions after any kind of event, regardless of the value on its readOnly attribute. The solution in our case was to send the CloudTrail logs to CloudWatch Logs. This then and initiates the Lambda function through a filter subscription that detects the desired event names on the log content.

The filter used is as follows:

{($.eventSource = sso.amazonaws.com) && ($.eventName = Federate||$.eventName = GetRoleCredentials)}

Due to the query size in the CloudWatch Log queries we had to remove the subscription filters and do the parsing of the content of the log lines inside the lambda function. In order to determine what accounts would initiate the notifications, we sent the list of accounts and roles to it as an environment variable at runtime.

Considerations with cross-account SSO access

With direct federation users get an access token. This is most obvious in AWS single sign on at the chiclet page as “Command line or programmatic access”. SSO tokens have a limited lifetime (we use the default 1-hour). A user does not have to get a new token to access a target resource until the one they are using is expired. This means that a user may repeatedly access a target account using the same token during its lifetime. Although the token is made available at the chiclet page, the GetRoleCredentials event does not occur until it is used to authenticate an API call to the target AWS account.

Conclusion

In this blog, we discussed how AWS Control Tower and AWS Single Sign-on enabled Nationwide to build and govern a secure, multi-account AWS environment for one of their business initiatives and centralize access management across our implementation. The integration was important for us to accurately and comprehensively identify and audit break-glass access for our implementation. As a result, we were able to satisfy our security and compliance audit requirements for privileged access to our AWS accounts.

AWS User Guide to Financial Services Regulations and Guidelines in Switzerland and FINMA workbooks publications

2022-02-17 Margo Cronin

Post Syndicated from Margo Cronin original https://aws.amazon.com/blogs/security/aws-user-guide-to-financial-services-regulations-and-guidelines-in-switzerland-and-finma/

AWS is pleased to announce the publication of the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland whitepaper and workbooks.

This guide refers to certain rules applicable to financial institutions in Switzerland, including banks, insurance companies, stock exchanges, securities dealers, portfolio managers, trustees and other financial entities which are overseen (directly or indirectly) by the Swiss Financial Market Supervisory Authority (FINMA).

Amongst other topics, this guide covers requirements created by the following regulations and publications of interest to Swiss financial institutions:

Federal Laws – including Article 47 of the Swiss Banking Act (BA). Banks and Savings Banks are overseen by FINMA and governed by the BA (Bundesgesetz über die Banken und Sparkassen, Bankengesetz, BankG). Article 47 BA holds relevance in the context of outsourcing.
Response on Cloud Guidelines for Swiss Financial institutions produced by the Swiss Banking Union, Schweizerische Bankiervereinigung SBVg.
Controls outlined by FINMA, Switzerland’s independent regulator of financial markets, that may be applicable to Swiss banks and insurers in the context of outsourcing arrangements to the cloud.

In combination with the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland whitepaper, customers can use the detailed AWS FINMA workbooks and ISAE 3000 report available from AWS Artifact.

The five core FINMA circulars are intended to assist Swiss-regulated financial institutions in understanding approaches to due diligence, third-party management, and key technical and organizational controls that should be implemented in cloud outsourcing arrangements, particularly for material workloads. The AWS FINMA workbooks and ISAE 3000 report scope covers, in detail, requirements of the following FINMA circulars:

2018/03 Outsourcing – banks and insurers (04.11.2020)
2008/21 Operational Risks – Banks – Principle 4 Technology Infrastructure (31.10.2019)
2008/21 Operational Risks – Banks – Appendix 3 Handling of electronic Client Identifying Data (31.10.2019)
2013/03 Auditing – Information Technology (04.11.2020)
2008/10 Self-regulation as a minimum standard – Minimum Business Continuity Management (BCM) minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

Customers can use the detailed FINMA workbooks, which include detailed control mappings for each FINMA control, covering both the AWS control activities and the Customer User Entity Controls. Where applicable, under the AWS Shared Responsibility Model, these workbooks provide industry standard practices, incorporating AWS Well-Architected, to assist Swiss customers in their own preparation for FINMA circular alignment.

This whitepaper follows the issuance of the second Swiss Financial Market Supervisory Authority (FINMA) ISAE 3000 Type 2 attestation report. The latest report covers the period from October 1, 2020 to September 30, 2021, with a total of 141 AWS services and 23 global AWS Regions included in the scope. Customers can download the report from AWS Artifact. A full list of certified services and Regions is presented within the published FINMA report.

As always, AWS is committed to bringing new services into the scope of our FINMA program in the future based on customers’ architectural and regulatory needs. Please reach out to your AWS account team if you have any questions or feedback. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Doing more with less: Moving from transactional to stateful batch processing

2022-02-09 Tom Jin

Post Syndicated from Tom Jin original https://aws.amazon.com/blogs/big-data/doing-more-with-less-moving-from-transactional-to-stateful-batch-processing/

Amazon processes hundreds of millions of financial transactions each day, including accounts receivable, accounts payable, royalties, amortizations, and remittances, from over a hundred different business entities. All of this data is sent to the eCommerce Financial Integration (eCFI) systems, where they are recorded in the subledger.

Ensuring complete financial reconciliation at this scale is critical to day-to-day accounting operations. With transaction volumes exhibiting double-digit percentage growth each year, we found that our legacy transactional-based financial reconciliation architecture proved too expensive to scale and lacked the right level of visibility for our operational needs.

In this post, we show you how we migrated to a batch processing system, built on AWS, that consumes time-bounded batches of events. This not only reduced costs by almost 90%, but also improved visibility into our end-to-end processing flow. The code used for this post is available on GitHub.

Legacy architecture

Our legacy architecture primarily utilized Amazon Elastic Compute Cloud (Amazon EC2) to group related financial events into stateful artifacts. However, a stateful artifact could refer to any persistent artifact, such as a database entry or an Amazon Simple Storage Service (Amazon S3) object.

We found this approach resulted in deficiencies in the following areas:

Cost – Individually storing hundreds of millions of financial events per day in Amazon S3 resulted in high I/O and Amazon EC2 compute resource costs.
Data completeness – Different events flowed through the system at different speeds. For instance, while a small stateful artifact for a single customer order could be recorded in a couple of seconds, the stateful artifact for a bulk shipment containing a million lines might require several hours to update fully. This made it difficult to know whether all the data had been processed for a given time range.
Complex retry mechanisms – Financial events were passed between legacy systems using individual network calls, wrapped in a backoff retry strategy. Still, network timeouts, throttling, or traffic spikes could result in some events erroring out. This required us to build a separate service to sideline, manage, and retry problematic events at a later date.
Scalability – Bottlenecks occurred when different events competed to update the same stateful artifact. This resulted in excessive retries or redundant updates, making it less cost-effective as the system grew.
Operational support – Using dedicated EC2 instances meant that we needed to take valuable development time to manage OS patching, handle host failures, and schedule deployments.

The following diagram illustrates our legacy architecture.

Transactional-based legacy architecture

Evolution is key

Our new architecture needed to address the deficiencies while preserving the core goal of our service: update stateful artifacts based on incoming financial events. In our case, a stateful artifact refers to a group of related financial transactions used for reconciliation. We considered the following as part of the evolution of our stack:

Stateless and stateful separation
Minimized end-to-end latency
Scalability

Stateless and stateful separation

In our transactional system, each ingested event results in an update to a stateful artifact. This became a problem when thousands of events came in all at once for the same stateful artifact.

However, by ingesting batches of data, we had the opportunity to create separate stateless and stateful processing components. The stateless component performs an initial reduce operation on the input batch to group together related events. This meant that the rest of our system could operate on these smaller stateless artifacts and perform fewer write operations (fewer operations means lower costs).

The stateful component would then join these stateless artifacts with existing stateful artifacts to produce an updated stateful artifact.

As an example, imagine an online retailer suddenly received thousands of purchases for a popular item. Instead of updating an item database entry thousands of times, we can first produce a single stateless artifact that summaries the latest purchases. The item entry can now be updated one time with the stateless artifact, reducing the update bottleneck. The following diagram illustrates this process.

Batch visualization

Minimized end-to-end latency

Unlike traditional extract, transform, and load (ETL) jobs, we didn’t want to perform daily or even hourly extracts. Our accountants need to be able to access the updated stateful artifacts within minutes of data arriving in our system. For instance, if they had manually sent a correction line, they wanted to be able to check within the same hour that their adjustment had the intended effect on the targeted stateful artifact instead of waiting until the next day. As such, we focused on parallelizing the incoming batches of data as much as possible by breaking down the individual tasks of the stateful component into subcomponents. Each subcomponent could run independently of each other, which allowed us to process multiple batches in an assembly line format.

Scalability

Both the stateless and stateful components needed to respond to shifting traffic patterns and possible input batch backlogs. We also wanted to incorporate serverless compute to better respond to scale while reducing the overhead of maintaining an instance fleet.

This meant we couldn’t simply have a one-to-one mapping between the input batch and stateless artifact. Instead, we built flexibility into our service so the stateless component could automatically detect a backlog of input batches and group multiple input batches together in one job. Similar backlog management logic was applied to the stateful component. The following diagram illustrates this process.

Batch scalability

Current architecture

To meet our needs, we combined multiple AWS products:

AWS Step Functions – Orchestration of our stateless and stateful workflows
Amazon EMR – Apache Spark operations on our stateless and stateful artifacts
AWS Lambda – Stateful artifact indexing and orchestration backlog management
Amazon ElastiCache – Optimizing Amazon S3 request latency
Amazon S3 – Scalable storage of our stateless and stateful artifacts
Amazon DynamoDB – Stateless and stateful artifact index

The following diagram illustrates our current architecture.

Current architecture

The following diagram shows our stateless and stateful workflow.

Flowchart

The AWS CloudFormation template to render this architecture and corresponding Java code is available in the following GitHub repo.

Stateless workflow

We used an Apache Spark application on a long-running Amazon EMR cluster to simultaneously ingest input batch data and perform reduce operations to produce the stateless artifacts and a corresponding index file for the stateful processing to use.

We chose Amazon EMR for its proven highly available data-processing capability in a production setting and also its ability to horizontally scale when we see increased traffic loads. Most importantly, Amazon EMR had lower cost and better operational support when compared to a self-managed cluster.

Stateful workflow

Each stateful workflow performs operations to create or update millions of stateful artifacts using the stateless artifacts. Similar to the stateless workflows, all stateful artifacts are stored in Amazon S3 across a handful of Apache Spark part-files. This alone resulted in a huge cost reduction, because we significantly reduced the number of Amazon S3 writes (while using the same amount of overall storage). For instance, storing 10 million individual artifacts using the transactional legacy architecture would cost $50 in PUT requests alone, whereas 10 Apache Spark part-files would cost only $0.00005 in PUT requests (based on $0.005 per 1,000 requests).

However, we still needed a way to retrieve individual stateful artifacts, because any stateful artifact could be updated at any point in the future. To do this, we turned to DynamoDB. DynamoDB is a fully managed and scalable key-value and document database. It’s ideal for our access pattern because we wanted to index the location of each stateful artifact in the stateful output file using its unique identifier as a primary key. We used DynamoDB to index the location of each stateful artifact within the stateful output file. For instance, if our artifact represented orders, we would use the order ID (which has high cardinality) as the partition key, and store the file location, byte offset, and byte length of each order as separate attributes. By passing the byte-range in Amazon S3 GET requests, we can now fetch individual stateful artifacts as if they were stored independently. We were less concerned about optimizing the number of Amazon S3 GET requests because the GET requests are over 10 times cheaper than PUT requests.

Overall, this stateful logic was split across three serial subcomponents, which meant that three separate stateful workflows could be operating at any given time.

Pre-fetcher

The following diagram illustrates our pre-fetcher subcomponent.

Prefetcher architecture

The pre-fetcher subcomponent uses the stateless index file to retrieve pre-existing stateful artifacts that should be updated. These might be previous shipments for the same customer order, or past inventory movements for the same warehouse. For this, we turn once again to Amazon EMR to perform this high-throughput fetch operation.

Each fetch required a DynamoDB lookup and an Amazon S3 GET partial byte-range request. Due to the large number of external calls, fetches were highly parallelized using a thread pool contained within an Apache Spark flatMap operation. Pre-fetched stateful artifacts were consolidated into an output file that was later used as input to the stateful processing engine.

Stateful processing engine

The following diagram illustrates the stateful processing engine.

Stateful processor architecture

The stateful processing engine subcomponent joins the pre-fetched stateful artifacts with the stateless artifacts to produce updated stateful artifacts after applying custom business logic. The updated stateful artifacts are written out across multiple Apache Spark part-files.

Because stateful artifacts could have been indexed at the same time that they were pre-fetched (also called in-flight updates), the stateful processor also joins recently processed Apache Spark part-files.

We again used Amazon EMR here to take advantage of the Apache Spark operations that are required to join the stateless and stateful artifacts.

State indexer

The following diagram illustrates the state indexer.

State Indexer architecture

This Lambda-based subcomponent records the location of each stateful artifact within the stateful part-file in DynamoDB. The state indexer also caches the stateful artifacts in an Amazon ElastiCache for Redis cluster to provide a performance boost in the Amazon S3 GET requests performed by the pre-fetcher.

However, even with a thread pool, a single Lambda function isn’t powerful enough to index millions of stateful artifacts within the 15-minute time limit. Instead, we employ a cluster of Lambda functions. The state indexer begins with a single coordinator Lambda function, which determines the number of worker functions that are needed. For instance, if 100 part-files are generated by the stateful processing engine, then the coordinator might assign five part-files for each of the 20 Lambda worker functions to work on. This method is highly scalable because we can dynamically assign more or fewer Lambda workers as required.

Each Lambda worker then performs the ElastiCache and DynamoDB writes for all the stateful artifacts within each assigned part-file in a multi-threaded manner. The coordinator function monitors the health of each Lambda worker and restarts workers as needed.

Distributed Lambda architecture

Orchestration

We used Step Functions to coordinate each of the stateless and stateful workflows, as shown in the following diagram.

Step Function Workflow

Every time a new workflow step ran, the step was recorded in a DynamoDB table via a Lambda function. This table not only maintained the order in which stateful batches should be run, but it also formed the basis of the backlog management system, which directed the stateless ingestion engine to group more or fewer input batches together depending on the backlog.

We chose Step Functions for its native integration with many AWS services (including triggering by an Amazon CloudWatch scheduled event rule and adding Amazon EMR steps) and its built-in support for backoff retries and complex state machine logic. For instance, we defined different backoff retry rates based on the type of error.

Conclusion

Our batch-based architecture helped us overcome the transactional processing limitations we originally set out to resolve:

Reduced cost – We have been able to scale to thousands of workflows and hundreds of million events per day using only three or four core nodes per EMR cluster. This reduced our Amazon EC2 usage by over 90% when compared with a similar transactional system. Additionally, writing out batches instead of individual transactions reduced the number of Amazon S3 PUT requests by over 99.8%.
Data completeness guarantees – Because each input batch is associated with a time interval, when a batch has finished processing, we know that all events in that time interval have been completed.
Simplified retry mechanisms – Batch processing means that failures occur at the batch level and can be retried directly through the workflow. Because there are far fewer batches than transactions, batch retries are much more manageable. For instance, in our service, a typical batch contains about two million entries. During a service outage, only a single batch needs to be retried, as opposed to two million individual entries in the legacy architecture.
High scalability – We’ve been impressed with how easy it is to scale our EMR clusters on the fly if we detect an increase in traffic. Using Amazon EMR instance fleets also helps us automatically choose the most cost-effective instances across different Availability Zones. We also like the performance achieved by our Lambda-based state indexer. This subcomponent not only dynamically scales with no human intervention, but has also been surprisingly cost-efficient. A large portion of our usage has fallen within the free tier.
Operational excellence – Replacing traditional hosts with serverless components such as Lambda allowed us to spend less time on compliance tickets and focus more on delivering features for our customers.

We are particularly excited about the investments we have made moving from a transactional-based system to a batch processing system, especially our shift from using Amazon EC2 to using serverless Lambda and big data Amazon EMR services. This experience demonstrates that even services originally built on AWS can still achieve cost reductions and improve performance by rethinking how AWS services are used.

Inspired by our progress, our team is moving to replace many other legacy services with serverless components. Likewise, we hope that other engineering teams can learn from our experience, continue to innovate, and do more with less.

Find the code used for this post in the following GitHub repository.

Special thanks to development team: Ryan Schwartz, Abhishek Sahay, Cecilia Cho, Godot Bian, Sam Lam, Jean-Christophe Libbrecht, and Nicholas Leong.

About the Authors

Tom Jin is a Senior Software Engineer for eCommerce Financial Integration (eCFI) at Amazon. His interests include building large-scale systems and applying machine learning to healthcare applications. He is based in Vancouver, Canada and is a fan of ocean conservation.

Karthik Odapally is a Senior Solutions Architect at AWS supporting our Gaming Customers. He loves presenting at external conferences like AWS Re:Invent, and helping customers learn about AWS. His passion outside of work is to bake cookies and bread for family and friends here in the PNW. In his spare time, he plays Legend of Zelda (Link’s Awakening) with his 4 yr old daughter.

Financial Crime Discovery using Amazon EKS and Graph Databases

2022-02-01 Severin Gassauer-Fleissner

Post Syndicated from Severin Gassauer-Fleissner original https://aws.amazon.com/blogs/architecture/financial-crime-discovery-using-amazon-eks-and-graph-databases/

Discovering and solving financial crimes has become a challenge due to an increasing amount of financial data. While storing transactional payment data in a structured table format is useful for searching, filtering, and calculations, it is not always an ideal way to represent transactional data. For example, determining if there is a suspicious financial relationship between entity A and entity B is difficult to visualize in a table. Using tables, we would have to do SQL joins for every possible transaction from entity A to every possible subsequent transaction. We would then have to iterate this process until we found a relationship to entity B. Moreover, certain queries are challenging to run on a relational database management system (RDBMS). For example, it can be quite time consuming to discover which account received a minimum amount of $10,000,000 from other accounts.

Graph databases such as Amazon Neptune can be helpful with performing queries, because they can traverse the data and perform calculations simultaneously. Graphs enable us to represent transactions and parties over a multi-connected network, and discover patterns and chains of connections. It is common to use them in anti-money laundering (AML) applications, as they can help find patterns of suspicious transactions.

We needed a solution that could scale and process millions of transactions, by effectively using high memory and CPU configurations to perform complex queries quickly. As part of our customer demonstration to show how graph databases can help discover financial crimes, we also sourced a large dataset on which to test the solution. We used a graph database, Amazon Elastic Kubernetes Service (EKS), and Amazon Neptune, to search for suspicious financial chains across large amounts of transactional data in minutes.

Overview of our conceptual financial crime discovery solution

Figure 1. Workflow for financial crime discovery

We first needed to find a rules engine that could perform transaction inferencing and reasoning. It had to be able to process various rules on our data, be efficient, and able to scale. Next, we needed a straightforward way to ingest data into the solution. Once we had the data available, we needed to initiate a task to begin our inference job. Finally, we needed a location to store the result for further analysis and persistence, see Figure 1.

Using the AWS Cloud to scale up a graph database

We used multiple AWS services to create a fully automated end-to-end batch-based transaction process, shown in Figure 2. We used RDFox, which is an AWS Marketplace product, created by Oxford Semantic Technologies. RDFox is a high-performance in-memory graph database and semantic reasoner. To orchestrate RDFox, we used Amazon EKS Autoscaler to spin up a cluster to instantiate the RDFox container. Amazon EKS can spin up multiple containers for difference inference jobs and recycle the resources when the job finishes. We also used Amazon Neptune, an Amazon managed distributed graph database that can store up to 64 TiB of the results for diagnosis and long-term retention.

The data is stored in Amazon S3 buckets, which provide a streamlined way to feed a large dataset for processing.

Figure 2. Architecture diagram for financial crime discovery

Financial crimes rules

The power of graphs can help discover financial crimes that are reflected in relationships and monetary transactions. To demonstrate this, we will write two rules to detect two scenarios:

Given two suspicious parties X and Y, find out if there is a transactional relationship between them, and if so, provide the chain that connects them. This is a common scenario that financial institutions must detect.
Given a chain identifying suspicious behavior, find out if the minimum transaction amount that reached the beneficiary exceeds a threshold of Z dollars.

Generating data

To generate data for testing, we used a synthetic data generator written in Python (see GitHub repo in References). The generator built two sets of graph artifacts – parties and transactions. Every transaction is being paired with two random parties, and this iterative process creates a network of connected transactions and parties.

We created a dataset with a small percentage (0.01%) of parties tagged as “Suspicious Party,” to simulate the preceding business scenario. Note that those parties will have transactions going both in and out. In some cases, this will collide with other suspicious parties and establish a chain. This method enables us to get simulated data without engineering the suspicious chains.
The test dataset used with this solution comprises 1M transactions and thousands of parties.
For more information on generating data, see GitHub: Transaction Chains Data Generator.

Walkthrough of the financial investigator workflow

Once deployed, this solution can assist investigators as follows:

An investigator places the transactions and party data (nt triples) in a subdirectory within the input bucket. Typically, subdirectories can be named as a date or range of dates. In addition, the investigator uploads the particular rules (dlog files) and queries (rq files) that must be processed on the data.
Once the data is ready, the investigator uploads a job spec file (simple JSON, see References section). This contains the description of what resources the job requires (CPUs and memory), along with other configurations for the job.
Once the job spec has been uploaded to the bucket’s subdirectory, the job is automatically initiated. The Kubernetes scheduler will allocate enough resources to initiate the RDFox pod. The containers in the pod will then load, process, query results, and upload them to the output bucket.
Once the data reaches the output bucket, an AWS Lambda function is initiated. This invokes the Amazon Neptune Bulk Loader, which asynchronously loads the results to the Neptune cluster in a parallelized manner.
Once the load completes, the investigator gets an email notification that the job has been completed, and the results are ready for view.

Additionally:

The investigator can upload multiple rules and queries, they will all be processed automatically.
The investigator can launch multiple jobs with different/same data, and with different rules and queries at the same time.
All jobs outputs are saved in a unique job ID subdirectory in the output bucket.

To create the solution in your account, follow the instruction here: GitHub repo

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account
RDFox license, request a free trial
Terraform
Familiarity with Kubernetes, Amazon Neptune, and SPARQL queries

Rules

We create two materialization rule sets to fetch the two scenarios described.

1. detect-suspicious-parties-pair.dlog

The purpose of this first set of rules is to detect chains that might exist between two suspicious parties. The idea of these chains is to represent all the possible relationships that contain monetary transfers between a suspicious originator and the beneficiary. This will include non-suspicious parties in the chain. The rule tags these chains with the “SuspiciousChain” flag.

2. detect-chains-exceed-100-dollars.dlog

This set of rules is designed to tag the chains identified by the first set of rules. It also contains a minimum amount of $100 passed to the beneficiary. We can change the amount to check for different compliance requirements. We tag those chains as “HighValueChain.”

Run the job and check your results

Now we can run our job, with the given data, rules, and two additional queries (to extract “SuspiciousChain” and “HighValueChain” respectively). The result of the queries will be loaded to Amazon Neptune automatically for persistent storage, and is made durably available for further analysis.

Let’s look at the results. The following query can be initiated against RDFox console or Amazon Neptune.

Visualization

PREFIX : <http://oxfordsemantic.tech/transactions/entities#>

PREFIX prop: <http://oxfordsemantic.tech/transactions/properties#>

PREFIX type: <http://oxfordsemantic.tech/transactions/classes#>

PREFIX tt: <http://oxfordsemantic.tech/transactions/tupletables#>

SELECT ?S ?P ?O WHERE {

?S a type:SuspiciousChain .

?S ?P ?O .

}

Figure 3. Visualizing suspicious chains

Whoa! Figure 3 might look complicated at first, but this is because we are visualizing every pair of suspicious parties that have a relationship with another suspicious party. Let’s filter the query to look only at a single particular chain, which exceeds a minimum of $100 to the beneficiary. The following query can be executed against RDFox console or Amazon Neptune.

PREFIX : <http://oxfordsemantic.tech/transactions/entities#>
PREFIX prop: <http://oxfordsemantic.tech/transactions/properties#>
PREFIX type: <http://oxfordsemantic.tech/transactions/classes#>
PREFIX tt: <http://oxfordsemantic.tech/transactions/tupletables#>

SELECT * WHERE {
?S ?P ?O
{
SELECT ?S WHERE {
?S a type:HighValueChain .

} Limit 1
}
}

Figure 4. Visualizing a particular chain

In Figure 4, we can see that Allison, the suspicious originator of the chain, has sent a transaction to Troy. Troy, who is not suspicious, sent the transaction to Karina. Karina is the suspicious beneficiary. We can also see additional information, such as the transaction amount that Karina received, and the chain length of 2 in this case.

In our testing, we were able to scale up to 500M transactions with 50M parties and process this in less than two hours! And we performed this at a significant lower cost when compared to running constant, fixed similar hardware.

Cleaning up

Follow Terraform cleanup instructions.

Conclusion

Graph databases are a powerful tool to apply reasoning on complex financial relationships. The combination of Amazon Web Services and the RDFox engine results in an automated, scalable, and cost-effective, thanks to the dynamic Kubernetes Cluster Autoscaler. Customers can use this solution and provide their investigators with a tool they can experiment and reason on financial transactions. This solution simplifies the process, and makes it easier to try different rules and queries on complex large data collections.

This blog post is written with Oxford Semantic.

References

Solution:

Solution GitHub repository
Data generator repository
Low-level design documents and architecture
Examples for submitting a job