Tag Archives: AWS

How to Migrate Your SMS Program to Amazon Pinpoint

Post Syndicated from Tyler Holmes original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-migrate-your-sms-program-to-amazon-pinpoint/

In the fast-paced realm of communication, where every second counts and attention spans are shorter than ever, the choice of channels that you use to deliver your message to your recipients is critical. While we often find ourselves swept away by the allure of flashy social media platforms and sleek email interfaces, it’s the unassuming text message, or SMS, that continually proves to be one of the most effective options. According to Statista, there are over 5 billion mobile internet users globally, amounting to over 60% of the earth’s population of ~8 billion. SMS provides an expansive reach that can help businesses connect with a diverse audience, but to do that at scale you need a service like Amazon Pinpoint, which can send SMS to over 240 countries and regions around the world. If you have a current SMS provider and are considering Pinpoint SMS for its global reach, scalability, cost-effective pricing, and demonstrably high deliverability, this guide will walk you through how to migrate from your current provider.

There are several common reasons our customers give us when considering a migration. Don’t worry if your situation doesn’t fit into a neat box; we help customers navigate the constantly evolving SMS landscape. Let’s look at each of the reasons below and what we commonly hear from customers about them.

  • My current provider doesn’t deliver to countries I want to send to
  • My current provider is more expensive than Pinpoint pricing
    • Our pricing is available on the public pricing page here. Each country has its own cost associated with it, so enter the countries you would like to see pricing for. These prices are per message sent, so if you are planning on sending to multiple countries, factor in the types of messages that you will want to send as well as the countries. If your use case includes 2-way communication, make sure to factor the number of inbound messages you expect into your calculations.
      • NOTE: Depending on the language, the available characters per message vary, which can affect your cost calculations. See here for an explanation.
  • My current provider doesn’t have features that Pinpoint has
    • Among many other features Pinpoint has the ability to send over multiple channels, including: SMS, Email, Push/In-App, Voice, Over the Top (OTT) services such as WhatsApp, as well as interact with third-party APIs giving you the flexibility to send to many other channels.
  • My current provider is not native to AWS
    • Pinpoint, being native to the AWS Cloud, boasts the capability to seamlessly integrate with a wide array of services, including AI/ML offerings such as Amazon Personalize, Amazon Bedrock, and Amazon SageMaker, among others. This means you can leverage various AWS services to create innovative solutions that enhance and optimize the communications sent through Pinpoint.
  • My current provider does not have good deliverability
    • Price is not the only factor to consider when looking at SMS providers. If you find another provider with lower pricing, make sure to ask about their deliverability to the countries you want to send to. There is a big difference between sending an SMS at a low price and actually delivering that SMS. We are happy to discuss deliverability with you: reach out to your Account Manager if you have one, or contact us to start a conversation about your migration.
  • I’m not happy with the customer support of my current provider
    • The SMS landscape is constantly changing and our SMS experts are here to help guide you through the process, whether that means regulatory changes, pricing changes, or creating complex architectures to support your needs. Reach out to your Account Manager if you have one, or contact us to start a conversation about your migration and get your questions answered.
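
The note above about available characters per message can be made concrete with a small estimator. This is a minimal sketch, not Pinpoint code: it assumes the standard GSM 03.38 7-bit alphabet (160 characters in a single message, 153 per part once a message is split) with a fallback to UCS-2 (70 and 67 respectively) as soon as any character falls outside that alphabet. The function name `count_message_parts` is hypothetical, and carrier or country specific encodings are not modelled.

```python
import math

# GSM 03.38 basic character set (each character uses one 7-bit septet).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# GSM 03.38 extension characters (escape + character, so two septets each).
GSM7_EXTENDED = "^{}\\[~]|€"


def count_message_parts(text: str) -> int:
    """Estimate how many SMS parts a message body is split into."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single_limit, multipart_limit = 160, 153
    else:
        # Any character outside GSM-7 forces UCS-2 for the whole message.
        units = len(text.encode("utf-16-be")) // 2
        single_limit, multipart_limit = 70, 67
    if units <= single_limit:
        return 1
    return math.ceil(units / multipart_limit)
```

Multiplying the estimated parts by your expected recipients per country, and then by the per-country price from the pricing page, gives a first approximation of cost that you can refine with your Account Manager.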

Regardless of your reason for considering a migration, there are four scenarios that most of our customers find themselves in when beginning to plan for an SMS migration.

I have not sent SMS before but I would like to start sending through Pinpoint
Skip ahead to the section on “Checklist for Planning an SMS Migration” to start planning for sending SMS.

I have number(s) (Also known as Originators, Origination Identities (OIDs), Toll-Free, 10DLC, Long Code, Short Code, and/or SenderID) with a different provider and I would like to move those to Pinpoint
The ability to “port” numbers from other providers depends on the type of originator, the vendor you procured them from, and the country that they support. You may need to get new originators, so factor that into your timeline and reach out to your Account Manager to determine whether your originators can be ported over. Once you have done that, pull the reports for how much volume you are sending to each country with your current provider and then skip ahead to the section on “Checklist for Planning an SMS Migration” to start planning for sending SMS.

I have a current provider but I would like to procure new numbers from Pinpoint
Pull the reports for how much volume you are sending to each country with your current provider and then skip ahead to the section on “Checklist for Planning an SMS Migration” to start planning for sending SMS.

I have a current provider but would like to split traffic between them and Pinpoint
Pull the reports for how much volume you are sending to the countries you plan on migrating to AWS and then skip ahead to the section on “Checklist for Planning an SMS Migration” to start planning for sending SMS. Make sure that you consider how you will be managing opt-outs across two providers. Pinpoint offers centrally managed opt-outs but self-management is also an option. All Delivery Receipts/Reporting (DLRs) and inbound/outbound events can be streamed through Amazon Kinesis, Amazon CloudWatch, and/or Amazon Simple Notification Service (SNS) if you need to send those events to another location inside or outside of the AWS Cloud.
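
If you split traffic and choose to manage opt-outs yourself, every send needs to honour opt-outs collected by either provider. The following is a minimal sketch of that idea only; the function names are hypothetical, the opt-out lists are assumed to be exported from each provider by some other means, and numbers are assumed to already include their country code.

```python
def normalize(number: str) -> str:
    """Reduce a phone number to '+' followed by its digits for comparison."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return "+" + digits


def sendable_recipients(recipients, *opt_out_lists):
    """Drop any recipient that appears in at least one provider's opt-out list."""
    suppressed = {normalize(n) for opt_outs in opt_out_lists for n in opt_outs}
    return [r for r in recipients if normalize(r) not in suppressed]


# Illustrative data: one list per provider.
current_provider_opt_outs = ["+12065550100"]
pinpoint_opt_outs = ["+44 7700 900123"]
audience = ["+1 (206) 555-0100", "+12065550101", "+447700900123"]
allowed = sendable_recipients(audience, current_provider_opt_outs, pinpoint_opt_outs)
```

Normalizing before comparing matters because the two providers may export the same number in different formats.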

Checklist for Planning an SMS Migration

  • Set up a spreadsheet similar to the one outlined in this post
  • Identify your use case(s)
    • Note whether your use case is one-way or two-way
      • NOTE: Not all countries support 2-way communications, which is the ability to have the recipient send a message back to the OID.
      • NOTE: Sender ID also does not support 2-way communication so if you are planning on using Sender ID you will need to account for how to opt recipients out of future communications.
  • Identify your countries
  • Identify your volume per country
    • If you are already sending SMS with another provider pull a report over a representative time period.
  • Identify your throughput needs (Also referred to as Messages per Second, MPS, Transactions per Second, or TPS) for each country
    • Most origination identities are chosen for their ability to support a certain level of MPS, not volume, so if you have seasonality make sure to account for burst rates. There are quotas for the APIs that govern sending as well as quotas for the different types of originators.
  • Identify which origination identities you will need for each country using this guide
    • Make note of any countries/OIDs that require registration
    • Reach out to your Account Manager if you have one or contact us to start a conversation about your migration.
    • If you have OIDs you would like to migrate make sure you determine whether that is possible ASAP since your timelines could be affected by the outcome.
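
To make the throughput item in the checklist concrete, the sketch below turns a burst (for example a seasonal peak) into a required MPS figure per country. It is an illustration under stated assumptions: the class and field names are hypothetical, the numbers are made up, and whether a quota counts messages or message parts depends on the originator type, so confirm that against the published quotas.

```python
import math
from dataclasses import dataclass


@dataclass
class CountryPlan:
    country: str                # destination country code
    burst_messages: int         # messages in the busiest sending window
    burst_window_s: int         # length of that window in seconds
    parts_per_message: int = 1  # assumed average parts per message

    def required_mps(self) -> int:
        """Smallest whole-number rate that clears the burst within the window."""
        total_parts = self.burst_messages * self.parts_per_message
        return math.ceil(total_parts / self.burst_window_s)

    def drain_time_s(self, originator_mps: int) -> float:
        """Seconds needed to send the burst at a given originator rate."""
        return self.burst_messages * self.parts_per_message / originator_mps


# Illustrative numbers: 90,000 messages that must go out within 15 minutes.
plan = CountryPlan(country="US", burst_messages=90_000, burst_window_s=900)
needed = plan.required_mps()
```

Comparing `required_mps()` with what each type of originator supports is one way to shortlist origination identities before you start any registration process.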

Make sure you give ample time for your migration. There are many entities involved in delivering SMS, from governments, to mobile carriers, to third-party registrars, and more, which means that timelines are not always within your control. Ask questions, and take advantage of the expert resources we have at AWS and the content we have produced around these topics.

Content to read

  • Review the countries and regions we support here
  • Use the format for aggregating information on your use cases outlined in this post here
  • Decide what origination IDs you will need here
  • Review the documentation for the V2 SMS and Voice API here
  • Review the Pinpoint API and SendMessage here
  • Check out the support tiers comparison here

Building a generative AI Marketing Portal on AWS

Post Syndicated from Tristan Nguyen original https://aws.amazon.com/blogs/messaging-and-targeting/building-a-generative-ai-marketing-portal-on-aws/

Introduction

In the preceding entries of this series, we examined the transformative impact of Generative AI on marketing strategies in “Building Generative AI into Marketing Strategies: A Primer” and delved into the intricacies of Prompt Engineering to enhance the creation of marketing content with services such as Amazon Bedrock in “From Prompt Engineering to Auto Prompt Optimisation”. We also explored the potential of Large Language Models (LLMs) to refine prompts for more effective customer engagement.

Continuing this exploration, we will articulate how Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint can be leveraged to construct a marketer portal that not only facilitates AI-driven content generation but also personalizes and distributes this content effectively. The aim is to provide a clear blueprint for deploying a system that crafts, personalizes, and distributes marketing content efficiently. This blog will guide you through the deployment process, underlining the real-world utility of these services in optimizing marketing workflows. Through use cases and a code demonstration, we’ll see these technologies in action, offering a hands-on perspective on enhancing your marketing pipeline with AI-driven solutions.

The Challenge with Content Generation in Marketing

Many companies struggle to streamline their marketing operations effectively, facing hurdles at various stages of the marketing operations pipeline. Below, we list the challenges at three main stages of the pipeline: content generation, content personalization, and content distribution.

Content Generation

Creating high-quality, engaging content is often easier said than done. Companies need to invest in skilled copywriters or content creators who understand not just the product but also the target audience. Even with the right talent, the process can be time-consuming and costly. Moreover, generating content at scale while maintaining quality and compliance with industry regulations is the key blocker for many companies considering adopting generative AI technologies in production environments.

Content Personalization

Once the content is created, the next hurdle is personalization. In today’s digital age, generic content rarely captures attention. Customers expect content tailored to their needs, preferences, and behaviors. However, personalizing content is not straightforward. It requires a deep understanding of customer data, which often resides in siloed databases, making it difficult to create a 360-degree view of the customer.

Content Distribution

Finally, even the most captivating, personalized content is ineffective if it doesn’t reach the right audience at the right time. Companies often grapple with choosing the appropriate channels for content distribution, be it email, social media, or mobile notifications. Additionally, ensuring that the content complies with various regulations and doesn’t end up in spam folders adds another layer of complexity to the distribution phase. Sending at scale requires paying attention to deliverability, security and reliability which often poses significant challenges to marketers.

By addressing these challenges, companies can significantly improve their marketing operations and empower their marketers to be more effective. But how can this be achieved efficiently and at scale? The answer lies in leveraging the power of Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint, as we will explore in the following solution.

The Solution In Action

Before we dive into the details of the implementation, let’s take a look at the end result through the linked demo video.

Use Case 1: Banking/Financial Services Industry

You are a relationship manager working in the Consumer Banking department of a fictitious company called AnyCompany Bank. You are assigned a group of customers and would like to send personalized and targeted communications to every member of this group through their channel of choice.

Behind the scenes, the marketer is utilizing Amazon Pinpoint to create the segment of customers they would like to target. The customers’ information and the marketer’s prompt are then fed into Amazon Bedrock to generate the marketing content, which is then sent to the customer via SMS and email using Amazon Pinpoint.
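
The step where customer information and the marketer’s prompt are combined can be pictured as simple template filling. The sketch below is only an illustration of that idea and is not taken from the portal’s code: the template text, the `build_request` function, and the customer field names are all hypothetical, and the call to Amazon Bedrock itself is left out.

```python
PROMPT_TEMPLATE = (
    "Write a {channel} in {language} for {name}, who works as {occupation}, "
    "promoting: {offer}. Keep it suitable for a {channel}."
)


def build_request(customer: dict, offer: str) -> dict:
    """Pick the channel and language for one customer and render the prompt."""
    channel = "SMS" if customer.get("preferred_channel") == "SMS" else "email"
    language = customer.get("preferred_language", "English")
    prompt = PROMPT_TEMPLATE.format(
        channel=channel,
        language=language,
        name=customer["name"],
        occupation=customer["occupation"],
        offer=offer,
    )
    return {"channel": channel, "language": language, "prompt": prompt}


# Illustrative customer record with hypothetical attribute names.
request = build_request(
    {"name": "Ana", "occupation": "management", "preferred_channel": "SMS"},
    offer="a savings account",
)
```

The rendered prompt would then be sent to the model, and the returned channel decides whether the result goes out as an SMS or an email.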

  • On the Prompt Iterator page, you can employ a process called “prompt engineering” to further optimize your prompt and maximize the effectiveness of your marketing campaigns. Please refer to this blog on the process behind engineering the prompt as well as how to apply an additional LLM for auto-prompting. To get started, simply copy the sample banking prompt on this page, which has already gone through the prompt engineering process.
  • Next, you can either upload your customer group by uploading a .csv file (through “Importing a Segment”) or specify a customer group using pre-defined filter criteria based on your current customer database using Amazon Pinpoint.

UseCase1Segment

E.g.: The screenshot shows a sample filtered segment named ManagementOrRetired that only filters to customers who are management or retirees.

  • Once done, you can log into the marketer portal and choose the relevant segment that you’ve just created within the Amazon Pinpoint console.

PinpointSegment

  • You can then preview the customers and their information stored in your Amazon Pinpoint’s customer database. Once satisfied, we’re ready to start generating content for those customers!
  • Click on the 1:1 Content Generator tab, and content is automatically generated for your first customer. Here, you can cycle through your customers one by one, and depending on the customer’s preferred language and channel, an email or SMS in the preferred language is automatically generated for them.
    • Generated SMS in English

PostiveSMS

    • A negative example showing prompt engineering at work to moderate content. This happens when we insert data that does not make sense for the marketing content generator. In this case, the generator (justifiably) refuses to output an advertisement for a secured instalment loan aimed at a 6-year-old.

NegativeSMS

  • Finally, we choose to send the generated content via Amazon Pinpoint by clicking on “Send with Amazon Pinpoint”. In the back end, Amazon Pinpoint will orchestrate the sending of the email/SMS through the appropriate channels.
    • Alternatively, if the auto-generated content still did not meet your needs and you want to generate another draft, you can Disagree and try again.

Use Case 2: Travel & Hospitality

You are a marketing executive working for an online air ticketing agency. You’ve been tasked with promoting a specific flight from Singapore to Hong Kong for AnyCompany Airlines. You’d first like to identify which customers would be prime candidates for this flight leg and then send hyper-personalized messages to them.

Behind the scenes, instead of using Amazon Pinpoint to manually define the segment, the marketer in this case is leveraging the AI/ML capabilities of Amazon Personalize to identify the best group of customers to recommend the specific flight leg to. Similar to the above use case, the customers’ information and LLM prompt are fed into Amazon Bedrock, which generates the marketing content that is eventually sent out via Amazon Pinpoint.

  • Similar to the above use case, you’d need to go through a prompt engineering process to ensure that the content the LLM is generating will be relevant and safe for use. To get started quickly, go to the Prompt Iterator page, where you can use the sample airlines prompt and iterate from there.
  • Your company offers many different flight legs, aggregated from many different carriers. You first filter down to the flight leg that you want to promote using the Filters on the left. In this case, we are filtering for flights originating from Singapore (SRCCity) and going to Hong Kong (DSTCity), operated by AnyCompany Airlines.

PersonalizeInstructions

  • Now, choose the number of customers that you’d like the generated segment to contain. Once satisfied, start the batch segmentation job.
  • In the background, Amazon Personalize generates a group of customers that are most likely to be interested in this flight leg based on past interactions with similar flight itineraries.
  • Once the segmentation job is finished as shown, you can fetch the recommended group of customers and start generating content for them immediately, similar to the first use case.

Setup instructions

The setup instructions and deployment details can be found in the GitHub link.

Conclusion

In this blog, we’ve explored the transformative potential of integrating Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint to address the common challenges in marketing operations. By automating the content generation with Amazon Bedrock, personalizing at scale with Amazon Personalize, and ensuring precise content distribution with Amazon Pinpoint, companies can not only streamline their marketing processes but also elevate the customer experience.

The benefits are clear: time-saving through automation, increased operational efficiency, and enhanced customer satisfaction through personalized engagement. This integrated solution empowers marketers to focus on strategy and creativity, leaving the heavy lifting to AWS’s robust AI and ML services.

For those ready to take the next step, we’ve provided a comprehensive guide and resources to implement this solution. By following the setup instructions and leveraging the provided prompts as a starting point, you can deploy this solution and begin customizing the marketer portal to your business’ needs.

Call to Action

Don’t let the challenges of content generation, personalization, and distribution hold back your marketing potential. Deploy the Generative AI Marketer Portal today, adapt it to your specific needs, and watch as your marketing operations transform. For a hands-on start and to see this solution in action, visit the GitHub repository for detailed setup instructions.

Have a question? Share your experiences or leave your questions in the comment section.

About the Authors

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Philipp Kaindl

Philipp Kaindl is a Senior Artificial Intelligence and Machine Learning Solutions Architect at AWS. With a background in data science and mechanical engineering, his focus is on empowering customers to create lasting business impact with the help of AI. Outside of work, Philipp enjoys tinkering with 3D printers, sailing and hiking.

Bruno Giorgini

Bruno Giorgini is a Senior Solutions Architect specializing in Pinpoint and SES. With over two decades of experience in the IT industry, Bruno has been dedicated to assisting customers of all sizes in achieving their objectives. When he is not crafting innovative solutions for clients, Bruno enjoys spending quality time with his wife and son, exploring the scenic hiking trails around the SF Bay Area.

Kafka on Kubernetes: Reloaded for fault tolerance

Post Syndicated from Grab Tech original https://engineering.grab.com/kafka-on-kubernetes

Introduction

Coban – Grab’s real-time data streaming platform – has been operating Kafka on Kubernetes with Strimzi in production for about two years. In a previous article (Zero trust with Kafka), we explained how we leveraged Strimzi to enhance the security of our data streaming offering.

In this article, we are going to describe how we improved the fault tolerance of our initial design, to the point where we no longer need to intervene if a Kafka broker is unexpectedly terminated.

Problem statement

We operate Kafka in the AWS Cloud. For the Kafka on Kubernetes design described in this article, we rely on Amazon Elastic Kubernetes Service (EKS), the managed Kubernetes offering by AWS, with the worker nodes deployed as self-managed nodes on Amazon Elastic Compute Cloud (EC2).

To make our operations easier and limit the blast radius of any incidents, we deploy exactly one Kafka cluster for each EKS cluster. We also give a full worker node to each Kafka broker. In terms of storage, we initially relied on EC2 instances with non-volatile memory express (NVMe) instance store volumes for maximal I/O performance. Also, each Kafka cluster is accessible beyond its own Virtual Private Cloud (VPC) via a VPC Endpoint Service.

Fig. 1 Initial design of a 3-node Kafka cluster running on Kubernetes.

Fig. 1 shows a logical view of our initial design of a 3-node Kafka on Kubernetes cluster, as typically run by Coban. The Zookeeper and Cruise-Control components are not shown for clarity.

There are four Kubernetes services (1): one for the initial connection – referred to as “bootstrap” – that redirects incoming traffic to any Kafka pod, plus one for each Kafka pod, for the clients to target each Kafka broker individually (a requirement to produce to or consume from a partition that resides on any particular Kafka broker). Four different listeners on the Network Load Balancer (NLB), listening on four different TCP ports, enable the Kafka clients to target either the bootstrap service or any particular Kafka broker they need to reach. This is very similar to what we previously described in Exposing a Kafka Cluster via a VPC Endpoint Service.

Each worker node hosts a single Kafka pod (2). The NVMe instance store volume is used to create a Kubernetes Persistent Volume (PV), attached to a pod via a Kubernetes Persistent Volume Claim (PVC).

Lastly, the worker nodes belong to Auto-Scaling Groups (ASG) (3), one per Availability Zone (AZ). Strimzi adds node affinity to make sure that the brokers are evenly distributed across AZs. In this initial design, the ASGs are not used for auto-scaling though, because we want to keep the size of the cluster under control. We only use ASGs – with a fixed size – to facilitate manual scaling operations and to automatically replace terminated worker nodes.

With this initial design, let us see what happens in case of such a worker node termination.

Fig. 2 Representation of a worker node termination. Node C is terminated and replaced by node D. However the Kafka broker 3 pod is unable to restart on node D.

Fig. 2 shows the worker node C being terminated along with its NVMe instance store volume C, and replaced (by the ASG) by a new worker node D and its new, empty NVMe instance store volume D. On start-up, the worker node D automatically joins the Kubernetes cluster. The Kafka broker 3 pod that was running on the faulty worker node C is scheduled to restart on the new worker node D.

Although the NVMe instance store volume C is terminated along with the worker node C, there is no data loss because all of our Kafka topics are configured with a minimum of three replicas. The data is poised to be copied over from the surviving Kafka brokers 1 and 2 back to Kafka broker 3, as soon as Kafka broker 3 is effectively restarted on the worker node D.

However, there are three fundamental issues with this initial design:

  1. The Kafka clients that were in the middle of producing or consuming to/from the partition leaders of Kafka broker 3 are suddenly facing connection errors, because the broker was not gracefully demoted beforehand.
  2. The target groups of the NLB for both the bootstrap connection and Kafka broker 3 still point to the worker node C. Therefore, the network communication from the NLB to Kafka broker 3 is broken. A manual reconfiguration of the target groups is required.
  3. The PVC associating the Kafka broker 3 pod with its instance store PV is unable to automatically switch to the new NVMe instance store volume of the worker node D. Indeed, static provisioning is an intrinsic characteristic of Kubernetes local volumes. The PVC is still in Bound state, so Kubernetes does not take any action. However, the actual storage beneath the PV does not exist anymore. Without any storage, the Kafka broker 3 pod is unable to start.

At this stage, the Kafka cluster is running in a degraded state with only two out of three brokers, until a Coban engineer intervenes to reconfigure the target groups of the NLB and delete the zombie PVC (this, in turn, triggers its re-creation by Strimzi, this time using the new instance store PV).

In the next section, we will see how we have managed to address the three issues mentioned above to make this design fault-tolerant.

Solution

Graceful Kafka shutdown

To minimise the disruption for the Kafka clients, we leveraged the AWS Node Termination Handler (NTH). This component provided by AWS for Kubernetes environments is able to cordon and drain a worker node that is going to be terminated. This draining, in turn, triggers a graceful shutdown of the Kafka process by sending a polite SIGTERM signal to all pods running on the worker node that is being drained (instead of the brutal SIGKILL of a normal termination).

The termination events of interest that are captured by the NTH are:

  • Scale-in operations by an ASG.
  • Manual termination of an instance.
  • AWS maintenance events, typically EC2 instances scheduled for upcoming retirement.

This suffices for most of the disruptions our clusters can face in normal times and our common maintenance operations, such as terminating a worker node to refresh it. Only sudden hardware failures (AWS issue events) would fall through the cracks and still trigger errors on the Kafka client side.

The NTH comes in two modes: Instance Metadata Service (IMDS) and Queue Processor. We chose to go with the latter as it is able to capture a broader range of events, widening the fault tolerance capability.

Scale-in operations by an ASG

Fig. 3 Architecture of the NTH with the Queue Processor.

Fig. 3 shows the NTH with the Queue Processor in action, and how it reacts to a scale-in operation (typically triggered manually, during a maintenance operation):

  1. As soon as the scale-in operation is triggered, an Auto Scaling lifecycle hook is invoked to pause the termination of the instance.
  2. Simultaneously, an Auto Scaling lifecycle hook event is issued to an Amazon Simple Queue Service (SQS) queue. In Fig. 3, we have also materialised EC2 events (e.g. manual termination of an instance, AWS maintenance events, etc.) that transit via Amazon EventBridge to eventually end up in the same SQS queue. We will discuss EC2 events in the next two sections.
  3. The NTH, a pod running in the Kubernetes cluster itself, constantly polls that SQS queue.
  4. When a scale-in event pertaining to a worker node of the Kubernetes cluster is read from the SQS queue, the NTH sends to the Kubernetes API the instruction to cordon and drain the impacted worker node.
  5. On draining, Kubernetes sends a SIGTERM signal to the Kafka pod residing on the worker node.
  6. Upon receiving the SIGTERM signal, the Kafka pod gracefully migrates the leadership of its leader partitions to other brokers of the cluster before shutting down, in a transparent manner for the clients. This behaviour is ensured by the controlled.shutdown.enable parameter of Kafka, which is enabled by default.
  7. Once the impacted worker node has been drained, the NTH eventually resumes the termination of the instance.
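
The ordering of steps 3 to 7 above can be illustrated with a toy model. This is not the NTH implementation and it talks to neither SQS nor the Kubernetes API: the queue is an in-memory deque, and the `WorkerNode`, `ToyCluster` and `process_queue` names are hypothetical stand-ins used only to show the sequence of actions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    instance_id: str
    pods: list
    schedulable: bool = True


@dataclass
class ToyCluster:
    nodes: dict                      # instance_id -> WorkerNode
    log: list = field(default_factory=list)

    def cordon(self, node):
        node.schedulable = False
        self.log.append(f"cordon {node.instance_id}")

    def drain(self, node):
        # Draining evicts pods; each receives SIGTERM and shuts down gracefully.
        for pod in node.pods:
            self.log.append(f"SIGTERM {pod}")
        node.pods.clear()
        self.log.append(f"drained {node.instance_id}")


def process_queue(queue, cluster):
    """Handle every pending event, mirroring steps 3 to 7 of the list above."""
    while queue:
        event = queue.popleft()                          # step 3: read the queue
        node = cluster.nodes.get(event["instance_id"])
        if node is None:
            continue                                     # not one of our worker nodes
        cluster.cordon(node)                             # step 4
        cluster.drain(node)                              # steps 5 and 6
        cluster.log.append(f"resume-termination {node.instance_id}")  # step 7


cluster = ToyCluster(nodes={"i-c": WorkerNode("i-c", ["kafka-2"])})
events = deque([{"instance_id": "i-unrelated"}, {"instance_id": "i-c"}])
process_queue(events, cluster)
```

The point of the model is the ordering: the termination is only resumed after the node has been cordoned and its Kafka pod has been given the chance to shut down gracefully.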

Strimzi also comes with a terminationGracePeriodSeconds parameter, which we have set to 180 seconds to give the Kafka pods enough time to migrate all of their partition leaders gracefully on termination. We have verified that this is enough to migrate all partition leaders on our Kafka clusters (about 60 seconds for 600 partition leaders).
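
As a rough sanity check on that grace period, the observed figures can be turned into a migration rate and a headroom estimate. This is a back-of-the-envelope sketch that assumes migration time scales linearly with the number of partition leaders, which is a simplification; the function name is hypothetical.

```python
def migration_headroom(observed_leaders, observed_seconds, grace_period_seconds):
    """Return (leaders per second, leaders that fit in the grace period, safety factor)."""
    rate = observed_leaders / observed_seconds
    max_leaders = int(rate * grace_period_seconds)
    safety_factor = grace_period_seconds / observed_seconds
    return rate, max_leaders, safety_factor


# Figures from the text: about 60 seconds for 600 leaders, 180 seconds of grace.
rate, max_leaders, safety_factor = migration_headroom(600, 60, 180)
```

Under that linear assumption, the 180-second setting leaves a factor of three over the observed migration time.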

Manual termination of an instance

The Auto Scaling lifecycle hook that pauses the termination of an instance (Fig. 3, step 1) as well as the corresponding resuming by the NTH (Fig. 3, step 7) are invoked only for ASG scaling events.

In case of a manual termination of an EC2 instance, the termination is captured as an EC2 event that also reaches the NTH. Upon receiving that event, the NTH cordons and drains the impacted worker node. However, the instance is immediately terminated, most likely before the leadership of all of its Kafka partition leaders has had the time to get migrated to other brokers.

To work around this and let a manual termination of an EC2 instance also benefit from the ASG lifecycle hook, the instance must be terminated using the terminate-instance-in-auto-scaling-group AWS CLI command.
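
For teams that script this step rather than typing the CLI command, the same Auto Scaling API call is available in boto3 as `terminate_instance_in_auto_scaling_group`. The sketch below is an assumption about how one might wrap it, not something taken from our tooling: the helper names are hypothetical, and `ShouldDecrementDesiredCapacity=False` is chosen on the assumption that the ASG should replace the node so the cluster keeps its fixed size.

```python
def build_terminate_request(instance_id: str) -> dict:
    """Parameters for terminating an instance through its ASG."""
    return {
        "InstanceId": instance_id,
        # Keep the desired capacity so the ASG launches a replacement node.
        "ShouldDecrementDesiredCapacity": False,
    }


def terminate_via_asg(instance_id: str):
    """Terminate through the ASG so that the lifecycle hook is invoked."""
    import boto3  # imported lazily; requires boto3 and AWS credentials

    client = boto3.client("autoscaling")
    return client.terminate_instance_in_auto_scaling_group(
        **build_terminate_request(instance_id)
    )
```

Calling `terminate_via_asg("i-0123456789abcdef0")` (a placeholder instance ID) goes through the ASG, so the lifecycle hook pauses the termination until the NTH has drained the node.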

AWS maintenance events

For AWS maintenance events such as instances scheduled for upcoming retirement, the NTH acts immediately when the event is first received (typically well in advance). It cordons and drains the soon-to-be-retired worker node, which in turn triggers the SIGTERM signal and the graceful termination of Kafka as described above. At this stage, the impacted instance is not terminated, so the Kafka partition leaders have plenty of time to complete their migration to other brokers.

However, the evicted Kafka pod has nowhere to go. There is a need for spinning up a new worker node for it to be able to eventually restart somewhere.

To make this happen seamlessly, we doubled the maximum size of each of our ASGs and installed the Kubernetes Cluster Autoscaler. With that, when such a maintenance event is received:

  • The worker node scheduled for retirement is cordoned and drained by the NTH. The state of the impacted Kafka pod becomes Pending.
  • The Kubernetes Cluster Autoscaler comes into play and triggers the corresponding ASG to spin up a new EC2 instance that joins the Kubernetes cluster as a new worker node.
  • The impacted Kafka pod restarts on the new worker node.
  • The Kubernetes Cluster Autoscaler detects that the previous worker node is now under-utilised and terminates it.

In this scenario, the impacted Kafka pod only remains in Pending state for about four minutes in total.

In case of multiple simultaneous AWS maintenance events, the Kubernetes scheduler would honour our PodDisruptionBudget and not evict more than one Kafka pod at a time.

Dynamic NLB configuration

To automatically map the NLB’s target groups with a newly spun up EC2 instance, we leveraged the AWS Load Balancer Controller (LBC).

Let us see how it works.

Fig. 4 Architecture of the LBC managing the NLB’s target groups via TargetGroupBinding custom resources.

Fig. 4 shows how the LBC automates the reconfiguration of the NLB’s target groups:

  1. It first retrieves the desired state described in Kubernetes custom resources (CR) of type TargetGroupBinding. There is one such resource per target group to maintain. Each TargetGroupBinding CR associates its respective target group with a Kubernetes service.
  2. The LBC then watches over the changes of the Kubernetes services that are referenced in the TargetGroupBinding CRs’ definition, specifically the private IP addresses exposed by their respective Endpoints resources.
  3. When a change is detected, it dynamically updates the corresponding NLB’s target groups with those IP addresses as well as the TCP port of the target containers (containerPort).

This automated design sets up the NLB’s target groups with IP addresses (targetType: ip) instead of EC2 instance IDs (targetType: instance). Although the LBC can handle both target types, the IP address approach is actually more straightforward in our case, since each pod has a routable private IP address in the AWS subnet, thanks to the Amazon VPC Container Network Interface (CNI) plug-in.
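
To make the TargetGroupBinding resources described above more concrete, here is a minimal sketch that builds one such manifest as a Python dictionary. The resource name, service name, port and target group ARN are hypothetical placeholders rather than values from our clusters; the field names follow the LBC's TargetGroupBinding custom resource definition as we understand it.

```python
import json


def target_group_binding(name, service_name, service_port, target_group_arn):
    """Build a TargetGroupBinding manifest binding one NLB target group to one Service."""
    return {
        "apiVersion": "elbv2.k8s.aws/v1beta1",
        "kind": "TargetGroupBinding",
        "metadata": {"name": name},
        "spec": {
            # The Kubernetes service whose Endpoints the LBC watches.
            "serviceRef": {"name": service_name, "port": service_port},
            "targetGroupARN": target_group_arn,
            # Register pod IP addresses rather than EC2 instance IDs.
            "targetType": "ip",
        },
    }


# All values below are hypothetical placeholders.
manifest = target_group_binding(
    name="kafka-broker-0",
    service_name="kafka-broker-0",
    service_port=9094,
    target_group_arn="arn:aws:elasticloadbalancing:region:account:targetgroup/example/0123",
)
print(json.dumps(manifest, indent=2))
```

One such manifest would be needed per target group, that is, one for the bootstrap service and one for each broker.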

This dynamic NLB configuration design comes with a challenge. Whenever we need to update the Strimzi CR, the change is rolled out to the Kafka pods one at a time, and this rolling update happens too fast for the NLB. This is because the NLB inherently takes some time to mark each target as healthy before enabling it. The Kafka brokers that have just been rolled out start advertising their broker-specific endpoints to the Kafka clients via the bootstrap service, but those endpoints are actually not immediately available because the NLB is still checking their health. To mitigate this, we have reduced the HealthCheckIntervalSeconds and HealthyThresholdCount parameters of each target group to their minimum values of 5 and 2 respectively. This reduces the maximum delay for the NLB to detect that a target has become healthy to 10 seconds. In addition, we have configured the LBC with a Pod Readiness Gate. This feature makes the Strimzi rolling deployment wait for the health check of the NLB to pass, before marking the current pod as Ready and proceeding with the next pod.
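
The 10-second figure follows directly from the two health check parameters. The sketch below only illustrates that arithmetic; the helper names and the target group ARN are hypothetical, and the dictionary merely mirrors the parameter names of the ELBv2 ModifyTargetGroup API without calling AWS.

```python
def health_check_settings(target_group_arn, interval_seconds=5, healthy_threshold=2):
    """Return settings using the parameter names of the ELBv2 ModifyTargetGroup API."""
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckIntervalSeconds": interval_seconds,
        "HealthyThresholdCount": healthy_threshold,
    }


def max_detection_delay(settings):
    """Worst-case seconds before a new target is marked healthy: interval x threshold."""
    return settings["HealthCheckIntervalSeconds"] * settings["HealthyThresholdCount"]


# Hypothetical ARN, used for illustration only.
settings = health_check_settings(
    "arn:aws:elasticloadbalancing:region:account:targetgroup/example/0123"
)
print(max_detection_delay(settings))  # 5 s interval x 2 checks = 10
```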

Fig. 5 Steps for a Strimzi rolling deployment with a Pod Readiness Gate. Only one Kafka broker and one NLB listener and target group are shown for simplicity.

Fig. 5 shows how the Pod Readiness Gate works during a Strimzi rolling deployment:

  1. The old Kafka pod is terminated.
  2. The new Kafka pod starts up and joins the Kafka cluster. Its individual endpoint for direct access via the NLB is immediately advertised by the Kafka cluster. However, at this stage, it is not reachable, as the target group of the NLB still points to the IP address of the old Kafka pod.
  3. The LBC updates the target group of the NLB with the IP address of the new Kafka pod, but the NLB health check has not yet passed, so the traffic is not forwarded to the new Kafka pod just yet.
  4. The LBC then waits for the NLB health check to pass, which takes up to 10 seconds. Once the NLB health check has passed, the NLB resumes forwarding the traffic to the Kafka pod.
  5. Finally, the LBC updates the pod readiness gate of the new Kafka pod. This informs Strimzi that it can proceed with the next pod of the rolling deployment.

Data persistence with EBS

To address the challenge of the residual PV and PVC of the old worker node preventing Kubernetes from mounting the local storage of the new worker node after a node rotation, we adopted Elastic Block Store (EBS) volumes instead of NVMe instance store volumes. Contrary to the latter, EBS volumes can conveniently be attached and detached. The trade-off is that their performance is significantly lower.

However, relying on EBS comes with additional benefits:

  • The cost per GB is lower, compared to NVMe instance store volumes.
  • Using EBS decouples the size of an instance in terms of CPU and memory from its storage capacity, leading to further cost savings by independently right-sizing the instance type and its storage. Such a separation of concerns also opens the door to new use cases requiring disproportionate amounts of storage.
  • After a worker node rotation, the time needed for the new node to get back in sync is shorter, as it only needs to catch up on the data that was produced during the downtime. This leads to shorter maintenance operations and higher iteration speed. Incidentally, the associated inter-AZ traffic cost is also lower, since there is less data to transfer among brokers during this time.
  • Increasing the storage capacity is an online operation.
  • Data backup is supported by taking snapshots of EBS volumes.

We have verified with our historical monitoring data that the performance of EBS General Purpose 3 (gp3) volumes is significantly above our maximum historical values for both throughput and I/O per second (IOPS), and we have successfully benchmarked a test EBS-based Kafka cluster. We have also set up new monitors to be alerted in case we need to provision either additional throughput or IOPS, beyond the baseline of EBS gp3 volumes.

With that, we updated our instance types from storage optimised instances to either general purpose or memory optimised instances. We added the Amazon EBS Container Storage Interface (CSI) driver to the Kubernetes cluster and created a new Kubernetes storage class to let the cluster dynamically provision EBS gp3 volumes.

We configured Strimzi to use that storage class to create any new PVCs. This makes Strimzi able to automatically create the EBS volumes it needs, typically when the cluster is first set up, but also to attach/detach the volumes to/from the EC2 instances whenever a Kafka pod is relocated to a different worker node.

Note that the EBS volumes are not part of any ASG Launch Template, nor do they scale automatically with the ASGs.

Fig. 6 Steps for the Strimzi Operator to create an EBS volume and attach it to a new Kafka pod.

Fig. 6 illustrates how this works when Strimzi sets up a new Kafka broker, for example the first broker of the cluster in the initial setup:

  1. The Strimzi Cluster Operator first creates a new PVC, specifying a volume size and EBS gp3 as its storage class. The storage class is configured with the EBS CSI Driver as the volume provisioner, so that volumes are dynamically provisioned [1]. However, because it is also set up with volumeBindingMode: WaitForFirstConsumer, the volume is not provisioned until a pod actually claims the PVC.
  2. The Strimzi Cluster Operator then creates the Kafka pod, with a reference to the newly created PVC. The pod is scheduled to start, which in turn claims the PVC.
  3. This triggers the EBS CSI Controller. As the volume provisioner, it dynamically creates a new EBS volume in the AWS VPC, in the AZ of the worker node where the pod has been scheduled to start.
  4. It then attaches the newly created EBS volume to the corresponding EC2 instance.
  5. After that, it creates a Kubernetes PV with nodeAffinity and claimRef specifications, making sure that the PV is reserved for the Kafka broker 1 pod.
  6. Lastly, it updates the PVC with the reference of the newly created PV. The PVC is now in Bound state and the Kafka pod can start.
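
The storage class that the steps above rely on could look roughly like the following sketch, expressed as a Python dictionary. The class name is hypothetical, and enabling allowVolumeExpansion is our assumption here (it is what makes online capacity increases possible); the provisioner name and the volumeBindingMode value are the ones used by the EBS CSI driver and described in step 1.

```python
import json


def gp3_storage_class(name):
    """Build a StorageClass manifest for dynamically provisioned EBS gp3 volumes."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # The EBS CSI driver acts as the volume provisioner.
        "provisioner": "ebs.csi.aws.com",
        "parameters": {"type": "gp3"},
        # Delay provisioning until a pod claims the PVC, so that the volume
        # is created in the AZ of the node where the pod is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
        # Assumption: expansion is enabled so that PVCs can be grown later.
        "allowVolumeExpansion": True,
    }


storage_class = gp3_storage_class("ebs-gp3")  # hypothetical name
print(json.dumps(storage_class, indent=2))
```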

One important point to take note of is that EBS volumes can only be attached to EC2 instances residing in their own AZ. Therefore, when rotating a worker node, the EBS volume can only be re-attached to the new instance if both old and new instances reside in the same AZ. A simple way to guarantee this is to set up one ASG per AZ, instead of a single ASG spanning across 3 AZs.

Also, when such a rotation occurs, the new broker only needs to synchronise the recent data produced during the brief downtime, which is typically an order of magnitude faster than replicating the entire volume (depending on the overall retention period of the hosted Kafka topics).

Table 1 Comparison of the resynchronization of the Kafka data after a broker rotation between the initial design and the new design with EBS volumes.

                        | Initial design (NVMe instance store volumes) | New design (EBS volumes)
Data to synchronise     | All of the data                              | Recent data produced during the brief downtime
Function of (primarily) | Retention period                             | Downtime
Typical duration        | Hours                                        | Minutes

Outcome

With all that, let us revisit the initial scenario, where a malfunctioning worker node is being replaced by a fresh new node.

Fig. 7 Representation of a worker node termination after implementing the solution. Node C is terminated and replaced by node D. This time, the Kafka broker 3 pod is able to start and serve traffic.

Fig. 7 shows the worker node C being terminated and replaced (by the ASG) by a new worker node D, similar to what we have described in the initial problem statement. The worker node D automatically joins the Kubernetes cluster on start-up.

However, this time, a seamless failover takes place:

  1. The Kafka clients that were producing to or consuming from the partition leaders on Kafka broker 3 are gracefully redirected to Kafka brokers 1 and 2, to which Kafka has migrated the leadership of those partitions.
  2. The target groups of the NLB for both the bootstrap connection and Kafka broker 3 are automatically updated by the LBC. The connectivity between the NLB and Kafka broker 3 is immediately restored.
  3. Triggered by the creation of the Kafka broker 3 pod, the Amazon EBS CSI driver running on the worker node D re-attaches the EBS volume 3 that was previously attached to the worker node C, to the worker node D instead. This enables Kubernetes to automatically re-bind the corresponding PV and PVC to Kafka broker 3 pod. With its storage dependency resolved, Kafka broker 3 is able to start successfully and re-join the Kafka cluster. From there, it only needs to catch up with the new data that was produced during its short downtime, by replicating it from Kafka brokers 1 and 2.

With this fault-tolerant design, when an EC2 instance is being retired by AWS, no particular action is required from our end.

Similarly, our EKS version upgrades, as well as any operations that require rotating all worker nodes of the cluster in general, are:

  • Simpler and less error-prone: We only need to rotate each instance in sequence, with no need for manually reconfiguring the target groups of the NLB and deleting the zombie PVCs anymore.
  • Faster: The time between each instance rotation is limited to the short amount of time it takes for the restarted Kafka broker to catch up with the new data.
  • More cost-efficient: There is less data to transfer across AZs (which is charged by AWS).

It is worth noting that we have chosen to omit Zookeeper and Cruise Control in this article, for the sake of clarity and simplicity. In reality, all pods in the Kubernetes cluster – including Zookeeper and Cruise Control – now benefit from the same graceful stop, triggered by the AWS termination events and the NTH. Similarly, the EBS CSI driver improves the fault tolerance of any pods that use EBS volumes for persistent storage, which includes the Zookeeper pods.

Challenges faced

One challenge that we are facing with this design lies in the EBS volumes’ management.

On the one hand, the size of EBS volumes cannot be increased consecutively before the end of a cooldown period (minimum of 6 hours and can exceed 24 hours in some cases [2]). Therefore, when we need to urgently extend some EBS volumes because the size of a Kafka topic is suddenly growing, we need to be relatively generous when sizing the new required capacity and add a comfortable security margin, to make sure that we are not running out of storage in the short run.
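
As a rough illustration of this constraint, the sketch below checks whether the minimum cooldown has elapsed since the last modification of a volume. The function names and timestamps are hypothetical, and because the actual cooldown can exceed 6 hours [2], a True result is only a necessary condition, not a guarantee that AWS will accept a new modification.

```python
from datetime import datetime, timedelta, timezone

# Minimum cooldown between two consecutive modifications of the same volume.
MIN_COOLDOWN = timedelta(hours=6)


def earliest_next_resize(last_modified_at, cooldown=MIN_COOLDOWN):
    """Return the earliest time at which another resize could be attempted."""
    return last_modified_at + cooldown


def can_resize(last_modified_at, now, cooldown=MIN_COOLDOWN):
    """True once at least the minimum cooldown has elapsed since the last resize."""
    return now >= earliest_next_resize(last_modified_at, cooldown)


# Hypothetical timestamps, for illustration only.
last_resize = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
print(can_resize(last_resize, datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)))  # False
print(can_resize(last_resize, datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)))  # True
```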

On the other hand, shrinking a Kubernetes PV is not a supported operation. This can affect the cost efficiency of our design if we overprovision the storage capacity by too much, or in case the workload of a particular cluster organically diminishes.

One way to mitigate this challenge is to tactically scale the cluster horizontally (i.e. adding new brokers) when there is a need for more storage and the existing EBS volumes are stuck in a cooldown period, or when the new storage need is only temporary.

What’s next?

In the future, we can improve the NTH’s capability by utilising webhooks. Upon receiving events from SQS, the NTH can also forward the events to the specified webhook URLs.

This can potentially benefit us in a few ways, e.g.:

  • Proactively spinning up a new instance without waiting for the old one to be terminated, whenever a termination event is received. This would shorten the rotation time even further.
  • Sending Slack notifications to Coban engineers to keep them informed of any actions taken by the NTH.

We would need to develop and maintain an application that receives webhook events from the NTH and performs the necessary actions.
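
To give an idea of what such an application might do, here is a minimal sketch of its decision logic only, with the HTTP layer left out. The payload shape (a JSON object with "kind" and "instance_id" keys), the event kind values and the action strings are all hypothetical: the actual webhook body depends on how the NTH webhook template is configured.

```python
import json

# Hypothetical event kinds for which a replacement instance is launched early.
PROACTIVE_KINDS = {"SCHEDULED_EVENT", "SPOT_ITN"}


def plan_actions(raw_body):
    """Decide which follow-up actions to take for one webhook request body."""
    event = json.loads(raw_body)
    kind = event.get("kind", "UNKNOWN")
    instance_id = event.get("instance_id", "unknown")

    # Always keep the engineers informed of what the NTH has done.
    actions = [f"notify: NTH handled {kind} for {instance_id}"]

    # Spin up a replacement without waiting for the old instance to terminate.
    if kind in PROACTIVE_KINDS:
        actions.append(f"scale_out: launch replacement for {instance_id}")
    return actions


body = json.dumps({"kind": "SCHEDULED_EVENT", "instance_id": "i-0123456789abcdef0"})
for action in plan_actions(body.encode("utf-8")):
    print(action)
```

Keeping the decision logic in a pure function like this would make it easy to unit test independently of the web framework and of the Slack and AWS clients that would carry out the actions.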

In addition, we are also rolling out Karpenter to replace the Kubernetes Cluster Autoscaler, as it is able to spin up new instances slightly faster, helping reduce the four-minute delay during which a Kafka pod remains in Pending state during a node rotation. Incidentally, Karpenter also removes the need for setting up one ASG per AZ, as it is able to deterministically provision instances in a specific AZ, for example where a particular EBS volume resides.

Lastly, to ensure that the performance of our EBS gp3 volumes is both sufficient and cost-efficient, we want to explore autoscaling their throughput and IOPS beyond the baseline, based on the usage metrics collected by our monitoring stack.

References

[1] Dynamic Volume Provisioning | Kubernetes

[2] Troubleshoot EBS volume stuck in Optimizing state during modification | AWS re:Post

We would like to thank our team members and Grab Kubernetes gurus that helped review and improve this blog before publication: Will Ho, Gable Heng, Dewin Goh, Vinnson Lee, Siddharth Pandey, Shi Kai Ng, Quang Minh Tran, Yong Liang Oh, Leon Tay, Tuan Anh Vu.

Expanded Coverage and AWS Compliance Pack Updates in InsightCloudSec Coming Out of AWS Re:Invent 2023

Post Syndicated from Lara Sunday original https://blog.rapid7.com/2023/12/20/expanded-coverage-and-aws-compliance-pack-updates-in-insightcloudsec-coming-out-of-aws-re-invent-2023/

It seems like it was just yesterday that we were in Las Vegas for AWS Re:Invent, but it’s already been almost two weeks since the conference wrapped up. As is always the case, AWS unveiled a host of new services throughout the week, including advancements around serverless, artificial intelligence (AI) and Machine Learning (ML), security and more.

There were a ton of really exciting announcements, but a few stood out to me. Before we dive into the new and updated services we now support in InsightCloudSec, let’s take a second to highlight a few of them and why they’re of note.

Highlights from AWS’ New Service Announcements during Re:Invent

Amazon Bedrock’s general availability was announced back in October, and re:Invent brought with it announcements of new capabilities, including customized models, GenAI applications that execute multi-step tasks, and Guardrails in preview. New Security Hub functionalities were introduced, including centralized governance, custom controls and a refresh of the dashboard.

Serverless innovations include updates to Amazon Aurora Limitless Database, Amazon ElastiCache Serverless, and AI-driven Amazon Redshift Serverless, adding greater scaling and efficiency to AWS’ database and analytics offerings. Serverless architectures bring scalability and flexibility; however, security and risk considerations shift away from traditional network traffic inspection and access control lists towards IAM hygiene and system identity behavioral analysis, along with code integrity and validation.

Amazon DataZone’s general availability, like Bedrock’s, was originally announced in October, and some new innovations were showcased during Re:Invent, including business-driven domains and data catalog, projects and environments, and the ability for data workers to publish and data consumers to subscribe to workflows. Available in open preview for DataZone are automated, AI-driven recommendations for metadata-driven business descriptions and specific columns and analytical applications based on business units.

One of the most exciting announcements from Re:Invent this year was Amazon Q, Amazon’s new GenAI-powered Virtual Assistant. Q was also integrated into Amazon’s Business Intelligence (BI) service, QuickSight, which has been supported in InsightCloudSec for some time now.

We released our support for Amazon OpenSearch last year, and this year’s re:Invent brought some exciting updates that are worth mentioning here. Now generally available is Vector Engine for OpenSearch Serverless, which enables users to store and quickly search vector embeddings for GenAI applications. AWS also announced the OR1 instance family, which is compute optimized specifically for OpenSearch, as well as a new zero-ETL integration with S3.

Expanded Resource Coverage in InsightCloudSec

It’s very important to us here at Rapid7 that our customers have peace of mind that, when their teams leave these events and begin implementing new innovations from AWS, they’re doing so securely. To that end, the days and weeks following Re:Invent are always a bit of a sprint, and this year was no exception.

The Coverage and Analysis team loves a challenge though, and in my totally unbiased opinion, we’ve delivered something special. Our latest release featured new support for a variety of the new services announced during Re:Invent, as well as a number of existing services we’ve expanded support for in relation to updates announced by AWS. We’ve added support for 6 new services that were either announced or updated during the show. We’ve also added 25 new Insights, all of which have been applied to our existing AWS Foundational Security Best Practices pack, AWS Center for Internet Security (CIS) 2.0 compliance pack, as well as new AWS relevant updates to NIST SP800-53 (Rev 5).

The newly supported services are:

  • Bedrock, a fully managed service that allows users to build generative AI applications in the cloud by providing a set of foundational models both from AWS and 3rd party vendors.
  • Clean Rooms, which enables customers to collaborate and analyze data securely in ‘clean rooms’ in minutes with any other company on joint initiatives without sharing real raw data.
  • AWS Control Tower (January 2024 Release), a management service that can be used to create and orchestrate a multi-account AWS environment in accordance with AWS best practices including the Well-Architected Framework.

Along with support for newly added services, we’ve also expanded our coverage of a host of existing services. We’ve added or expanded support for the following security and serverless solutions:

  • Network Firewall, which provides fine-grained control over network traffic.
  • Security Hub, an AWS-native service that provides CSPM functionality, aggregating security and compliance checks.
  • Glue, a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources, empowering your analytics and ML projects.

Helping Teams Securely Build AI/ML Applications in the Cloud

One of the most exciting outcomes of the past few weeks, with the addition of AWS Bedrock, is the extended coverage for AI and ML solutions that we are now able to provide across cloud providers for our customers. Supporting AWS Bedrock, along with GCP Vertex and Azure OpenAI Service, has enabled us to build a very exciting new feature as part of our Compliance Packs.

Machine learning, artificial intelligence, and analytics were driving themes of this year’s conference, so it makes me very happy to announce that we now offer a dedicated Rapid7 AI/ML Security Best Practices compliance pack. If interested, I highly recommend you keep an eye out in the coming days for my colleague Kathryn Lynas-Blunt’s blog discussing how Rapid7 enables teams to securely build AI applications in the cloud.

As a cloud enthusiast, I find that AWS re:Invent never fails to deliver on innovation, excitement and shared learning experiences. As we continue our partnership with AWS, I’m very excited for all that 2024 holds in store. Until next year!

Expanded Coverage and New Attack Path Visualizations Help Security Teams Prioritize Cloud Risk and Understand Blast Radius

Post Syndicated from Pauline Logan original https://blog.rapid7.com/2023/12/19/expanded-coverage-and-new-attack-path-visualizations-help-security-teams-prioritize-cloud-risk-and-understand-blast-radius/

Cloud environments differ in a number of ways from more traditional on-prem environments. From the immense scale and compounding complexity to the rate of change, the cloud creates a host of challenges for security teams to navigate and grapple with. By definition, anything running in the cloud has the potential to be made publicly available, either directly or indirectly. The interconnected nature of these environments is such that when one account, resource, or service is compromised, it can be fairly easy for a bad actor to move laterally across your environment and/or grant themselves the permissions to wreak havoc. These avenues for lateral movement or privilege escalation are often referred to as attack paths.

Having a solution in place that can clearly and dynamically detect and depict these attack paths is critical to helping teams not only understand where risks exist across their environment but arguably more importantly how they are most likely to be exploited and what that means for an organization – particularly with respect to protecting high-value assets.

Detect and Remediate Attack Paths With InsightCloudSec

Attack Path Analysis in InsightCloudSec enables Rapid7 customers to see their cloud environments from the perspective of an attacker. It visualizes the various ways an attacker could gain access, move between resources, and compromise the cloud environment. Attack Paths are high-fidelity signals in our risk prioritization model, which focuses on identifying toxic combinations that lead to real business impact.

Since Rapid7 initially launched Attack Path Analysis, we’ve continued to roll out incremental updates to the feature, primarily in the form of expanded attack path coverage across each of the major cloud service providers (CSPs). In our most recent InsightCloudSec release (12.12.2023), we’ve continued this momentum, announcing additional attack paths as well as some exciting updates around how we visualize risk across paths and the potential blast radius should a compromised resource within an existing attack path be exploited. In this post, we’ll dive into an example of one of our recently added attack paths for Microsoft Azure along with a bit more detail about the new risk visualizations. So with that, let’s jump right in.

Expanding Coverage With New Attack Paths

First, on the coverage side of things we’ve added seven new paths in recent releases across AWS and Azure. Our AWS coverage was extended to support ECS across all of our AWS Attack Paths, and we also introduced 3 new Azure Attack paths. In the interest of brevity, we won’t cover each of them, but we do have an ever-developing list of supported attack paths you can access here on the docs page. As an example, however, let’s dive into one of the new paths we released for Azure, which identifies the presence of attack paths targeting publicly exposed instances that also have attached privileged roles.

This type of attack path is concerning for a couple of reasons: First and foremost, because the instance is publicly accessible, an attacker could use it as an inroad to your cloud environment, gaining access to sensitive data on the resource itself or to data the resource has indirect access to. Secondly, since the attached role is capable of escalating privileges, an attacker could then leverage the resource to assign themselves admin permissions, which could in turn be used to open up new attack vectors.

Because this could have wide-reaching ramifications should it be exploited, we’ve assigned this a critical severity. That means we’ll want to work to resolve this as fast as possible any time this path shows up across our cloud environments, maybe even automating the process of closing down public access or adjusting the resource permissions to limit the potential for lateral movement or privilege escalation. Speaking of paths with widespread impact should they be exploited, that brings me to some other exciting updates we’ve rolled out to Attack Path Analysis.

Clearly Visualizing Risk Severity and Potential Blast Radius

As I mentioned earlier, along with expanded coverage, we’ve also updated Attack Path Analysis to make it clearer for users where their riskiest assets lie across a given attack path and to clearly show the potential blast radius of an exploitation.

To make it easier to understand the overall riskiness of an attack path and where its choke points are, we’ve added a new security view that visualizes the risk of each resource along a given path. This new view makes it very easy for security teams to immediately understand which specific resources present the highest risk and where they should be focusing their remediation efforts to block potential attackers in their tracks.

In addition to this new security-focused view, we’ve also extended Attack Path Analysis to show a potential blast radius by displaying a graph-based topology map that helps clearly outline the various ways resources across your environment – and specifically within an attack path – interconnect with one another.

This topology map not only makes it easier for security teams to quickly home in on what needs their attention first during an investigation, but also shows where a bad actor could move next. Additionally, this view helps security teams and leaders communicate risk across the organization, particularly when engaging with non-technical stakeholders who find it difficult to understand why exactly a compromised resource presents a potentially larger risk to the business.

We will continue to expand on our existing Attack Path Analysis capabilities in the future, so be sure to keep an eye out for additional paths being added in the coming months as well as a continued effort to enable security teams to more quickly analyze cloud risk with the context needed to effectively detect, communicate, prioritize, and respond.

Monitoring AWS Cost Explorer with Zabbix

Post Syndicated from evgenii.gordymov original https://blog.zabbix.com/monitoring-aws-cost-explorer-with-zabbix/26159/

Cloud-based service platforms are becoming increasingly popular, and one of the most widely adopted is Amazon Web Services (AWS). Like many cloud services, AWS charges a user fee, which has led many users to look for a breakdown of which specific services they are being charged for. Fortunately, Zabbix has an AWS Cost Explorer by HTTP template that’s ready to run right out of the box and provides a list of daily and monthly maintenance costs.

Why monitor AWS costs?

While AWS cost data is stored for 12 months, Zabbix allows data to be stored for up to 25 years (see Keep lost resources period). The Keep lost resources period is a vital parameter for storing data longer than 12 months because, once cost data is removed from AWS, the corresponding discovered items become lost. Therefore, if we want to keep our cost data for a period longer than 12 months, the Keep lost resources period parameter needs to be adjusted accordingly.

In addition, Zabbix can show fees charged for unavailable services, such as test deployments for a cluster in the us-east-1 region.

Preparing to monitor in a few easy steps

I recommend visiting zabbix.com/integrations/aws for any sources referred to in this tutorial. You can also find a link to all Zabbix templates there. For the most part, we will follow the steps outlined in the readme.

The AWS Cost Explorer by HTTP template can use either key-based or role-based authorization. Set the {$AWS.AUTH_TYPE} macro to choose between them; the possible values are role_base and access_key (the default).

If you are using access key-based authorization, be sure to set the following macros: {$AWS.ACCESS.KEY.ID} and {$AWS.SECRET.ACCESS.KEY}.

Create or use an existing access key, which you can get from Identity and Access Management (IAM).

Accessing the IAM Console:
  • Log in to your AWS Management Console.
  • Navigate to the IAM service.
  • Next, go to the Users tab and select the required user.

Creating an access key for monitoring:
  • After that, go to the Security credentials tab.
  • Select Create access key.

Add the following required permissions to your Zabbix IAM policy in order to collect metrics.

Defining Permissions through IAM Policies:
  • Access the “Policies” section within IAM.
  • Click on “Create Policy”.
  • Select the JSON tab to define policy permissions.
  • Provide a meaningful name and description for the policy.
  • Structure the policy document based on the permissions needed for the AWS Cost Explorer by HTTP template.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ce:GetDimensionValues",
        "ce:GetCostAndUsage"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
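
To illustrate the kind of request that the ce:GetCostAndUsage permission allows, here is a minimal sketch that builds the parameters for a monthly, per-service cost query. The helper name and the dates are hypothetical, and the choice of the UnblendedCost metric is an assumption; the template may well query the API differently.

```python
from datetime import date


def monthly_cost_request(start, end):
    """Build GetCostAndUsage parameters for monthly costs grouped by service."""
    return {
        # Start is inclusive and End is exclusive, both formatted as YYYY-MM-DD.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],  # assumption: one of several available metrics
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


params = monthly_cost_request(date(2023, 1, 1), date(2024, 1, 1))
print(params)

# With boto3 installed and credentials configured, the parameters could be used as:
#   boto3.client("ce").get_cost_and_usage(**params)
```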

Attaching Policies to the User:
  • Go back to the “Users” section within IAM.
  • Click on “Add Permissions”.
  • Search for and select the policy created in the previous step.
  • Review the attached policies to ensure they align with the intended permissions for the user.

Creating a host in Zabbix

Now, let’s create a host that will represent the metrics available via the Cost Explorer API:

  • Create a Host Group in which to put hosts related to AWS. For this example, let’s create one that we’ll call AWS Cloud.
  • Head to the host page under Configuration and click Create host. Give this host the name AWS Cost. We’ll also assign this host to the AWS Cloud group we created and attach the AWS Cost Explorer by HTTP template.
  • Click the Macros tab and select Inherited and host macros. In this case, we need to change the first two macros. The first, {$AWS.ACCESS.KEY.ID}, should be set to the access key ID you received. The second, {$AWS.SECRET.ACCESS.KEY}, should be set to the secret access key retrieved earlier from the Security credentials tab.
  • Click Add. The AWS Cost Explorer template has three low-level discovery rules that use master items. The low-level discovery rules will start discovering resources only after the master item has collected the required data.

    The best practice is to always test such items for data. Don’t forget to fill in the required macros!

    In AWS daily costs by services and AWS monthly costs by services discovery you can filter by service, which can be specified in macros.
  • Let’s execute the master items to collect the required data on-demand. Choose both items to get data and click Execute now.

    In a few minutes, you should receive cost metrics by services for 12 months plus the current month, as well as by day. If you want the information to be stored longer, remember to change the Keep lost resources period in the LLD rule, as it’s set to 30 days by default.

Good luck!

The post Monitoring AWS Cost Explorer with Zabbix appeared first on Zabbix Blog.

Build Better Engagement Using the AWS Community Engagement Flywheel: Part 2 of 3

Post Syndicated from Tristan Nguyen original https://aws.amazon.com/blogs/messaging-and-targeting/build-better-engagement-using-the-aws-community-engagement-flywheel-part-2-of-3/

Introduction

Part 2 of 3: From Cohorts to Campaigns

Businesses are constantly looking for better ways to engage with customer communities, but it’s hard to do when profile data is limited to user-completed form input or messaging campaign interaction metrics. Neither of these data sources tells a business much about its customers’ interests or preferences when they’re engaging with that community.

To bridge this gap for their community of customers, AWS Game Tech created the Cohort Modeler: a deployable solution for developers to map out and classify player relationships and identify like behavior within a player base. Additionally, the Cohort Modeler allows customers to aggregate and categorize player metrics by leveraging behavioral science and customer data. In our first blog post, we talked about how to extend Cohort Modeler’s functionality.

In this post, you’ll learn how to:

  1. Use the extension we built to create the first part of the Community Engagement Flywheel.
  2. Process the user extract from the Cohort Modeler and import the data into Amazon Pinpoint as a messaging-ready Segment.
  3. Send email to the users in the Cohort via Pinpoint’s powerful and flexible Campaign functionality.

Use Case Examples for The Cohort Modeler

For this example, we’re going to retrieve a cohort of individuals from our Cohort Modeler who we’ve identified as at risk:

  • Maybe they’ve triggered internal alarms where they’ve shared potential PII with others over cleartext.
  • Maybe they’ve joined chat channels known to be frequented by some of the game’s less upstanding citizens.

Either way, we want to make sure they understand the risks of what they’re doing and who they’re dealing with.

Pinpoint provides various robust methods to import user contact and personalization data in specific formats, and once Pinpoint has ingested that data, you can use Campaigns or Journeys to send customized and personalized messaging to your cohort members – either via automation, or manually via the Pinpoint Console.

Architecture overview

In this architecture, you’ll create a simple Amazon DynamoDB table that mimics a game studio’s database of record for its customers. You’ll then create a trigger on an Amazon Simple Storage Service (Amazon S3) bucket that invokes an AWS Lambda function to ingest the Cohort Modeler extract (created in the prior blog post) and convert it into a CSV file that Pinpoint can ingest. Lastly, once the CSV file is generated, the Lambda function will prompt Pinpoint to automatically ingest it as a static segment.

Once the automation is complete, you’ll use Pinpoint’s console to quickly and easily create a Campaign, including an HTML mail template, and send it to the imported segment of players you identified as at risk via the Cohort Modeler.

Prerequisites

At this point, you should have completed the steps in the prior blog post, Extending the Cohort Modeler. This is all you’ll need to proceed.

Walkthrough

Messaging your Cohort

Now that we’ve extended the Cohort Modeler and built a way to extract cohort data into an S3 bucket, we’ll transform that data into a Segment in Pinpoint, and use the Pinpoint Console to send a message to the members of the Cohort via a Pinpoint Campaign. In this walkthrough, you’ll:

  • Create a Pinpoint Project to import your Cohort Segments.
  • Create a Dynamo table to emulate your database of record for your players.
  • Create an S3 bucket to hold the cohort contact data CSV file.
  • Create a Lambda trigger to respond to Cohort Modeler export events and kick off Pinpoint import jobs.
  • Create and send a Pinpoint Campaign using the imported Segment.

Create the Pinpoint Project

You’ll need a Pinpoint Project (sometimes referred to as an “App”) to send messaging to your cohort members, so navigate to the Pinpoint console and click Create a Project.

  • Sign in to the AWS Management Console and open the Pinpoint Console.
  • If this is your first time using Amazon Pinpoint, you will see a page that introduces you to the features of the service. In the Get started section, you’ll need to enter the name you want to call your project. We used ‘CohortModelerPinpoint‘ but you can use whatever you’d like.
  • On the following screen, the Configure features page, you’ll want to choose Configure in the Email section.
    • Pinpoint will ask you for an email address you want to validate, so that when email goes out, it will use your email address as the FROM header in your email. Enter the email address you want to use as your sending address, and Choose Verify email address.
    • Check the inbox of the address that you entered and look for an email from [email protected]. Open the email and click the link in the email to complete the verification process for the email address.
    • Note: Once you have verified your email identity, you may receive an alert prompting you to update your email address’ policy. If so, highlight your email under All identities, and choose Update policy. To complete this update, Enter confirm where requested, and choose Update.

  • Later on, when you’re asked for your Pinpoint Project ID, you can find it by choosing All projects from the Pinpoint navigation pane. From there, next to your project name, you will see the associated Project ID.

Create the Dynamo Table

For this step, you’re emulating a game studio’s database of record for its players, so the Lambda function that you’re creating (to merge Cohort Modeler data with the database of record) is also an emulation.

In a real-world situation, you would use the same ingestion method as the S3TriggerCohortIngest.py example that will be created further below. However, instead of using placeholder data, you would use the ‘playerId’ information extracted from the Cohort Modeler. This would allow you to formulate a specific query against your main database, whether it requires an SQL statement, or some other type of database query.
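As an illustration of that real-world variant, the sketch below looks up contact rows for a list of player IDs with a parameterized SQL query. It is a minimal sketch under stated assumptions: the players table, its columns, and the fetch_players helper are hypothetical, and an in-memory SQLite database stands in for the studio’s database of record.

```python
import sqlite3


def fetch_players(conn, player_ids):
    """Look up contact rows for the given player IDs with a parameterized IN query."""
    if not player_ids:
        return []
    # One "?" placeholder per ID keeps the values out of the SQL text itself.
    placeholders = ",".join("?" for _ in player_ids)
    sql = (
        "SELECT playerId, email, firstName, lastName "
        "FROM players WHERE playerId IN (" + placeholders + ") ORDER BY playerId"
    )
    return conn.execute(sql, list(player_ids)).fetchall()


# Tiny in-memory stand-in for a studio database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players (playerId TEXT PRIMARY KEY, email TEXT, firstName TEXT, lastName TEXT)"
)
conn.execute("INSERT INTO players VALUES ('wayne96', 'wayne@example.com', 'Wayne', 'Johnson')")

rows = fetch_players(conn, ["wayne96", "unknown-player"])
```

IDs that are missing from the database are simply absent from the result, so the caller can decide whether to skip them or report them.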

Creating the Table

Navigate to the DynamoDB Console. You’re going to create a table with ‘playerId’ as the Primary key, and four additional attributes: email, favorite role, first name, and last name.

  • In the navigation pane, choose Tables. On the next page, in the Tables section, choose Create table.
  • In the Table details section, we entered userdata for our Table name. (In order to maintain simple compatibility with the scripts that follow, it is recommended that you do the same.)
  • For Partition key, enter playerId and leave the data type as String.
  • Intentionally leave the Sort key blank and the data type as String.
  • Below, in the Table settings section, leave everything at their Default settings value.
  • Scroll to the end of the page and choose Create table.
Adding Synthetic Data

You’ll need some synthetic data in the database, so that your Cohort Modeler-Pinpoint integration can query the database, retrieve contact information, and then import that contact information into Pinpoint as a Segment.

  • From the DynamoDB Tables section, choose your newly created Table by selecting its name. (The name preferably being userdata).
  • In the DynamoDB navigation pane, choose Explore items.
  • From the Items returned section, choose Create item.
  • Once on the Create item page, ensure that the Form view is highlighted and not the JSON view. You’re going to create a new entry in the table. Cohort Modeler creates the same synthetic information each time it’s built, so all you need to do is to create three entries.
    • For the first entry, enter wayne96 as the Value for playerId.
    • Select the Add new attribute dropdown, and choose String.
    • Enter email as the Attribute name, and the Value should be your own email address since you’ll be receiving this email. This should be the same email used to configure your Pinpoint project from earlier.
    • Again, select the Add new attribute dropdown, and choose String.
    • Enter favoriteRole as the Attribute name, and enter Tank as the attribute’s Value.
    • Again, select the Add new attribute dropdown, and choose String.
    • Enter firstName as the Attribute name, and enter Wayne as the attribute’s Value.
    • Finally, select the Add new attribute dropdown, and choose String.
    • Enter lastName as the Attribute name, and enter Johnson as the attribute’s Value.

  • Repeat the process for the following two users. You’ll be using the SES Mailbox Simulator on these player IDs – one will simulate a successful delivery (but no opens or clicks), and the other will simulate a bounce notification, which represents an unknown user response code.

 

playerId | email | favoriteRole | firstName | lastName
xortiz | [email protected] | Healer | Tristan | Nguyen
msmith | [email protected] | DPS | Brett | Ezell

Now that the table’s populated, you can build the integration between Cohort Modeler and your new “database of record,” allowing you to use the cohort data to send messages to your players.
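If you would rather script the synthetic data than enter it in the console, the three items can be written with boto3’s batch writer. This is a minimal sketch: build_items and load_items are hypothetical helper names, and the email addresses are deliberately left as a parameter because you supply your own address and the simulator addresses.

```python
def build_items(emails):
    """Build the three synthetic player items; `emails` maps playerId -> address."""
    profiles = [
        ("wayne96", "Tank", "Wayne", "Johnson"),
        ("xortiz", "Healer", "Tristan", "Nguyen"),
        ("msmith", "DPS", "Brett", "Ezell"),
    ]
    return [
        {
            "playerId": player_id,
            "email": emails[player_id],
            "favoriteRole": role,
            "firstName": first,
            "lastName": last,
        }
        for player_id, role, first, last in profiles
    ]


def load_items(table, items):
    """Write items through the table's batch writer and return how many were sent."""
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    return len(items)


# Usage (requires AWS credentials and the table created above):
#   import boto3
#   table = boto3.resource("dynamodb").Table("userdata")
#   load_items(table, build_items({"wayne96": "...", "xortiz": "...", "msmith": "..."}))
```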

Create the Pinpoint Import S3 Bucket

Pinpoint requires a CSV or JSON file stored on S3 to run an Import Segment job, so we’ll need a bucket (separate from our Cohort Modeler Export bucket) to facilitate this.

  • Navigate to the S3 Console, and inside the Buckets section, choose Create Bucket.
  • In the General configuration section, enter a bucket name, remembering that its name must be unique across all of AWS.
  • You can leave all other settings at their default values, so scroll down to the bottom of the page and choose Create Bucket. Remember the name – We’ll be referring to it as your “Pinpoint import bucket” from here on out.
Create a Pinpoint Role for the S3 Bucket

Before creating the Lambda function, we need to create a role that allows the Cohort Modeler data to be imported into Amazon Pinpoint in the form of a segment.

For more details on how to create an IAM role to allow Amazon Pinpoint to import endpoints from the S3 Bucket, refer to this documentation. Otherwise, you can follow the instructions below:

  • Navigate to the IAM Dashboard. In the navigation pane, under Access management, choose Roles, followed by Create role.
  • Once on the Select trusted entity page, highlight and select AWS service, under the Trusted entity type section.
  • In the Use case section dropdown, type or select S3. Once selected, ensure that S3 is highlighted, and not S3 Batch Operations. Choose Next.
  • From the Add permissions page, enter AmazonS3ReadOnlyAccess in the search area. Select the associated checkbox and choose Next.
  • Once on the Name, review, and create page, for Role name, enter PinpointSegmentImport.
  • Scroll down and choose Create role.
  • From the navigation pane, and once again under Access management, choose Roles. Select the name of the role just created.
  • In the Trust relationships tab, choose Edit trust policy.
  • Paste the following JSON trust policy. Remember to replace accountId, region and application-id with your AWS account ID, the region you’re running Amazon Pinpoint from, and the Amazon Pinpoint project ID respectively.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Principal": {
                "Service": "pinpoint.amazonaws.com"
            },
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "accountId"
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:mobiletargeting:region:accountId:apps/application-id"
                }
            }
        }
    ]
}
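
To avoid hand-editing the placeholders, the same trust policy can be generated from your own values. This is a minimal sketch: pinpoint_trust_policy is a hypothetical helper, and the account ID, region, and project ID shown are placeholder examples, not real resources.

```python
import json


def pinpoint_trust_policy(account_id, region, application_id):
    """Return the trust policy above with the three placeholders filled in."""
    source_arn = "arn:aws:mobiletargeting:{}:{}:apps/{}".format(
        region, account_id, application_id
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": "pinpoint.amazonaws.com"},
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {"aws:SourceArn": source_arn},
                },
            }
        ],
    }


# Placeholder values for illustration only; substitute your own.
policy_json = json.dumps(
    pinpoint_trust_policy("111122223333", "us-east-1", "exampleprojectid"), indent=4
)
```

The resulting policy_json string can be pasted into the Edit trust policy editor.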

Build the Lambda

You’ll need to create a Lambda function for S3 to trigger when Cohort Modeler drops its export files into the export bucket, as well as the connection to the Cohort Modeler export bucket to set up the trigger. The steps below will take you through the process.

Create the Lambda

Head to the Lambda service menu, and from Functions page, choose Create function. From there:

  • On the Create function page, select Author from scratch.
  • For Function Name, enter S3TriggerCohortIngest for consistency.
  • For Runtime choose Python 3.8
  • No other complex configuration options are needed, so leave the remaining options as default and click Create function.
  • In the Code tab, replace the sample code with the code below.
import json
import os
import uuid
import urllib.parse

import boto3
from botocore.exceptions import ClientError

### S3TriggerCohortIngest

# We get activated once we're triggered by an S3 file getting Put.
# We then:
# - grab the file from S3 and ingest it.
# - negotiate with a DB of record (Dynamo in our test case) to pull the corresponding player data.
# - transform that record data into a format Pinpoint will interpret.
# - Save that CSV into a different S3 bucket, and
# - Instruct Pinpoint to ingest it as a Segment.


# save the CSV file to a random unique filename in S3
def save_s3_file(content):
    
    # generate a random uuid csv filename.
    fname = str(uuid.uuid4()) + ".csv"
    
    print("Saving data to file: " + fname)
    
    try:
        # grab the S3 bucket name
        s3_bucket_name = os.environ['S3BucketName']
        
        # Set up the S3 boto client
        s3 = boto3.resource('s3')
        
        # Lob the body into the object.
        object = s3.Object(s3_bucket_name, fname)
        object.put(Body=content)
        
        return fname
        
    # If we fail, say why and re-raise, so the caller never receives an
    # error payload in place of a filename.
    except ClientError as error:
        print("Couldn't store file in S3: %s" % json.dumps(error.response))
        raise
        
# Given a list of users, query the user dynamo db for their account info.
def query_dynamo(userlist):
    
    # set up the dynamo client.
    ddb_client = boto3.resource('dynamodb')
    
    # Set up the RequestItems object for our query. The table name comes from
    # the DynamoUserTableName environment variable (default: userdata).
    table_name = os.environ.get('DynamoUserTableName', 'userdata')
    batch_keys = {
        table_name: {
            'Keys': [{'playerId': user} for user in userlist]
        }
    }

    # Query for the keys. Note: BatchGetItem accepts at most 100 keys per call,
    # and this example does not check for or split larger lists.
    try:
        db_response = ddb_client.batch_get_item(RequestItems=batch_keys)
        return db_response

    # If we fail, say why and re-raise so the invocation is reported as failed.
    except ClientError as error:
        print("Couldn't access data in DynamoDB: %s" % json.dumps(error.response))
        raise
        
def ingest_pinpoint(filename):
    
    s3url = "s3://" + os.environ.get('S3BucketName') + "/" + filename
    
    
    try:
        pinClient = boto3.client('pinpoint')
        
        response = pinClient.create_import_job(
            ApplicationId=os.environ.get('PinpointApplicationID'),
            ImportJobRequest={
                'DefineSegment': True,
                'Format': 'CSV',
                'RegisterEndpoints': True,
                'RoleArn': os.environ.get('PinpointS3RoleARN'),
                'S3Url': s3url,
                'SegmentName': filename
            }
        )
        
        return {
            'ImportId': response['ImportJobResponse']['Id'],
            'SegmentId': response['ImportJobResponse']['Definition']['SegmentId'],
            'ExternalId': response['ImportJobResponse']['Definition']['ExternalId'],
        }
        
    # If we fail, say why and re-raise so the handler does not report success.
    except ClientError as error:
        print("Couldn't create Import job for Pinpoint: %s" % json.dumps(error.response))
        raise
        
# Lambda entry point GO
def lambda_handler(event, context):
    
    # Get the bucket + obj name from the incoming event
    incoming_bucket = event['Records'][0]['s3']['bucket']['name']
    filename = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    
    # light up the S3 client
    s3 = boto3.resource('s3')
    
    # grab the file that triggered us
    try:
        content_object = s3.Object(incoming_bucket, filename)
        file_content = content_object.get()['Body'].read().decode('utf-8')
        
        # and turn it into JSON.
        json_content = json.loads(file_content)
        
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(filename, incoming_bucket))
        raise e

    # The parsed JSON is already a list of records, so use it directly.
    record_json = json_content
    
    # Initialize an empty list for names
    namelist = []
    
    # Iterate through the records in the list
    for record in record_json:
        # Check if "playerId" key exists in the record
        if "playerId" in record:
            # Append the first element of "playerId" list to namelist
            namelist.append(record["playerId"][0])

    # use the name list and grab the corresponding users from the dynamo table
    userdatalist = query_dynamo(namelist)
    
    # grab just what we need to create our import file
    userdata_responses = userdatalist["Responses"][os.environ.get('DynamoUserTableName', 'userdata')]
    
    csvlist = "ChannelType,Address,User.UserId,User.UserAttributes.FirstName,User.UserAttributes.LastName\n"
    
    for user in userdata_responses:
        newString = "EMAIL," + user["email"] + "," + user["playerId"] + "," + user["firstName"] + "," + user["lastName"] + "\n"
        csvlist += newString
        
    # Dump it to S3 with a unique filename. 
    csvFile = save_s3_file(csvlist)

    # and tell Pinpoint to import it as a Segment.
    pinResponse = ingest_pinpoint(csvFile)
    
    return {
        'statusCode': 200,
        'body': json.dumps(pinResponse)
    }
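
The function above leaves two gaps that matter for larger cohorts: BatchGetItem accepts at most 100 keys per call, and building the CSV by string concatenation breaks if a name contains a comma. The helpers below sketch one way to handle both. They are hypothetical additions rather than part of the original function, and they assume that Pinpoint’s importer accepts standard CSV quoting.

```python
import csv
import io


def chunked(items, size=100):
    """Yield successive slices of at most `size` items (BatchGetItem's per-call key limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_import_csv(users):
    """Build the segment import CSV, letting the csv module quote values that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([
        "ChannelType",
        "Address",
        "User.UserId",
        "User.UserAttributes.FirstName",
        "User.UserAttributes.LastName",
    ])
    for user in users:
        writer.writerow([
            "EMAIL",
            user["email"],
            user["playerId"],
            user["firstName"],
            user["lastName"],
        ])
    return buffer.getvalue()
```

With these in place, the handler could call query_dynamo once per chunk of namelist, collect the returned items, and pass the combined list to build_import_csv.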

Configure the Lambda

Firstly, you’ll need to raise the function timeout, because sometimes it will take time to import large Pinpoint segments. To do so, navigate to the Configuration tab, then General configuration and change the Timeout value to the maximum of 15 minutes.

Next, select Environment variables beneath General configuration in the navigation pane. Choose Edit, followed by Add environment variable, for each Key and Value below.

  • Create a key – DynamoUserTableName – and give it the name of the DynamoDB table you built in the previous step. (If you followed our recommendation, this is userdata.)
  • Create a key – PinpointApplicationID – and give it the Project ID (not the name), of the Pinpoint Project you created in the first step.
  • Create a key – S3BucketName – and give it the name of the Pinpoint Import S3 Bucket.
  • Finally, create a key – PinpointS3RoleARN – and paste the ARN of the Pinpoint S3 role you created during the Import Bucket creation step.
  • Once all Environment Variables are entered, choose Save.

In a production build, you could store this information in AWS Systems Manager Parameter Store to improve portability and resilience.
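
A minimal sketch of that production variant follows. The load_setting helper and the /cohort-flywheel/ parameter prefix are hypothetical, and the Lambda execution role would additionally need permission to read those parameters.

```python
import os


def load_setting(name, ssm_client=None, prefix="/cohort-flywheel/"):
    """Read a setting from Parameter Store when a client is given, else from the environment."""
    if ssm_client is not None:
        response = ssm_client.get_parameter(Name=prefix + name, WithDecryption=True)
        return response["Parameter"]["Value"]
    return os.environ[name]


# Usage inside the Lambda (requires AWS credentials):
#   import boto3
#   ssm = boto3.client("ssm")
#   bucket_name = load_setting("S3BucketName", ssm)
```

Passing the client in as an argument keeps the helper easy to exercise locally with environment variables only.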

While still in the Configuration tab, from the navigation pane, choose the Permissions menu option.

  • Note that just beneath Execution role, AWS has created an IAM Role for the Lambda. Select the role’s name to view it in the IAM console.
  • On the Role’s page, in the Permissions tab and within the Permissions policies section, you should see one policy attached to the role: AWSLambdaBasicExecutionRole
  • You will need to give the Lambda access to your Pinpoint import bucket, so highlight the Policy name and select the Add permissions dropdown and choose Create inline policy – we won’t be needing this role anywhere else.
  • On the next screen, click the JSON tab.
    • Paste the following IAM Policy JSON:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-PINPOINT-BUCKET-NAME-HERE/*",
                "arn:aws:s3:::YOUR-PINPOINT-BUCKET-NAME-HERE",
                "arn:aws:s3:::YOUR-CM-BUCKET-NAME-HERE/*",
                "arn:aws:s3:::YOUR-CM-BUCKET-NAME-HERE"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "dynamodb:BatchGetItem",
            "Resource": "arn:aws:dynamodb:region:accountId:table/userdata"
        },
        {
            "Effect": "Allow",
            "Action": "mobiletargeting:CreateImportJob",
            "Resource": "arn:aws:mobiletargeting:region:accountId:apps/application-id"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::accountId:role/PinpointSegmentImport"
        }
    ]
}
    • Replace the placeholder YOUR-CM-BUCKET-NAME-HERE with the name of the S3 bucket you created in the previous blog post to store Cohort Modeler exports, and YOUR-PINPOINT-BUCKET-NAME-HERE with the name of the Pinpoint import bucket you created earlier in this post.
    • Remember to replace accountId, region and application-id with your AWS account ID, the region you’re running Amazon Pinpoint from, and the Amazon Pinpoint project ID respectively.
    • Choose Review Policy.
    • Give the policy a name – we used S3TriggerCohortIngestPolicy.
    • Finally, choose Create Policy.
Trigger the Lambda via S3

The goal is for the Lambda to be triggered when Cohort Modeler drops the extract file into its designated S3 delivery bucket. Fortunately, setting this up is a simple process:

  • Navigate back to the Lambda Functions page. For this particular Lambda script S3TriggerCohortIngest, choose the + Add trigger from the Function overview section.
    • From the Trigger configuration dropdown, select S3 as the source.
    • Under Bucket, enter or select the bucket you’ve chosen for Cohort Modeler extract delivery. (Created in the previous blog.)
    • Leave Event type as “All object create events”.
    • Leave both Prefix and Suffix blank.
    • Check the box that acknowledges that using the same bucket for input and output is not recommended, as it can increase Lambda usage and thereby costs.
    • Finally, choose Add.
    • Lambda will add the appropriate permissions to invoke the trigger when an object is created in the S3 bucket.
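
For orientation, the sketch below shows the part of the S3 event that the function reads, and why the object key is URL-decoded before use. The sample event is trimmed to the fields used here, the bucket name is a placeholder, and extract_s3_location is a hypothetical helper that mirrors the first lines of lambda_handler.

```python
from urllib.parse import unquote_plus


def extract_s3_location(event):
    """Return (bucket, key) from the first record of an S3 event notification."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded, with spaces encoded as "+".
    key = unquote_plus(record["object"]["key"], encoding="utf-8")
    return bucket, key


# Trimmed sample event with a placeholder bucket name.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-cm-export-bucket"},
                "object": {"key": "export/ea_atrisk_2_2023-09-12_13-57-06.json"},
            }
        }
    ]
}

bucket, key = extract_s3_location(sample_event)
```

A dictionary like sample_event can also be pasted into the Lambda console’s Test tab, provided the bucket and key point at a real export file.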
Test the Lambda

The best way to test the end to end process is to simply connect to the endpoint you created in the first step of the process and send it a valid query. I personally use Postman, but you can use curl or any other HTTP tool to send the request.

Again, refer back to your prior work to determine the HTTP API endpoint for your Cohort Modeler’s cohort extract endpoint, and then send it the following query:

https://YOUR-ENDPOINT.execute-api.YOUR-REGION.amazonaws.com/Prod/data/cohort/ea_atrisk?threshold=2

You should receive back a response that looks something like this:

{'statusCode': 200, 'body': 'export/ea_atrisk_2_2023-09-12_13-57-06.json'}

The Status code confirms that the request was successful, and the body provides the name of the export file which was created.
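
If you prefer a script to Postman or curl, the same request can be sent from Python’s standard library. This is a minimal sketch: cohort_extract_url and request_extract are hypothetical helpers, and it assumes the endpoint accepts unauthenticated GET requests, which may not match your deployment.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def cohort_extract_url(api_id, region, cohort="ea_atrisk", threshold=2, stage="Prod"):
    """Build the cohort extract URL in the format shown above."""
    base = "https://{}.execute-api.{}.amazonaws.com/{}/data/cohort/{}".format(
        api_id, region, stage, cohort
    )
    return base + "?" + urlencode({"threshold": threshold})


def request_extract(url, timeout=30):
    """Send the GET request and return the raw response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (replace the placeholders with your own API ID and region):
#   print(request_extract(cohort_extract_url("YOUR-ENDPOINT", "YOUR-REGION")))
```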

  • From the AWS console, navigate to the S3 Dashboard, and select the S3 Bucket you assigned to Cohort Modeler exports. You should see a JSON file corresponding to the response from your API call in that bucket.
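  • Still in S3, navigate back and select the S3 bucket you assigned as your Pinpoint Import bucket. You should find a newly created CSV file in that bucket. Note that the Lambda function saves it under a randomly generated UUID filename, so its name will not match the export file.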
  • Finally, navigate to the Pinpoint dashboard and choose your Project.
  • From the navigation pane, select Segments. You should see a segment name which directly corresponds to the CSV file which you located in the Pinpoint Import bucket.

If these three steps are complete, then the outbound arm of the Community Engagement Flywheel is functional. All that’s left now is to test the Segment by using it in a Campaign.
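
Because the import runs asynchronously, a segment may take a moment to appear. The sketch below polls the import job until it finishes, using the ImportId returned by the Lambda function. wait_for_import is a hypothetical helper, and the polling interval and limit are arbitrary choices.

```python
import time


def wait_for_import(client, application_id, job_id, poll_seconds=5, max_polls=60):
    """Poll a Pinpoint import job until it reaches COMPLETED or FAILED."""
    for _ in range(max_polls):
        response = client.get_import_job(ApplicationId=application_id, JobId=job_id)
        status = response["ImportJobResponse"]["JobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("import job did not finish in time")


# Usage (requires AWS credentials):
#   import boto3
#   status = wait_for_import(boto3.client("pinpoint"), "YOUR-PROJECT-ID", "YOUR-IMPORT-ID")
```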

Create an email template

To send your recipients a message, you’ll need a message template. In this section, we’ll walk you through creating one. The Pinpoint Template Editor is a simple HTML editor, but third-party services such as visual designers can integrate directly with Pinpoint, connecting the design tool and Pinpoint seamlessly.

  • From the navigation pane of the Pinpoint console, choose Message templates, and then select Create template.
  • Leave the Channel set to Email, and under Template name, enter a unique and memorable name.
  • Under Subject, enter a subject line. We used ‘Happy Video Game Day!’, but you can use whatever you would like.
  • Locate and copy the contents of EmailTemplate.html, and paste the contents into the Message section of the form.
  • Finally, choose Create, and your Template will now be available for use.

Create & Send the Pinpoint Campaign

For the final step, you will create and send a campaign to the endpoints included in the Segment that the Community Engagement Flywheel created. Earlier, you mapped three email addresses to the identities that Cohort Modeler generated for your query: your email, and two test emails from the SES Mailbox Simulator. As a result, once you’ve completed this process, you should receive one email at the address you selected, as well as events which indicate the status of all campaign activities.

  • In the navigation bar of the Pinpoint console, choose All projects, and select the project you’ve created for this exercise.
  • From the navigation pane, choose Campaigns, and then Create a campaign at the top of the page.
  • On the Create a campaign page, give your campaign a name, highlight Standard campaign, and choose Email for the Channel. To proceed, choose Next.
  • On the Choose a segment page, highlight Use an existing segment, and from the Segment dropdown, select the segment .csv that was created earlier. Once selected, choose Next.
  • On the Create your message page, you have two tasks:
    • You’re going to use the email template you created in the prior step, so in the Email template section, under Template name, select Choose a template, followed by the template you created, and finally Choose template.
    • In the Email settings section, ensure you’ve selected the sender email address you verified previously when you initially created the Pinpoint project.
    • Choose Next.
  • On the Choose when to send the campaign page, ensure Immediately is highlighted for when you want the campaign to be sent. Scroll down and choose Next.
  • Finally, on the Review and launch page, verify your selections as you scroll down the page, and finally Launch campaign.

Check your inbox! You will shortly receive the email, and this confirms the Campaign has been successfully sent.

Conclusion

So far, you’ve extended the Cohort Modeler to report on the cohorts it’s built for you, operated on that extract and built an ETL machine to turn that cohort into relevant contact and personalization data, imported the contact data into Pinpoint as a static Segment, and created a Pinpoint Campaign with that Segment to send messaging to that Cohort.

In the next and final blog post, we’ll show how to respond to events that result from your cohort members interacting with the messaging they’ve been sent, and how to enrich the cohort data with those events so you can understand in deeper detail how your messaging works – or doesn’t work – with your cohort members.

About the Authors

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Brett Ezell

Brett Ezell is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. As a Navy veteran, he joined AWS in 2020 through an AWS technical military apprenticeship program. When he isn’t deep diving into solutions for customer challenges, Brett spends his time collecting vinyl, attending live music, and training at the gym. An admitted comic book nerd, he feeds his addiction every Wednesday by combing through his local shop for new books.

Build Better Engagement using the AWS Community Engagement Flywheel: Part 1 of 3

Post Syndicated from Tristan Nguyen original https://aws.amazon.com/blogs/messaging-and-targeting/build-better-engagement-using-the-aws-community-engagement-flywheel-part-1-of-3/

Introduction

Part 1 of 3: Extending the Cohort Modeler

Businesses are constantly looking for better ways to engage with customer communities, but it’s hard to do when profile data is limited to user-completed form input or messaging campaign interaction metrics. Neither of these data sources tells a business much about its customers’ interests or preferences when they’re engaging with that community.

To bridge this gap for their community of customers, AWS Game Tech created the Cohort Modeler: a deployable solution for developers to map out and classify player relationships and identify like behavior within a player base. Additionally, the Cohort Modeler allows customers to aggregate and categorize player metrics by leveraging behavioral science and customer data.

In this series of three blog posts, you’ll learn how to:

  1. Extend the Cohort Modeler’s functionality to provide reporting functionality.
  2. Use Amazon Pinpoint, the Digital User Engagement Events Database (DUE Events Database), and the Cohort Modeler together to group your customers into cohorts based on that data.
  3. Interact with them through automation to send meaningful messaging to them.
  4. Enrich their behavioral profiles via their interaction with your messaging.

In this blog post, we’ll show how to extend Cohort Modeler’s functionality to include and provide cohort reporting and extraction.

Use Case Examples for The Cohort Modeler

For this example, we’re going to retrieve a cohort of individuals from our Cohort Modeler who we’ve identified as at risk:

  • Maybe they’ve triggered internal alarms where they’ve shared potential PII with others over cleartext.
  • Maybe they’ve joined chat channels known to be frequented by some of the game’s less upstanding citizens.

Either way, we want to make sure they understand the risks of what they’re doing and who they’re dealing with.

Because the Cohort Modeler’s API automatically translates the data it’s provided into the graph data format, the request we’re making is an easy one: we’re simply asking CM to retrieve all of the player IDs where the player’s ea_atrisk attribute value is greater than 2.

In our case, that either means

  1. They’ve shared PII at least twice, or shared PII at least once.
  2. Joined the #give-me-your-credit-card chat channel, which is frequented by real-life scammers.

These are currently the only two activities which generate at-risk data in our example model.

Architecture overview

In this example, you’ll extend Cohort Modeler’s functionality by creating a new API resource and method, and test that functional extension to verify it’s working. This supports our use case by providing a studio with a mechanism to identify the cohort of users who have engaged in activities that may put them at risk for fraud or malicious targeting.

[Figure: Cohort Modeler extension architecture]

Prerequisites

This blog post series integrates two tech stacks: the Cohort Modeler and the Digital User Engagement Events Database, both of which you’ll need to install. In addition to setting up your environment, you’ll need to clone the Community Engagement Flywheel repository, which contains the scripts you’ll need to use to integrate Cohort Modeler and Pinpoint.

You should have the following prerequisites:

Walkthrough

Extending the Cohort Modeler

In order to meet our functional requirements, we’ll need to extend the Cohort Modeler API. This first part will walk you through the mechanisms to do so. In this walkthrough, you’ll:

  • Create an Amazon Simple Storage Service (Amazon S3) bucket to accept exports from the Cohort Modeler
  • Create an AWS Lambda Layer to support Python operations for Cohort Modeler’s Gremlin interface to the Amazon Neptune database
  • Build a Lambda function to respond to API calls requesting cohort data, and
  • Integrate the Lambda with the Amazon API Gateway.

The S3 Export Bucket

Normally it’d be enough to just create the S3 Bucket, but because our Cohort Modeler operates inside an Amazon Virtual Private Cloud (VPC), we need to both create the bucket and create a gateway endpoint for S3 inside that VPC.

Create the Bucket

The size of a Cohort Modeler extract could be considerable depending on the size of a cohort, so it’s a best practice to deliver the extract to an S3 bucket. All you need to do in this step is create a new S3 bucket for Cohort Modeler exports.

  • Navigate to the S3 Console page, and inside the main pane, choose Create Bucket.
  • In the General configuration section, enter a bucket name, remembering that its name must be unique across all of AWS.
  • You can leave all other settings at their default values, so scroll down to the bottom of the page and choose Create Bucket. Remember the name – I’ll be referring to it as your “CM export bucket” from here on out.

Create S3 Gateway endpoint

When accessing “global” services, like S3 (as opposed to VPC services, like EC2) from inside a private VPC, you need to create an Endpoint for that service inside the VPC. For more information on how Gateway Endpoints for Amazon S3 work, refer to this documentation.

  • Open the Amazon VPC console.
  • In the navigation pane, under Virtual private cloud, choose Endpoints.
  • In the Endpoints pane, choose Create endpoint.
  • In the Endpoint settings section, under Service category, select AWS services.
  • In the Services section, under find resources by attribute, choose Type, and select the filter Type: Gateway and select com.amazonaws.region.s3.
  • In the VPC section, select the VPC in which to create the endpoint.
  • In the Route tables section, select the route tables to be used by the endpoint. We automatically add a route that points traffic destined for the service to the endpoint.
  • In the Policy section, select Full access to allow all operations by all principals on all resources over the VPC endpoint. Otherwise, select Custom to attach a VPC endpoint policy that controls the permissions that principals have to perform actions on resources over the VPC endpoint.
  • (Optional) To add a tag, choose Add new tag in the Tags section and enter the tag key and the tag value.
  • Choose Create endpoint.
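
If you prefer to script this step, the console choices above map onto the EC2 create_vpc_endpoint call in boto3. The following is a minimal sketch, not the Cohort Modeler's own tooling: the function names and the placeholder VPC and route table IDs are assumptions, and the AWS call itself needs boto3 and valid credentials. Omitting a policy document corresponds to the Full access option.

```python
def gateway_endpoint_params(vpc_id, route_table_ids, region):
    """Build keyword arguments for EC2 create_vpc_endpoint (S3 gateway endpoint)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, route_table_ids, region):
    """Create the endpoint and return its ID; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        **gateway_endpoint_params(vpc_id, route_table_ids, region)
    )
    return response["VpcEndpoint"]["VpcEndpointId"]


# Placeholder IDs, for illustration only.
params = gateway_endpoint_params(
    "vpc-0123456789abcdef0", ["rtb-0123456789abcdef0"], "us-east-1"
)
print(params["ServiceName"])
```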

Create the VPC Endpoint Security Group

The endpoint also needs to know which network interfaces to accept connections from, so we’ll need to create a Security Group to establish that trust.

  • Navigate to the Amazon VPC console and, in the navigation pane, under Security, choose Security groups.
  • In the Security Groups pane choose Create security group.
  • Under the Basic details section, name your security group S3 Endpoint SG.
  • Under the Inbound rules section, choose Add rule.
    • Under Type, select All traffic.
    • Under Source, leave Custom selected.
    • For the Custom Source, open the dropdown and choose the prefix list for the S3 gateway endpoint (in us-east-1, this should be named pl-63a5400a).
    • Repeat the process for Outbound rules.
    • When finished, choose Create security group.

Creating a Lambda Layer

You can use the code as provided in a Lambda, but the gremlin libraries required for it to run are another story: gremlin_python doesn’t come as part of the default Lambda libraries. There are two ways to address this:

  • You can upload the libraries with the code in a .zip file; this will work, but it will mean the Lambda isn’t editable via the built-in editor, which isn’t a scalable technique (and makes debugging quick changes a real chore).
  • You can create a Lambda Layer, upload those libraries separately, and then associate them with the Lambda you’re creating.

The Layer is a best practice, so that’s what we’re going to do here.

Creating the zip file

In Python, you’ll need to upload a .zip file to the Layer, and all of your libraries need to be included in paths within the /python directory (inside the zip file) to be accessible. Use pip to install the libraries you need into a blank directory so you can zip up only what you need, and no more.

  • Create a new subdirectory in your user directory,
  • Create a /python subdirectory,
  • Invoke pip3 with the --target option:
pip3 install --target=./python gremlinpython

Then zip the python folder itself, not just its contents: the resulting file should be named python.zip and should extract to a python folder.
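
If you would rather build the archive from a script, the sketch below uses only the Python standard library. It assumes the directory you pass in is literally named python, so every archive entry ends up under python/ as the layer requires. The helper names zip_layer and layer_entries_ok are hypothetical.

```python
import os
import zipfile


def zip_layer(python_dir, zip_path):
    """Zip python_dir so that every archive entry starts with 'python/'."""
    python_dir = os.path.abspath(python_dir)
    parent = os.path.dirname(python_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(python_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the parent so 'python/' is the top-level folder.
                arcname = os.path.relpath(full_path, parent).replace(os.sep, "/")
                archive.write(full_path, arcname)
    return zip_path


def layer_entries_ok(zip_path):
    """Check that the archive is non-empty and everything sits under python/."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return bool(names) and all(n.startswith("python/") for n in names)


# Example usage, run from the directory that contains ./python:
# zip_layer("./python", "python.zip")
# print(layer_entries_ok("python.zip"))
```

Checking the entry names before uploading catches the most common mistake, which is zipping the contents of the folder instead of the folder itself.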

Creating the Layer

Head to the Lambda console, and select the Layers menu option from the AWS Lambda navigation pane. From there:

  • Choose Create layer in the Layers section.
  • Give it a relevant name, like gremlinpython.
  • Select Upload a .zip file and upload the zip file you just created.
  • For Compatible architectures, select x86_64.
  • Select Python 3.8 as your runtime.
  • Choose Create.

Assuming all steps have been followed, you’ll receive a message that the layer has been successfully created.

Building the Lambda

You’ll be extending the Cohort Modeler with new functionality, and the way CM manages its functionality is via microservice-based Lambdas. You’ll be building a new API: to query the CM and extract Cohort information to S3.

Create the Lambda

Head back to the Lambda service menu and, in the Resources for (your region) section, choose Create Function. From there:

  • On the Create function page select Author from scratch.
  • For Function Name enter ApiCohortGet for consistency.
  • For Runtime choose Python 3.8.
  • For Architectures, select x86_64.
  • Under the Advanced Settings pane select Enable VPC – you’re going to need this Lambda to query Cohort Modeler’s Neptune database, which has VPC endpoints.
    • Under VPC select the VPC created by the Cohort Modeler installation process.
    • Select all subnets in the VPC.
    • Select the security group labeled as the Security Group for API Lambda functions (also installed by CM)
    • Also select the S3 Endpoint SG security group we created earlier; this allows the Lambda function hosted inside the VPC to access the S3 bucket.
  • Choose Create Function.
  • In the Code tab, within the Code source window, delete all of the sample code and replace it with the code below. This Python script will allow you to query Cohort Modeler for cohort extracts.
import os
import json
import boto3
from datetime import datetime
from gremlin_python import statics
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.protocol import GremlinServerError
from gremlin_python.driver import serializer
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.process.traversal import T, P
from aiohttp.client_exceptions import ClientConnectorError
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client('s3')

def query(g, cohort, thresh):
    return (g.V().hasLabel('player')
            .has(cohort, P.gt(thresh))
            .valueMap("playerId", cohort)
            .toList())

def doQuery(g, cohort, thresh):
    return query(g, cohort, thresh)

# Lambda handler
def lambda_handler(event, context):
    
    # Connection instantiation
    conn = create_remote_connection()
    g = create_graph_traversal_source(conn)
    try:
        # Validate the cohort info here if needed.

        # Grab the event resource, method, and parameters.
        resource = event["resource"]
        method = event["httpMethod"]
        pathParameters = event["pathParameters"]

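        # Grab query parameters. We should have two: cohort and threshold.
        # API Gateway passes None (not an empty dict) when a request has no
        # query string, so fall back to an empty dict before calling .get().
        queryParameters = event.get("queryStringParameters") or {}

        cohort_val = pathParameters.get("cohort")
        thresh_val = int(queryParameters.get("threshold", 0))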

        result = doQuery(g, cohort_val, thresh_val)

        
        # Convert result to JSON
        result_json = json.dumps(result)
        
        # Generate the current timestamp in the format YYYY-MM-DD_HH-MM-SS
        current_timestamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
        
        # Create the S3 key with the timestamp
        s3_key = f"export/{cohort_val}_{thresh_val}_{current_timestamp}.json"

        # Upload to S3
        s3_result = s3.put_object(
            Bucket=os.environ['S3ExportBucket'],
            Key=s3_key,
            Body=result_json,
            ContentType="application/json"
        )
        response = {
            'statusCode': 200,
            'body': s3_key
        }
        return response

    except Exception as e:
        logger.error(f"Error occurred: {e}")
        return {
            'statusCode': 500,
            'body': str(e)
        }

    finally:
        conn.close()

# Connection management
def create_graph_traversal_source(conn):
    return traversal().withRemote(conn)

def create_remote_connection():
    database_url = 'wss://{}:{}/gremlin'.format(os.environ['NeptuneEndpoint'], 8182)
    return DriverRemoteConnection(
        database_url,
        'g',
        pool_size=1,
        message_serializer=serializer.GraphSONSerializersV2d0()
    )
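
Before wiring the function into API Gateway, it can help to exercise the request parsing and export key naming locally, without Neptune or S3. The sketch below mirrors the handler's logic under one assumption worth stating: with Lambda proxy integration, API Gateway sends null for queryStringParameters when the request has no query string. The helpers parse_request and build_s3_key and the trimmed sample event are hypothetical and not part of Cohort Modeler.

```python
from datetime import datetime


def parse_request(event):
    """Extract (cohort, threshold) from an API Gateway proxy event."""
    # API Gateway sends null (None) for these fields when they are absent.
    path_params = event.get("pathParameters") or {}
    query_params = event.get("queryStringParameters") or {}
    cohort = path_params.get("cohort")
    if not cohort:
        raise ValueError("missing 'cohort' path parameter")
    threshold = int(query_params.get("threshold", 0))
    return cohort, threshold


def build_s3_key(cohort, threshold, now):
    """Mirror the export key format used by the handler above."""
    return f"export/{cohort}_{threshold}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.json"


# A trimmed-down proxy event containing only the fields the handler reads.
sample_event = {
    "resource": "/data/cohort/{cohort}",
    "httpMethod": "GET",
    "pathParameters": {"cohort": "ea_atrisk"},
    "queryStringParameters": {"threshold": "2"},
}

cohort, threshold = parse_request(sample_event)
print(build_s3_key(cohort, threshold, datetime(2023, 9, 12, 13, 57, 6)))
```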

Configure the Lambda

Head back to the Lambda service page, and from the navigation pane, select Functions. In the Functions section, select ApiCohortGet from the list.

  • In the Function overview section, select the Layers icon beneath your Lambda name.
  • In the Layers section, choose Add a layer.
  • In the Choose a layer section, set Layer source to Custom layers.
  • From the dropdown menu below, select your recently created custom layer, gremlinpython.
  • For Version, select the appropriate (probably the highest, or most recent) version.
  • Once finished, choose Add.

Now, underneath the Function overview, navigate to the Configuration tab and choose Environment variables from the navigation pane.

  • Choose Edit to create a new variable. For the key, enter NeptuneEndpoint, and give it the value of the Cohort Modeler’s Neptune Database endpoint. This value is available from the Neptune control panel under Databases. This should not be the read-only cluster endpoint, so select the ‘writer’ type. Once selected, the endpoint URL will be listed beneath the Connectivity & security tab.
  • Create an additional key titled S3ExportBucket, and for the value use the unique name of the S3 bucket you created earlier to receive extracts from Cohort Modeler. Once complete, choose Save.
  • In a production build, you can store this information in AWS Systems Manager Parameter Store to ensure portability and resilience.

While still in the Configuration tab, under the navigation pane choose Permissions.

  • Note that AWS has created an IAM Role for the Lambda. Select the role name to view it in the IAM console.
  • Under the Permissions tab, in the Permissions policies section, there should be two policies attached to the role: AWSLambdaBasicExecutionRole and AWSLambdaVPCAccessExecutionRole.
  • You’ll need to give the Lambda access to your CM export bucket
  • Also in the Permissions policies section, choose the Add permissions dropdown and select Create Inline policy – we won’t be needing this role anywhere else.
  • On the new page, choose the JSON tab.
    • Delete all of the sample code within the Policy editor, and paste the inline policy below into the text area.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Action": "s3:*",
                  "Resource": [
                      "arn:aws:s3:::YOUR-S3-BUCKET-NAME-HERE",
                      "arn:aws:s3:::YOUR-S3-BUCKET-NAME-HERE/*"
                  ]
              }
          ]
      }
  • Replace the placeholder YOUR-S3-BUCKET-NAME-HERE with the name of your CM export bucket.
  • Click Review Policy.
  • Give the policy a name – I used ApiCohortGetS3Policy.
  • Click Create Policy.
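
To avoid copy-and-paste slips in the bucket ARNs, you can generate the policy document instead of editing it by hand. This is a minimal sketch: the function name and the bucket name my-cm-export-bucket are placeholders. Narrowing the action to s3:PutObject is an assumption based on the Lambda above only calling put_object, so verify it against your own usage before relying on it.

```python
import json


def export_bucket_policy(bucket_name, actions=("s3:*",)):
    """Build the inline policy document for the CM export bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# Placeholder bucket name; the narrower action is an assumption to verify.
policy = export_bucket_policy("my-cm-export-bucket", actions=("s3:PutObject",))
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the policy editor in place of the sample above.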

Integrating with API Gateway

Now you’ll need to integrate the API Gateway that the Cohort Modeler created with the new Lambda function that you just created. If you’re on the old console user interface, we strongly recommend switching over to the new console UI, because the previous UI was scheduled for deprecation on October 30, 2023. Consequently, the following instructions apply to the new console UI.

  • Navigate to the main service page for API Gateway.
  • From the navigation pane, choose Use the new console.

APIGatewayNewConsole

Create the Resource

  • From the new console UI, select the name of the API Gateway from the APIs Section that corresponds to the name given when you launched the SAM template.
  • On the Resources navigation pane, choose /data, followed by selecting Create resource.
  • Under Resource name, enter cohort, followed by Create resource.

CreateNewResource

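We’re not quite finished. We want to be able to ask the Cohort Modeler to give us a cohort based on a path parameter, so that when we go to /data/cohort/COHORT-NAME/ we receive back information about the cohort name that we provided. Therefore, select the new /data/cohort resource, choose Create resource again, and enter {cohort} as the Resource name; the braces mark it as a path parameter.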

Create the Method

CreateMethod

Now we’ll create the GET Method we’ll use to request cohort data from Cohort Modeler.

  • From the same menu, choose the /data/cohort/{cohort} Resource, followed by selecting Get from the Methods dropdown section, and finally choosing Create Method.
  • From the Create method page, select GET under Method type, and select Lambda function under the Integration type.
  • For the Lambda proxy integration, turn the toggle switch on.
  • Under Lambda function, choose the function ApiCohortGet, created previously.
  • Finally, choose Create method.
  • API Gateway will prompt and ask for permissions to access the Lambda – this is fine, choose OK.

Create the API Key

You’ll want to access your API securely, so at a minimum you should create an API Key and require it as part of your access model.

CreateAPIKey

  • Under the API Gateway navigation pane, choose APIs. From there, select API Keys, also under the navigation pane.
  • In the API keys section, choose Create API key.
  • On the Create API key page, enter your API Key name, while leaving the remaining fields at their default values. Choose Save to complete.
  • Returning to the API keys section, select and copy the link for the API key which was generated.
  • Once again, select APIs from the navigation menu, and continue again by selecting the link to your CM API from the list.
  • From the navigation pane, choose API settings, folded under your API name, and not the Settings option at the bottom of the tab.

  • In the API details section, choose Edit under API details. Once on the Edit API settings page, ensure the Header option is selected under API key source.

Deploy the API

Now that you’ve made your changes, you’ll want to deploy the API with the new endpoint enabled.

  • Back in the navigation pane, under your CM API’s dropdown menu, choose Resources.
  • On the Resources page for your CM API, choose Deploy API.
  • Select the Prod stage (or create a new stage name for testing) and click Deploy.

Test the API

When the API has deployed, the system will display your API’s URL. You should now be able to test your new Cohort Modeler API:

  • Using your favorite tool (curl, Postman, etc.) create a new request to your API’s URL.
    • The URL should look like https://randchars.execute-api.us-east-1.amazonaws.com/Stagename. You can retrieve your API Gateway endpoint URL by selecting API settings in the navigation pane, under your CM API’s dropdown menu.
    • On the API settings page, under Default endpoint, you will see your active API Gateway endpoint URL. Remember to add the stage name (for example, “Prod”) at the end of the URL.

    • Be sure you’re adding a header named X-API-Key to the request, and give it the value of the API key you created earlier.
    • Add the /data/cohort resource to the end of the URL to access the new endpoint.
    • Add /ea_atrisk after /data/cohort – you’re querying for the cohort of players who belong to the at-risk cohort.
    • Finally, add ?threshold=2 so that we’re only looking at players whose cohort value (in this case, the number of times they’ve shared personally identifiable information) is greater than 2. The final URL should look something like: https://randchars.execute-api.us-east-1.amazonaws.com/Stagename/data/cohort/ea_atrisk?threshold=2
  • Once you’ve submitted the query, your response should look like this:
{'statusCode': 200, 'body': 'export/ea_atrisk_2_2023-09-12_13-57-06.json'}

The status code indicates a successful query, and the body indicates the name of the JSON file in your extract S3 bucket which contains the cohort information. The name consists of the attribute, the threshold level, and the time the export was made. Go ahead and navigate to the S3 bucket, find the file, and download it to see what Cohort Modeler has found for you.
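
If you would rather script the test than use curl or Postman, the request can be assembled with the Python standard library. The sketch below only builds the request; the base URL and API key are placeholders, and build_cohort_request is a hypothetical helper name.

```python
import urllib.request


def build_cohort_request(base_url, cohort, threshold, api_key):
    """Build (but do not send) the GET request for the cohort endpoint."""
    url = f"{base_url.rstrip('/')}/data/cohort/{cohort}?threshold={threshold}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key}, method="GET")


# Placeholder URL and key; substitute your own stage URL and API key value.
request = build_cohort_request(
    "https://randchars.execute-api.us-east-1.amazonaws.com/Stagename",
    "ea_atrisk",
    2,
    "YOUR-API-KEY",
)
print(request.full_url)

# To actually send it (requires network access and a deployed API):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```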

Troubleshooting

Installing the Game Tech Cohort Modeler

  • Error: Could not find public.ecr.aws/sam/build-python3.8:latest-x86_64 image locally and failed to pull it from docker
    • Try: docker logout public.ecr.aws.
    • Attempt to pull the docker image locally first: docker pull public.ecr.aws/sam/build-python3.8:latest-x86_64
  • Error: RDS does not support creating a DB instance with the following combination:DBInstanceClass=db.r4.large, Engine=neptune, EngineVersion=1.2.0.2, LicenseModel=amazon-license.
    • The default option r4 family was offered when Neptune was launched in 2018, but now newer instance types offer much better price/performance. As of engine version 1.1.0.0, Neptune no longer supports r4 instance types.
    • Therefore, we recommend choosing another Neptune instance based on your needs, as detailed on this page.
      • For testing and development, you can consider the t3.medium and t4g.medium instances, which are eligible for Neptune free-tier offer.
      • Remember to add the instance type that you want to use to the AllowedValues attribute of DBInstanceClass, and rebuild using sam build --use-container

Using the data gen script (for automated data generation)

  • By default, the Cohort Modeler deployment does not deploy CohortModelerGraphGenerator.ipynb, which is required for dummy data generation.
  • You will need to log in to your SageMaker instance, upload the CohortModelerGraphGenerator.ipynb file, and run through the cells to generate the dummy data into your S3 bucket.
  • Finally, you’ll need to follow the instructions on this page to load the dummy data from Amazon S3 into your Neptune instance.
    • For the IAM role for Amazon Neptune to load data from Amazon S3, the stack should have created a role with the name Cohort-neptune-iam-role-gametech-modeler.
    • You can run the requests script from your Jupyter notebook instance, since it already has access to the Amazon Neptune endpoint. The Python script should look like the one below:
import requests
import json

url = 'https://<NeptuneEndpointURL>:8182/loader'

headers = {
    'Content-Type': 'application/json'
}

data = {
    "source": "<S3FileURI>",
    "format": "csv",
    "iamRoleArn": "<NeptuneIAMRoleARN>",
    "region": "us-east-1",
    "failOnError": "FALSE",
    "parallelism": "MEDIUM",
    "updateSingleCardinalityProperties": "FALSE",
    "queueRequest": "TRUE"
}

response = requests.post(url, headers=headers, data=json.dumps(data))

print(response.text)

    • Remember to replace the NeptuneEndpointURL, S3FileURI, and NeptuneIAMRoleARN.
    • Remember to load user_vertices.csv, campaign_vertices.csv, action_vertices.csv, interaction_edges.csv, engagement_edges.csv, campaign_edges.csv, and campaign_bidirectional_edges.csv in that order.
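
Because the files must be loaded in a fixed order, it can be convenient to generate the loader request bodies in a loop rather than editing the script seven times. The following is a minimal sketch under stated assumptions: the S3 prefix and role ARN are placeholders, loader_payloads is a hypothetical helper, and the body fields are copied from the script above.

```python
import json

# Load order taken from the list above: vertices first, then edges.
LOAD_ORDER = [
    "user_vertices.csv",
    "campaign_vertices.csv",
    "action_vertices.csv",
    "interaction_edges.csv",
    "engagement_edges.csv",
    "campaign_edges.csv",
    "campaign_bidirectional_edges.csv",
]


def loader_payloads(s3_prefix, iam_role_arn, region="us-east-1"):
    """Yield (file name, JSON body) pairs for the Neptune loader, in load order."""
    for name in LOAD_ORDER:
        body = {
            "source": f"{s3_prefix.rstrip('/')}/{name}",
            "format": "csv",
            "iamRoleArn": iam_role_arn,
            "region": region,
            "failOnError": "FALSE",
            "parallelism": "MEDIUM",
            "updateSingleCardinalityProperties": "FALSE",
            "queueRequest": "TRUE",
        }
        yield name, json.dumps(body)


# Placeholder bucket and role ARN, for illustration only.
for name, body in loader_payloads(
    "s3://my-data-bucket/cohort-data",
    "arn:aws:iam::123456789012:role/Cohort-neptune-iam-role-gametech-modeler",
):
    print(name)
    # requests.post(url, headers=headers, data=body)  # as in the script above
```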

Conclusion

In this post, you’ve extended the Cohort Modeler to respond to requests for cohort data, by both querying the cohort database and providing an extract in an S3 bucket for future use. In the next post, we’ll demonstrate how creating this file triggers an automated process. This process will identify the players from the cohort in the studio’s database, extract their contact and other personalization data, compile the data into a CSV file, and import that file into Pinpoint for targeted messaging.

Related Content

About the Authors

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Brett Ezell

Brett Ezell is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. As a Navy veteran, he joined AWS in 2020 through an AWS technical military apprenticeship program. When he isn’t deep diving into solutions for customer challenges, Brett spends his time collecting vinyl, attending live music, and training at the gym. An admitted comic book nerd, he feeds his addiction every Wednesday by combing through his local shop for new books.

AWS Graviton4 is an Even Bigger Arm Server Processor and Tranium2 for AI

Post Syndicated from Cliff Robinson original https://www.servethehome.com/aws-graviton4-is-an-even-bigger-arm-server-processor-and-tranium2-for-ai-nvidia/

Today AWS made the much-anticipated announcement of Graviton4 which should be available in 2024. This is AWS’s latest Graviton processor and the fourth generation launched in the last five years. The company also announced its second-generation Tranium2 processor for AI workloads. AWS Graviton4 is an Even Bigger Arm Server Processor AWS is continuing on its […]

The post AWS Graviton4 is an Even Bigger Arm Server Processor and Tranium2 for AI appeared first on ServeTheHome.

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

Post Syndicated from Pauline Logan original https://blog.rapid7.com/2023/11/28/updates-to-layered-context-enable-teams-to-quickly-understand-which-risk-signals-are-most-pressing/

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

Layered Context introduced a consolidated view of all security risks InsightCloudSec collects from the various layers of a cloud environment. This enabled our customers to go from visibility into individual security risks on a resource to understanding all of the risks that impact that resource and the overall risk of that resource.

For example: let’s take a cloud resource that has a port left open to the public.

With this level of detail it is pretty challenging to identify the risk level, because we don’t know enough about the resource in question, or even if it was supposed to be opened to the public or not. It’s not that this isn’t risky, we just need to know more to evaluate just how risky it is. As we add more context, we start to get a clearer picture: the environment the resource is running in, if it is connected to a business critical application, does it have any known vulnerabilities, are there identities with elevated permissions associated with the resource, etc.

By layering together all of this context, customers are able to effectively understand the actual risk associated with each and every one of their resources – in real-time. This is of course helpful information to have in one consolidated view, but even still it can be difficult to sift through potentially thousands of resources and prioritize the work that needs to be done to secure each one. To that end, we are excited to introduce a new risk score in Layered Context, which analyzes all the signals and context we know about a given cloud resource and automatically assigns a score and a severity, making it easy for our customers to understand the riskiest resources they should focus on.

Prioritizing Risk By Focusing on Toxic Combinations

Much like Layered Context itself, the new risk score combines a variety of risk signals, assigning a higher risk score to resources that suffer from toxic combinations, or multiple risk vectors that compound to present an increased likelihood or impact of compromise.

The risk score takes into account:

  • Business Criticality, with an understanding of what applications the resource is associated with such as a crown-jewel or revenue generating app
  • Public Accessibility, both from a network perspective as well as via user permissions (more on that in a second)
  • Potential Attack Paths, to understand how a bad actor could move laterally across your inter-connected environment
  • Identity-related risk, including excessive and/or unused permissions and privileges
  • Misconfigurations, including whether or not the resource is in compliance with organizational standards
  • Threats to factor in any malicious behavior that has been detected
  • And of course, Vulnerabilities, using Rapid7’s Active Risk model which consumes data on exploitability and active exploitation in the wild

By identifying these toxic combinations, we can ensure the riskiest resources are given the highest priority. Each resource is assigned a score and a severity, making it easy for our customers to see where the riskiest resources exist in their environment and where to focus.

A Clear Understanding of How We Calculate Risk

Alongside our risk score, we are introducing a new view to break down all of the reasons why a resource has been scored the way it has. This view gives an overview of the most important information our customers need to know and clearly summarizes the factors that influenced the risk scoring. It reduces the time required to understand why a resource is risky, so security teams can focus on remediating the risks.

A Bit More on How we Determine Public Accessibility

As mentioned previously, the basis of much of our risk calculation in cloud resources stems from a simple question: “is this resource publicly accessible?” This is a critical detail in determining relative risk, but can be very difficult to ascertain given the complex and ephemeral nature of cloud environments. To address this, we’ve invested significant time and effort to ensure we’re assessing public accessibility as accurately as possible but also explaining why we’ve determined it that way, so it’s much easier to take remediation action. This determination can easily be viewed on a per resource basis from the Layered Context page.

We have lots of exciting releases coming up in the next few months. Alongside risk scoring, we are also extending our Attack Path Analysis feature to show the Blast Radius of an attack with improved topology visualizations. This will give our customers visibility not only into how an attacker could exploit a given resource but also into the potential for lateral movement between interconnected resources. Additionally, we’ll be updating the way we validate and show proof of public accessibility. Should a resource be publicly accessible, you will be able to easily view the proof details, which will show exactly which combination of configurations is resulting in the resource being publicly accessible.

The new risk scoring capabilities in Layered Context will be on display at AWS re:Invent next week. Be sure to stop by booth #1270 to see it in action!

Simplify your SMS setup with the new Amazon Pinpoint SMS console

Post Syndicated from hamzarau original https://aws.amazon.com/blogs/messaging-and-targeting/send-sms-using-the-new-amazon-pinpoint-sms-console/

Amazon Pinpoint is a multichannel communication service that helps application developers engage their customers through communication channels such as SMS or text messaging, email, mobile push, voice, and in-app messaging.

Amazon Pinpoint SMS provides the global scale, resiliency, and flexibility required to deliver SMS and voice messaging in web, mobile, or business applications. SMS messaging is used for use cases like one-time passcode validation, time-sensitive alerts, and two-way chat due to its global reach and ubiquity. Today Amazon Pinpoint SMS sends messages to over 240 countries and regions. In this post, we will review how to use the new Pinpoint SMS management console to get your SMS resources set up correctly the first time.

This blog walks through the setup and configuration steps for Pinpoint SMS using the management console. Additionally, all setup and configurations can also be completed using Pinpoint SMS APIs. For more information visit the Pinpoint SMS documentation, or complete the Amazon Pinpoint SMS workshop.

The Pinpoint SMS management console exposes the existing functionality of the Pinpoint SMS APIs for creating and managing your SMS and voice resources. In addition, the Pinpoint SMS console has a Quick start – SMS setup guide and a Request originator flow to guide you through setting up, requesting, and managing your SMS resources.

If you require additional background on how SMS works using Amazon Pinpoint SMS, refer to How to Manage Global Sending of SMS with Amazon Pinpoint. Below are some important SMS concepts we’ll highlight in this blog post.

Important SMS Concepts and Resources

  • Phone pool: The phone pool resource is a collection of phone numbers and sender IDs that all share the same settings and provide failover if a number becomes unavailable.
  • Originator: An originator refers to either a phone number or sender ID.
  • Phone number: Also called originator number, a phone number is a numeric string that identifies the sender. This can be a long code, short code, toll-free number (TFN), or 10-digit long code (10DLC). For more information see choosing a phone number or sender ID.
  • Verified destination phone number: When your account is in Sandbox you can only send SMS messages to phone numbers that have gone through the verification process. The phone number receives an SMS message with a verification code. The received code must be entered into the console to complete the process.
  • Simulator phone number: A simulator phone number behaves as any other origination and destination phone number without sending the SMS message to mobile carriers. Simulator phone numbers do not require registration and are used for testing scenarios.
  • Sender ID: Also called originator ID, a sender ID is an alphanumeric string that identifies the sender. For more information see choosing a phone number or sender ID.
  • Registered phone number: Some countries require you to register your company’s identity before you can purchase phone numbers or sender IDs. They also require a review of the messages that you send to recipients in their country. Registrations are processed by external third parties, so the amount of time to process a registration varies by phone number type and country. After all required registrations are complete, the status of your phone numbers changes to Active and they are available for use. For more information about which countries require registration, see supported countries and regions (SMS channel).

Getting started

Sign in to the AWS Management Console and search for Amazon Pinpoint. If you don’t have an existing AWS account, complete the following steps to create one.

In the Amazon Pinpoint console, you can choose between managing Pinpoint SMS and Pinpoint campaign orchestration. Pinpoint SMS is where application developers go to set up and configure their associated resources for SMS sending through any AWS service. Pinpoint campaign orchestration is for builders who want to manage their customer segments and send messages using campaigns or multi-step journeys. Campaign orchestration uses communication channels like Pinpoint SMS or Amazon Simple Email Service (Amazon SES) to deliver its messages. In this blog, we will discuss how to configure Pinpoint SMS using its management console.

Amazon Pinpoint SMS Console

Quick start – SMS setup guide

Once you’ve selected the Amazon Pinpoint SMS console, you will land on the Overview page. On this page, you get a summary of your SMS resources and the Quick start – SMS setup guide. This guide will walk you through creating the appropriate SMS resources to start sending SMS messages. The steps outlined in the Quick start guide are recommended but not required.

Step 1: Create a phone pool

A phone pool is a collection of phone numbers and sender IDs that all share the same settings and provide failover if a number becomes unavailable. Phone pools improve number resiliency, remove complexity from sending applications, and provide a logical grouping for managing phone numbers and sender IDs. For example, phone pools can be grouped by use case, such as having a dedicated phone pool for OTP (one-time password) messages.

In the navigation pane, under Overview, in the Quick start section, choose Create pool. Under the pool setup section, enter a name for your pool in Pool name. To create a pool, you will need to select an origination identity, either a phone number or sender ID to associate with the pool. Additional origination identities can be added once the pool is created on the Phone pools page. If you don’t have an active phone number or sender ID in your account, we recommend selecting a simulator number, which can be used for testing and does not require any registration. Once you’ve selected an origination identity, you can choose Create phone pool to complete step 1.
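The same pool can also be created from code. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and its pinpoint-sms-voice-v2 client; the phone number ID, country code, and message type are placeholder assumptions, not values from this post. The client is passed in as an argument so the request can be inspected without AWS credentials.

```python
def create_pool_request(origination_identity, iso_country_code="US",
                        message_type="TRANSACTIONAL"):
    """Build the parameters for the CreatePool API call.

    origination_identity is a phone number ID/ARN or sender ID/ARN that
    already exists in the account (a simulator number works for testing).
    """
    return {
        "OriginationIdentity": origination_identity,
        "IsoCountryCode": iso_country_code,
        "MessageType": message_type,  # "TRANSACTIONAL" or "PROMOTIONAL"
    }


def create_phone_pool(client, origination_identity, **options):
    """Create the pool and return its ID.

    client is expected to behave like boto3.client("pinpoint-sms-voice-v2").
    """
    response = client.create_pool(
        **create_pool_request(origination_identity, **options)
    )
    return response["PoolId"]


# Placeholder identity, for illustration only.
print(create_pool_request("phone-0123456789abcdef"))
```

Additional origination identities can then be associated with the returned pool ID, which mirrors adding them on the Phone pools page.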

Setting up phone pools for sending SMS

Step 2: Create a configuration set

A configuration set is a set of rules that are applied when you send a message. For example, a configuration set can specify a destination for events related to a message. When SMS events occur (such as delivery or failure events), they are routed to the destination associated with the configuration set that you specified when you sent the message. You’re not required to use configuration sets when you send messages, but we recommend that you do. We support sending SMS and voice events to Amazon CloudWatch, Amazon Kinesis Data Firehose, and Amazon SNS.

In the navigation pane, under Overview, in the Quick start section, choose Create set. Under the Configuration set details section, enter a name in Configuration set name. For Event Destination setup, choose either the quick start option, which creates an AWS CloudFormation stack that automatically creates and configures CloudWatch, Kinesis Data Firehose, and SNS to log all events, or the advanced option to manually select which event destinations you would like to set up. Once you’ve made the selection, choose Create Configuration set to complete step 2.
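For readers who prefer to script this step, here is a hedged sketch of the advanced option with a single Amazon SNS destination, again assuming the boto3 pinpoint-sms-voice-v2 client. The destination name and the topic ARN are hypothetical, and the SNS topic is assumed to exist already.

```python
def event_destination_request(configuration_set_name, sns_topic_arn,
                              destination_name="all-sms-events"):
    """Parameters for CreateEventDestination that publish every SMS event
    type to an SNS topic. destination_name is a hypothetical label."""
    return {
        "ConfigurationSetName": configuration_set_name,
        "EventDestinationName": destination_name,
        "MatchingEventTypes": ["TEXT_ALL"],  # all SMS (text) events
        "SnsDestination": {"TopicArn": sns_topic_arn},
    }


def create_configuration_set(client, name, sns_topic_arn):
    """Create the configuration set, then attach the SNS event destination.

    client is expected to behave like boto3.client("pinpoint-sms-voice-v2").
    """
    client.create_configuration_set(ConfigurationSetName=name)
    client.create_event_destination(
        **event_destination_request(name, sns_topic_arn)
    )
    return name
```

CloudWatch Logs and Kinesis Data Firehose destinations follow the same pattern but additionally need an IAM role that the service can assume.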

How to create a configuration set for sending SMS

Step 3: Test SMS sending

Send a test message using the SMS simulator. Select an originator to send from, and a destination number to send to. To track the status of your message, add a configuration set to publish SMS events.

In the navigation pane, under Overview, in the Quick start section, choose Test SMS sending. Under the Originator section, select either a phone pool, phone number, or sender ID in your account to send test messages from. Next, under the Destination phone number section, select either a simulator number or active destination number to send test messages to. If your account is in Sandbox, you can only send messages to simulator numbers or verified destination numbers. Once your account is in Production you can send messages to simulator numbers or any active destination number. You can (optionally) select a configuration set to track your SMS events. Next, under the Message body section, enter a sample message and send the test message.

Note – If you are sending from a US simulator number (or using a phone pool that only contains a US simulator number) you can only send messages to US simulator destination numbers. A simulator phone number behaves like any other phone number without sending the SMS message to mobile carriers.
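A test send can be scripted in the same way. This is a sketch under the same assumption (the boto3 pinpoint-sms-voice-v2 client); the pool ID and the destination number are placeholders, so substitute a simulator or verified destination number from your own account.

```python
def send_text_request(origination_identity, destination_number, body,
                      configuration_set_name=None):
    """Build the parameters for the SendTextMessage API call."""
    params = {
        "OriginationIdentity": origination_identity,   # pool, number, or sender ID
        "DestinationPhoneNumber": destination_number,  # E.164 format
        "MessageBody": body,
        "MessageType": "TRANSACTIONAL",
    }
    # The configuration set is optional, but without it no events are published.
    if configuration_set_name:
        params["ConfigurationSetName"] = configuration_set_name
    return params


def send_test_sms(client, *args, **kwargs):
    """Send the message and return the message ID assigned by the service.

    client is expected to behave like boto3.client("pinpoint-sms-voice-v2").
    """
    return client.send_text_message(**send_text_request(*args, **kwargs))["MessageId"]


# Placeholder values, for illustration only.
print(send_text_request("pool-0123456789abcdef", "+12065550100",
                        "Test message", "otp-events"))
```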

SMS simulator in the SMS console

Step 4: Request production access

Finally, if your account is in Sandbox, there are limits on the amount you can spend, and you can only send to verified destination phone numbers. To remove these limits, request that your account be moved from Sandbox to Production by opening a case with AWS Support Center.

Conclusion

After requesting Production access, you’ve completed the recommended steps to set up your account configuration. You have now configured and tested the following resources in your account:

  • Phone pool: A phone pool is a collection of phone numbers and sender IDs that all share the same settings and provide failover if a number becomes unavailable. Phone pools improve number resiliency, remove complexity from sending applications, and provide a logical grouping for managing phone numbers and sender IDs.
    • Originator: As part of the pool setup, you are required to associate at least one originator with the phone pool. An originator refers to either a phone number or sender ID. If you’ve selected a simulator number and would now like to request a new phone number or sender ID, you can do so by following the Request originator flow.
  • Configuration set: A configuration set allows you to organize, track, and configure logging of your SMS events, specifying where to publish them by adding event destinations.

Next steps

To request additional originators such as phone numbers or sender IDs, you can follow the Request Originator flow in the management console. If your originator requires registrations and is supported, you can self-service the phone number or sender ID registration in the management console.

An Overview of Bulk Sender Changes at Yahoo/Gmail

Post Syndicated from Dustin Taylor original https://aws.amazon.com/blogs/messaging-and-targeting/an-overview-of-bulk-sender-changes-at-yahoo-gmail/

In a move to safeguard user inboxes, Gmail and Yahoo Mail announced a new set of requirements for senders effective from February 2024. Let’s delve into the specifics and what Amazon Simple Email Service (Amazon SES) customers need to do to comply with these requirements.

What are the new email sender requirements?

The new requirements include long-standing best practices that all email senders should adhere to in order to achieve good deliverability with mailbox providers. What’s new is that Gmail, Yahoo Mail, and other mailbox providers will require alignment with these best practices for those who send more than 5,000 bulk messages per day or whose mail a significant number of recipients mark as spam.

The requirements can be distilled into 3 categories: 1) adhering more strictly to domain authentication, 2) giving recipients an easy way to unsubscribe from bulk mail, and 3) monitoring spam complaint rates and keeping them under a 0.3% threshold.

* This blog was originally published in November 2023, and updated on January 12, 2024 to clarify timelines, and to provide links to additional resources.

1. Domain authentication

Mailbox providers will require domain-aligned authentication with DKIM and SPF, and they will be enforcing DMARC policies for the domain used in the From header of messages. For example, gmail.com will be publishing a quarantine DMARC policy, which means that unauthorized messages claiming to be from Gmail will be sent to Junk folders.

Read Amazon SES: Email Authentication and Getting Value out of Your DMARC Policy to gain a deeper understanding of SPF and DKIM domain-alignment and maximize the value from your domain’s DMARC policy.

The following steps outline how Amazon SES customers can adhere to the domain authentication requirements:

Adopt domain identities: Amazon SES customers who currently rely primarily on email address identities will need to adopt verified domain identities to achieve better deliverability with mailbox providers. By using a verified domain identity with SES, your messages will have a domain-aligned DKIM signature.

Not sure what domain to use? Read Choosing the Right Domain for Optimal Deliverability with Amazon SES for additional best practice guidance regarding sending authenticated email. 

Configure a Custom MAIL FROM domain: To further align with best practices, SES customers should also configure a custom MAIL FROM domain so that SPF is domain-aligned.

The table below illustrates the three scenarios based on the type of identity you use with Amazon SES:

Scenarios using example.com in the From header | DKIM authenticated identifier | SPF authenticated identifier | DMARC authentication results
[email protected] as a verified email address identity | amazonses.com | email.amazonses.com | Fail – DMARC analysis fails as the sending domain does not have a DKIM signature or SPF record that matches.
example.com as a verified domain identity | example.com | email.amazonses.com | Success – DKIM signature aligns with sending domain which will cause DMARC checks to pass.
example.com as a verified domain identity, and bounce.example.com as a custom MAIL FROM domain | example.com | bounce.example.com | Success – DKIM and SPF are aligned with sending domain.

Figure 1: Three scenarios based on the type of identity used with Amazon SES. Using a verified domain identity and configuring a custom MAIL FROM domain will result in both DKIM and SPF being aligned to the From header domain’s DMARC policy.
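The three scenarios in the table above can be reproduced with a small alignment check. This is a simplified sketch, not how mailbox providers implement DMARC: it assumes relaxed alignment, assumes the DKIM and SPF checks themselves already passed, and approximates the organizational domain as the last two DNS labels (a real implementation would consult the Public Suffix List).

```python
def org_domain(domain):
    """Naive organizational domain: the last two DNS labels.

    Simplifying assumption; wrong for suffixes such as co.uk.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def dmarc_passes(from_domain, dkim_domain, spf_domain):
    """Relaxed-mode DMARC: pass if either authenticated identifier
    shares an organizational domain with the From header domain."""
    target = org_domain(from_domain)
    dkim_aligned = org_domain(dkim_domain) == target
    spf_aligned = org_domain(spf_domain) == target
    return dkim_aligned or spf_aligned


# (scenario, DKIM identifier, SPF identifier), taken from the table.
scenarios = [
    ("verified email address identity", "amazonses.com", "email.amazonses.com"),
    ("verified domain identity", "example.com", "email.amazonses.com"),
    ("domain identity + custom MAIL FROM", "example.com", "bounce.example.com"),
]
for label, dkim, spf in scenarios:
    result = "pass" if dmarc_passes("example.com", dkim, spf) else "fail"
    print(f"{label}: {result}")
```

Only the first scenario fails, because neither identifier belongs to example.com.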

Be strategic with subdomains: Amazon SES customers should consider a strategic approach to the domains and subdomains used in the From header for different email sending use cases. For example, use the marketing.example.com verified domain identity for sending marketing mail, and use the receipts.example.com verified domain identity to send transactional mail.

Why? Marketing messages may have higher spam complaint rates and would need to adhere to the bulk sender requirements, but transactional mail, such as purchase receipts, would not necessarily have spam complaints high enough to be classified as bulk mail.

Publish DMARC policies: Publish a DMARC policy for your domain(s). The domain you use in the From header of messages needs to have a policy by setting the p= tag in the domain’s DMARC policy in DNS. The policy can be set to “p=none” to adhere to the bulk sending requirements and can later be changed to quarantine or reject when you have ensured all email using the domain is authenticated with DKIM or SPF domain-aligned authenticated identifiers.
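A DMARC policy is published as a DNS TXT record at the _dmarc subdomain of the From domain (for example, _dmarc.example.com). As an illustrative sketch, the snippet below parses such a record and checks that it carries a p= tag; the reporting address in the sample record is hypothetical, and DNS lookup is deliberately left out.

```python
VALID_POLICIES = {"none", "quarantine", "reject"}


def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def has_policy(record):
    """True if the record is a DMARC record with a recognized p= tag."""
    tags = parse_dmarc(record)
    return tags.get("v") == "DMARC1" and tags.get("p") in VALID_POLICIES


# Sample record; the rua address is a made-up example.
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"
print(parse_dmarc(record)["p"], has_policy(record))
```

Starting with p=none and collecting aggregate reports is a common way to confirm that all legitimate mail is aligned before moving to quarantine or reject.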

2. Set up an easy unsubscribe for email recipients

Bulk senders are expected to include a mechanism to unsubscribe by adding an easy to find link within the message. The February 2024 mailbox provider rules will require senders to additionally add one-click unsubscribe headers as defined by RFC 2369 and RFC 8058. These headers make it easier for recipients to unsubscribe, which reduces the rate at which recipients will complain by marking messages as spam.

There are many factors that could result in your messages being classified as bulk by any mailbox provider. Volume over 5000 per day is one factor, but the primary factor that mailbox providers use is whether the recipient actually wants to receive the mail.

If you aren’t sure if your mail is considered bulk, monitor your spam complaint rates. If the complaint rates are high or growing, it is a sign that you should offer an easy way for recipients to unsubscribe.

How to adhere to the easy unsubscribe requirement

The following steps outline how Amazon SES customers can adhere to the easy unsubscribe requirement:

Add one-click unsubscribe headers to the messages you send: Amazon SES customers sending bulk or potentially unwanted messages will need to implement an easy way for recipients to unsubscribe, which they can do using the SES subscription management feature.

Mailbox providers are requiring that large senders give recipients the ability to unsubscribe from bulk email in one click using the one-click unsubscribe header. However, it is acceptable for the unsubscribe link in the message body to direct the recipient to a landing page where they confirm their opt-out preferences.

To set up one-click unsubscribe without using the SES subscription management feature, include both of these headers in outgoing messages:

  • List-Unsubscribe-Post: List-Unsubscribe=One-Click
  • List-Unsubscribe: <https://example.com/unsubscribe/example>

When a recipient unsubscribes using one-click, you receive this POST request:

POST /unsubscribe/example HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 26

List-Unsubscribe=One-Click
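As a small illustration of both halves of this exchange, the sketch below uses only the Python standard library to add the two headers to an outgoing message and to recognize the one-click POST body. The addresses and subject are made up for the example; the unsubscribe URL is the one shown above.

```python
from email.message import EmailMessage
from urllib.parse import parse_qs


def add_one_click_headers(msg, unsubscribe_url):
    """Add the RFC 2369 and RFC 8058 unsubscribe headers to a message."""
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    return msg


def is_one_click_post(body):
    """True if a form-encoded POST body is the one-click payload."""
    return parse_qs(body).get("List-Unsubscribe") == ["One-Click"]


msg = EmailMessage()
msg["From"] = "news@marketing.example.com"  # hypothetical sender
msg["To"] = "recipient@example.net"         # hypothetical recipient
msg["Subject"] = "Monthly update"
msg.set_content("Hello!")
add_one_click_headers(msg, "https://example.com/unsubscribe/example")

print(msg["List-Unsubscribe"])
print(is_one_click_post("List-Unsubscribe=One-Click"))
```

In practice the unsubscribe URL usually carries an opaque per-recipient token so the receiving endpoint knows whom to unsubscribe.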

Gmail’s FAQ and Yahoo’s FAQ both clarify that the one-click unsubscribe requirement will not be enforced until June 2024 as long as the bulk sender has a functional unsubscribe link clearly visible in the footer of each message.

Honor unsubscribe requests within 2 days: Verify that your unsubscribe process immediately removes the recipient from receiving similar future messages. Mailbox providers are requiring that bulk senders give recipients the ability to unsubscribe from email in one click, and that the senders process unsubscribe requests within two days.

If you adopt the SES subscription management feature, make sure you integrate the recipient opt-out preferences with the source of your email sending lists. If you implement your own one-click unsubscribe (for example, using Amazon API Gateway and an AWS Lambda function), make sure it is designed to suppress sending to email addresses in your source email lists.
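A self-managed endpoint of that kind could look roughly like the following Lambda handler. This is a sketch under several assumptions that are not from this post: the event uses the API Gateway REST proxy (payload version 1) shape, the recipient is identified by a hypothetical address query parameter, and an in-memory set stands in for your real suppression store.

```python
import base64
from urllib.parse import parse_qs

# Hypothetical stand-in for the store that feeds your sending lists.
SUPPRESSED = set()


def lambda_handler(event, context):
    """Handle a one-click unsubscribe POST forwarded by API Gateway."""
    if event.get("httpMethod") != "POST":
        return {"statusCode": 405, "body": "Method Not Allowed"}

    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    # Only accept the payload that mailbox providers send for one-click.
    if parse_qs(body).get("List-Unsubscribe") != ["One-Click"]:
        return {"statusCode": 400, "body": "Bad Request"}

    address = (event.get("queryStringParameters") or {}).get("address")
    if not address:
        return {"statusCode": 400, "body": "Missing recipient"}

    # Suppress immediately so later sends skip this recipient.
    SUPPRESSED.add(address.lower())
    return {"statusCode": 200, "body": "Unsubscribed"}
```

A production version would more likely use a signed or opaque token instead of a raw email address in the URL, and would write to a durable store.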

Review your email list building practices: Ensure responsible email practices by refraining from purchasing email lists, safeguarding opt-in forms from bot abuse, verifying recipients’ preferences through confirmation messages, and abstaining from automatically enrolling recipients in categories that were not requested.

Having good list opt-in hygiene is the best way to ensure that you don’t have high spam complaint rates before you adhere to the new required best practices. To learn more, read What is a Spam Trap, and Why You Should Care.

3. Monitor spam rates

Mailbox providers will require that all senders keep spam complaint rates below 0.3% to avoid having their email treated as spam by the mailbox provider. The following steps outline how Amazon SES customers can meet the spam complaint rate requirement:

Enroll with Google Postmaster Tools: Amazon SES customers should enroll with Google Postmaster Tools to monitor their spam complaint rates for Gmail recipients.

Gmail recommends spam complaint rates stay below 0.1%. If you send to a mix of Gmail recipients and recipients on other mailbox providers, the spam complaint rates reported by Gmail’s Postmaster Tools are a good indicator of your spam complaint rates at mailbox providers who don’t let you view metrics.

Enable Amazon SES Virtual Deliverability Manager: Enable Virtual Deliverability Manager (VDM) in your Amazon SES account. Customers can use VDM to monitor bounce and complaint rates for many mailbox providers. Amazon SES recommends that customers monitor reputation metrics and stay below a 0.1% complaint rate.

Segregate and secure your sending using configuration sets: In addition to segregating sending use cases by domain, Amazon SES customers should use configuration sets for each sending use case.

Using configuration sets will allow you to monitor your sending activity and implement restrictions with more granularity. You can even pause the sending of a configuration set automatically if spam complaint rates exceed your tolerance threshold.
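The thresholds in this section translate into a simple check. The sketch below assumes the complaint rate is complaints divided by delivered messages over some window, which is a simplification because each mailbox provider calculates its own rate; the function names and the three-level outcome are illustrative choices.

```python
RECOMMENDED_MAX = 0.001  # 0.1%, the rate Gmail and Amazon SES recommend staying below
REQUIRED_MAX = 0.003     # 0.3%, the threshold in the bulk sender requirements


def complaint_rate(complaints, delivered):
    """Complaints as a fraction of delivered messages (0.0 if none delivered)."""
    if delivered <= 0:
        return 0.0
    return complaints / delivered


def assess(complaints, delivered):
    """Return 'ok', 'warn', or 'pause' for one configuration set's traffic."""
    rate = complaint_rate(complaints, delivered)
    if rate >= REQUIRED_MAX:
        return "pause"
    if rate >= RECOMMENDED_MAX:
        return "warn"
    return "ok"


for complaints in (5, 12, 40):
    print(complaints, assess(complaints, 10_000))
```

Running the same check per configuration set keeps a problem in one sending use case from going unnoticed in the account-wide average.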

Conclusion

These changes are planned for February 2024, but be aware that the exact timing and methods used by each mailbox provider may vary. If you experience any deliverability issues with any mailbox provider prior to February, it is in your best interest to adhere to these required best practices as a first step.

We hope that this blog clarifies any areas of confusion on this change and provides you with the information you need to be prepared for February 2024. Happy sending!


Transforming transactions: Streamlining PCI compliance using AWS serverless architecture

Post Syndicated from Abdul Javid original https://aws.amazon.com/blogs/security/transforming-transactions-streamlining-pci-compliance-using-aws-serverless-architecture/

Compliance with the Payment Card Industry Data Security Standard (PCI DSS) is critical for organizations that handle cardholder data. Achieving and maintaining PCI DSS compliance can be a complex and challenging endeavor. Serverless technology has transformed application development, offering benefits in agility, performance, cost, and security.

In this blog post, we examine the benefits of using AWS serverless services and highlight how you can use them to help align with your PCI DSS compliance responsibilities. You can remove additional undifferentiated compliance heavy lifting by building modern applications with abstracted AWS services. We review an example payment application and workflow that uses AWS serverless services and showcases the potential reduction in effort and responsibility that a serverless architecture could provide to help align with your compliance requirements. We present the review through the lens of a merchant that has an ecommerce website and include key topics such as access control, data encryption, monitoring, and auditing—all within the context of the example payment application. We don’t discuss additional service provider requirements from the PCI DSS in this post.

This example will help you navigate the intricate landscape of PCI DSS compliance. This can help you focus on building robust and secure payment solutions without getting lost in the complexities of compliance. This can also help reduce your compliance burden and empower you to develop your own secure, scalable applications. Join us in this journey as we explore how AWS serverless services can help you meet your PCI DSS compliance objectives.

Disclaimer

This document is provided for the purposes of information only; it is not legal advice, and should not be relied on as legal advice. Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

AWS encourages its customers to obtain appropriate advice on their implementation of privacy and data protection environments, and more generally, applicable laws and other obligations relevant to their business.

PCI DSS v4.0 and serverless

In April 2022, the Payment Card Industry Security Standards Council (PCI SSC) updated the security payment standard to “address emerging threats and technologies and enable innovative methods to combat new threats.” Two of the high-level goals of these updates are enhancing validation methods and procedures and promoting security as a continuous process. Adopting serverless architectures can help meet some of the new and updated requirements in version 4.0, such as enhanced software and encryption inventories. If a customer has access to change a configuration, it’s the customer’s responsibility to verify that the configuration meets PCI DSS requirements. There are more than 20 PCI DSS requirements applicable to Amazon Elastic Compute Cloud (Amazon EC2). To fulfill these requirements, customer organizations must implement controls such as file integrity monitoring, operating system level access management, system logging, and asset inventories. Using AWS abstracted services in this scenario can remove undifferentiated heavy lifting from your environment. With abstracted AWS services, because there is no operating system to manage, AWS becomes responsible for maintaining consistent time settings for an abstracted service to meet Requirement 10.6. This will also shift your compliance focus more towards your application code and data.

This makes more of your PCI DSS responsibility addressable through the AWS PCI DSS Attestation of Compliance (AOC) and Responsibility Summary. This attestation package is available to AWS customers through AWS Artifact.

Reduction in compliance burden

You can use three common architectural patterns within AWS to design payment applications and meet PCI DSS requirements: infrastructure, containerized, and abstracted. We look into EC2 instance-based architecture (infrastructure or containerized patterns) and modernized architectures using serverless services (abstracted patterns). While both approaches can help align with PCI DSS requirements, there are notable differences in how they handle certain elements. EC2 instances provide more control and flexibility over the underlying infrastructure and operating system, assisting you in customizing security measures based on your organization’s operational and security requirements. However, this also means that you bear more responsibility for configuring and maintaining security controls applicable to the operating systems, such as network security controls, patching, file integrity monitoring, and vulnerability scanning.

On the other hand, serverless architectures similar to the example in this post can reduce much of the infrastructure management requirements. This can relieve you, the application owner or cloud service consumer, of the burden of configuring and securing those underlying virtual servers. This can streamline meeting certain PCI requirements, such as file integrity monitoring, patch management, and vulnerability management, because AWS handles these responsibilities.

Using serverless architecture on AWS can significantly reduce the PCI compliance burden. Approximately 43 percent of the overall PCI compliance requirements, encompassing both technical and non-technical tests, are addressed by the AWS PCI DSS Attestation of Compliance.

Customer responsible: 52% | AWS responsible: 43% | N/A: 5%

The following table provides an analysis of each PCI DSS requirement against the serverless architecture in Figure 1, which shows a sample payment application workflow. You must evaluate your own use and secure configuration of AWS workloads and architectures for a successful audit.

PCI DSS 4.0 requirements | Test cases | Customer responsible | AWS responsible | N/A
Requirement 1: Install and maintain network security controls | 35 | 13 | 22 | 0
Requirement 2: Apply secure configurations to all system components | 27 | 16 | 11 | 0
Requirement 3: Protect stored account data | 55 | 24 | 29 | 2
Requirement 4: Protect cardholder data with strong cryptography during transmission over open, public networks | 12 | 7 | 5 | 0
Requirement 5: Protect all systems and networks from malicious software | 25 | 4 | 21 | 0
Requirement 6: Develop and maintain secure systems and software | 35 | 31 | 4 | 0
Requirement 7: Restrict access to system components and cardholder data by business need-to-know | 22 | 19 | 3 | 0
Requirement 8: Identify users and authenticate access to system components | 52 | 43 | 6 | 3
Requirement 9: Restrict physical access to cardholder data | 56 | 3 | 53 | 0
Requirement 10: Log and monitor all access to system components and cardholder data | 38 | 17 | 19 | 2
Requirement 11: Test security of systems and networks regularly | 51 | 22 | 23 | 6
Requirement 12: Support information security with organizational policies | 56 | 44 | 2 | 10
Total | 464 | 243 | 198 | 23
Percentage | | 52% | 43% | 5%

Note: The preceding table is based on the example reference architecture that follows. The actual extent of PCI DSS requirements reduction can vary significantly depending on your cardholder data environment (CDE) scope, implementation, and configurations.
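The totals and percentages in the table above can be reproduced from the per-requirement counts. The short script below is only an arithmetic check of the published numbers; the data structure is an illustrative choice.

```python
# Requirement number -> (test cases, customer responsible, AWS responsible, N/A)
ROWS = {
    1: (35, 13, 22, 0),
    2: (27, 16, 11, 0),
    3: (55, 24, 29, 2),
    4: (12, 7, 5, 0),
    5: (25, 4, 21, 0),
    6: (35, 31, 4, 0),
    7: (22, 19, 3, 0),
    8: (52, 43, 6, 3),
    9: (56, 3, 53, 0),
    10: (38, 17, 19, 2),
    11: (51, 22, 23, 6),
    12: (56, 44, 2, 10),
}

# Each row's test cases should equal the sum of its three responsibility columns.
for requirement, (tests, customer, aws, not_applicable) in ROWS.items():
    assert tests == customer + aws + not_applicable, requirement

# Column totals: test cases, customer, AWS, N/A.
total, customer, aws, not_applicable = (sum(col) for col in zip(*ROWS.values()))
shares = [round(100 * n / total) for n in (customer, aws, not_applicable)]

print(total, customer, aws, not_applicable)  # 464 243 198 23
print(shares)                                # [52, 43, 5]
```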

Sample payment application and workflow

This example serverless payment application and workflow in Figure 1 consists of several interconnected steps, each using different AWS services. The steps are listed in the following text and include brief descriptions. They cover two use cases within this example application — consumers making a payment and a business analyst generating a report.

The example outlines a basic serverless payment application workflow using AWS serverless services. However, it’s important to note that the actual implementation and behavior of the workflow may vary based on specific configurations, dependencies, and external factors. The example serves as a general guide and may require adjustments to suit the unique requirements of your application or infrastructure.

Several factors, including but not limited to, AWS service configurations, network settings, security policies, and third-party integrations, can influence the behavior of the system. Before deploying a similar solution in a production environment, we recommend thoroughly reviewing and adapting the example to align with your specific use case and requirements.

Keep in mind that AWS services and features may evolve over time, and new updates or changes may impact the behavior of the components described in this example. Regularly consult the AWS documentation and ensure that your configurations adhere to best practices and compliance standards.

This example is intended to provide a starting point and should be considered as a reference rather than an exhaustive solution. Always conduct thorough testing and validation in your specific environment to ensure the desired functionality and security.

Figure 1: Serverless payment architecture and workflow


  • Use case 1: Consumers make a payment
    1. Consumers visit the e-commerce payment page to make a payment.
    2. The request is routed to the payment application’s domain using Amazon Route 53, which acts as a DNS service.
    3. The payment page is protected by AWS WAF to inspect the initial incoming request for any malicious patterns, web-based attacks (such as cross-site scripting (XSS) attacks), and unwanted bots.
    4. An HTTPS GET request (over TLS) is sent to the public target IP. Amazon CloudFront, a content delivery network (CDN), acts as a front-end proxy and caches and fetches static content from an Amazon Simple Storage Service (Amazon S3) bucket.
    5. AWS WAF inspects the incoming request for any malicious patterns. If the request is blocked, no static content is returned from the S3 bucket.
    6. User authentication and authorization are handled by Amazon Cognito, providing a secure login and scalable customer identity and access management system (CIAM).
    7. AWS WAF processes the request to protect against web exploits, then Amazon API Gateway forwards it to the payment application API endpoint.
    8. API Gateway launches AWS Lambda functions to handle payment requests. AWS Step Functions state machine oversees the entire process, directing the running of multiple Lambda functions to communicate with the payment processor, initiate the payment transaction, and process the response.
    9. The cardholder data (CHD) is temporarily cached in Amazon DynamoDB for troubleshooting and retry attempts in the event of transaction failures.
    10. A Lambda function validates the transaction details and performs necessary checks against the data stored in DynamoDB. A web notification is sent to the consumer for any invalid data.
    11. A Lambda function calculates the transaction fees.
    12. A Lambda function authenticates the transaction and initiates the payment transaction with the third-party payment provider.
    13. A Lambda function is initiated when a payment transaction with the third-party payment provider is completed. It receives the transaction status from the provider and performs multiple actions.
    14. Consumers receive real-time notifications through a web browser and email. The notifications, such as order confirmations or payment receipts, are initiated by a step function and can be integrated with external payment processors through an Amazon Simple Notification Service (Amazon SNS) or Amazon Simple Email Service (Amazon SES) webhook.
    15. A separate Lambda function clears the DynamoDB cache.
    16. The Lambda function makes entries into the Amazon Simple Queue Service (Amazon SQS) dead-letter queue for failed transactions to retry at a later time.
  • Use case 2: An admin or analyst generates the report for non-PCI data
    1. An admin accesses the web-based reporting dashboard using their browser to generate a report.
    2. The request is routed to AWS WAF to verify the source that initiated the request.
    3. An HTTPS GET request (over TLS) is sent to the public target IP. CloudFront fetches static content from an S3 bucket.
    4. AWS WAF inspects incoming requests for any malicious patterns. If a request is blocked, no static content is returned from the S3 bucket. Validated traffic is sent to Amazon S3 to retrieve the reporting page.
    5. The backend requests of the reporting page pass through AWS WAF again to provide protection against common web exploits before being forwarded to the reporting API endpoint through API Gateway.
    6. API Gateway launches a Lambda function for report generation. The Lambda function retrieves data from DynamoDB storage for the reporting mechanism.
    7. The AWS Security Token Service (AWS STS) issues temporary credentials to the Lambda service in the non-PCI serverless account, allowing it to launch the Lambda function in the PCI serverless account. The Lambda function retrieves non-PCI data and writes it into DynamoDB.
    8. The Lambda function fetches the non-PCI data based on the report criteria from the DynamoDB table from the same account.
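To make the control flow of steps 9 through 16 in use case 1 easier to follow, here is a plain-Python sketch of the orchestration. It is not AWS code: a dict stands in for the DynamoDB cache, a deque stands in for the SQS dead-letter queue, and the field names, fee rate, and the charge callable (representing the third-party payment provider) are all invented for illustration.

```python
from collections import deque

cache = {}             # stands in for the temporary DynamoDB cache (step 9)
dead_letter = deque()  # stands in for the SQS dead-letter queue (step 16)


def validate(txn):
    """Step 10: basic checks on the transaction details."""
    return txn.get("amount", 0) > 0 and bool(txn.get("card_token"))


def fee(amount, rate=0.029):
    """Step 11: transaction fee; the rate is a made-up example."""
    return round(amount * rate, 2)


def process_payment(txn, charge):
    """Run one transaction through the workflow and return its outcome."""
    cache[txn["id"]] = txn                       # step 9: cache for retries
    try:
        if not validate(txn):                    # step 10
            return "invalid"
        total = txn["amount"] + fee(txn["amount"])
        if charge(total):                        # steps 12-13: call the provider
            return "paid"                        # step 14 would notify here
        dead_letter.append(txn)                  # step 16: queue for retry
        return "queued_for_retry"
    finally:
        cache.pop(txn["id"], None)               # step 15: clear the cache


order = {"id": "t1", "amount": 100.0, "card_token": "tok_example"}
print(process_payment(order, charge=lambda total: True))
```

In the architecture described above, each function would be a separate Lambda function and the sequencing in process_payment would be expressed as a Step Functions state machine.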

Additional AWS security and governance services that would be implemented throughout the architecture are shown in Figure 1, Label-25. For example, Amazon CloudWatch monitors and alerts on all the Lambda functions within the environment.

Label-26 demonstrates frameworks that can be used to build the serverless applications.

Scoping and requirements

Now that we’ve established the reference architecture and workflow, let’s delve into how it aligns with PCI DSS scope and requirements.

PCI scoping

Serverless services are inherently segmented by AWS, but they can be used within the context of an AWS account hierarchy to provide various levels of isolation as described in the reference architecture example.

Segregating PCI data and non-PCI data into separate AWS accounts can help in de-scoping non-PCI environments and reducing the complexity and audit requirements for components that don’t handle cardholder data.

PCI serverless production account

  • This AWS account is dedicated to handling PCI data and applications that directly process, transmit, or store cardholder data.
  • Services such as Amazon Cognito, DynamoDB, API Gateway, CloudFront, Amazon SNS, Amazon SES, Amazon SQS, and Step Functions are provisioned in this account to support the PCI data workflow.
  • Security controls, logging, monitoring, and access controls in this account are specifically designed to meet PCI DSS requirements.

Non-PCI serverless production account

  • This separate AWS account is used to host applications that don’t handle PCI data.
  • Since this account doesn’t handle cardholder data, the scope of PCI DSS compliance is reduced, simplifying the compliance process.

Note: You can use AWS Organizations to centrally manage multiple AWS accounts.

AWS IAM Identity Center (successor to AWS Single Sign-On) is used to manage user access to each account and is integrated with your existing identity provider. This helps to ensure you’re meeting PCI requirements on identity and on access control for cardholder data and the environment.

Now, let’s look at the PCI DSS requirements that this architectural pattern can help address.

Requirement 1: Install and maintain network security controls

  • Network security controls are limited to AWS Identity and Access Management (IAM) and application permissions because there is no customer controlled or defined network. VPC-centric requirements aren’t applicable because there is no VPC. The configuration settings for serverless services can be covered under Requirement 6 for secure configuration standards. This supports compliance with Requirements 1.2 and 1.3.

Requirement 2: Apply secure configurations to all system components

  • AWS services are single function by default and exist with only the necessary functionality enabled for the functioning of that service. This supports compliance with much of Requirement 2.2.
  • Access to AWS services is considered non-console and only accessible through HTTPS through the service API. This supports compliance with Requirement 2.2.7.
  • The wireless requirements under Requirement 2.3 are not applicable, because wireless environments don’t exist in AWS environments.

Requirement 3: Protect stored account data

  • AWS is responsible for destruction of account data configured for deletion based on DynamoDB Time to Live (TTL) values. This supports compliance with Requirement 3.2.
  • DynamoDB and Amazon S3 offer secure storage of account data, encryption by default in transit and at rest, and integration with AWS Key Management Service (AWS KMS). This supports compliance with Requirements 3.5 and 4.2.
  • AWS is responsible for the generation, distribution, storage, rotation, destruction, and overall protection of encryption keys within AWS KMS. This supports compliance with Requirements 3.6 and 3.7.
  • Because manual cleartext cryptographic keys aren’t available in this solution, Requirement 3.7.6 is not applicable.

Requirement 4: Protect cardholder data with strong cryptography during transmission over open, public networks

  • AWS Certificate Manager (ACM) integrates with API Gateway and enables the use of trusted certificates and HTTPS (TLS) for secure communication between clients and the API. This supports compliance with Requirement 4.2.
  • Requirement 4.2.1.2 is not applicable because there are no wireless technologies in use in this solution. Customers are responsible for ensuring strong cryptography exists for authentication and transmission over other wireless networks they manage outside of AWS.
  • Requirement 4.2.2 is not applicable because no end-user technologies exist in this solution. Customers are responsible for ensuring the use of strong cryptography if primary account numbers (PAN) are sent through end-user messaging technologies in other environments.

Requirement 5: Protect all systems and networks from malicious software

  • Because there are no customer-managed compute resources in this example payment environment, Requirements 5.2 and 5.3 are the responsibility of AWS.

Requirement 6: Develop and maintain secure systems and software

  • Amazon Inspector now supports Lambda functions, adding continual, automated vulnerability assessments for serverless compute. This supports compliance with Requirement 6.2.
  • Amazon Inspector helps identify vulnerabilities and security weaknesses in the payment application’s code, dependencies, and configuration. This supports compliance with Requirement 6.3.
  • AWS WAF is designed to protect applications from common attacks, such as SQL injections, cross-site scripting, and other web exploits. AWS WAF can filter and block malicious traffic before it reaches the application. This supports compliance with Requirement 6.4.2.

Requirement 7: Restrict access to system components and cardholder data by business need to know

  • IAM and Amazon Cognito allow for fine-grained role- and job-based permissions and access control. Customers can use these capabilities to configure access following the principles of least privilege and need-to-know. IAM and Cognito support the use of strong identification, authentication, authorization, and multi-factor authentication (MFA). This supports compliance with much of Requirement 7.

Requirement 8: Identify users and authenticate access to system components

  • IAM and Amazon Cognito also support compliance with much of Requirement 8.
  • Some of the controls in this requirement are usually met by the identity provider for internal access to the cardholder data environment (CDE).

Requirement 9: Restrict physical access to cardholder data

  • AWS is responsible for the destruction of data in DynamoDB based on the customer configuration of content TTL values for Requirement 9.4.7. Customers are responsible for ensuring their database instance is configured for appropriate removal of data by enabling TTL on DDB attributes.
  • Requirement 9 is otherwise not applicable for this serverless example environment because there are no physical media, electronic media not already addressed under Requirement 3.2, or hard-copy materials with cardholder data. AWS is responsible for the physical infrastructure under the Shared Responsibility Model.
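
As a minimal sketch of the customer-side TTL configuration mentioned above: DynamoDB TTL reads a Number attribute that holds a Unix epoch timestamp in seconds. The attribute name `expires_at`, the 90-day retention period, and the item fields below are illustrative assumptions, not values from the reference architecture.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical attribute name; it must match the attribute on which
# TTL is enabled for the table.
TTL_ATTRIBUTE = "expires_at"


def with_ttl(item: dict, created_at: datetime, retention_days: int) -> dict:
    """Return a copy of item with an epoch-seconds expiry attribute added."""
    expiry = created_at + timedelta(days=retention_days)
    stamped = dict(item)
    stamped[TTL_ATTRIBUTE] = int(expiry.timestamp())
    return stamped


# Example item (hypothetical fields) created at a fixed UTC time.
created = datetime(2024, 1, 1, tzinfo=timezone.utc)
original = {"transaction_id": "txn-0001"}
record = with_ttl(original, created, retention_days=90)
print(record)
```

The retention period itself should come from your data retention policy; the sketch only shows how the expiry value is derived and stored with the item.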

Requirement 10: Log and monitor all access to system components and cardholder data

  • AWS CloudTrail provides detailed logs of API activity for auditing and monitoring purposes. This supports compliance with Requirement 10.2 and contains all of the events and data elements listed.
  • CloudWatch can be used for monitoring and alerting on system events and performance metrics. This supports compliance with Requirement 10.4.
  • AWS Security Hub provides a comprehensive view of security alerts and compliance status, consolidating findings from various security services, which helps in ongoing security monitoring and testing. Customers must enable the PCI DSS security standard, which supports compliance with Requirement 10.4.2.
  • AWS is responsible for maintaining accurate system time for AWS services. In this example, there are no compute resources for which customers can configure time. Requirement 10.6 is addressable through the AWS Attestation of Compliance and Responsibility Summary available in AWS Artifact.

Requirement 11: Regularly test security systems and processes

  • Testing for rogue wireless activity within the AWS-based CDE is the responsibility of AWS. AWS is responsible for the management of the physical infrastructure under Requirement 11.2. Customers are still responsible for wireless testing for their environments outside of AWS, such as where administrative workstations exist.
  • AWS is responsible for internal vulnerability testing of AWS services, and supports compliance with Requirement 11.3.1.
  • Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized access. This supports the IDS requirements under Requirement 11.5.1, and covers the entire AWS-based CDE.
  • AWS Config allows customers to catalog, monitor and manage configuration changes for their AWS resources. This supports compliance with Requirement 11.5.2.
  • Customers can use AWS Config to monitor the configuration of the S3 bucket hosting the static website. This supports compliance with Requirement 11.6.1.

Requirement 12: Support information security with organizational policies and programs

  • Customers can download the AWS AOC and Responsibility Summary package from Artifact to support Requirement 12.8.5 and the identification of which PCI DSS requirements are managed by the third-party service provider (TPSP) and which by the customer.

Conclusion

Using AWS serverless services when developing your payment application can significantly help reduce the number of PCI DSS requirements you need to meet by yourself. By offloading infrastructure management to AWS and using serverless services such as Lambda, API Gateway, DynamoDB, Amazon S3, and others, you can benefit from built-in security features and help align with your PCI DSS compliance requirements.

Contact us to help design an architecture that works for your organization. AWS Security Assurance Services is a Payment Card Industry-Qualified Security Assessor company (PCI-QSAC) and HITRUST External Assessor firm. We are a team of industry-certified assessors who help you to achieve, maintain, and automate compliance in the cloud by tying together applicable audit standards to AWS service-specific features and functionality. We help you build on frameworks such as PCI DSS, HITRUST CSF, NIST, SOC 2, HIPAA, ISO 27001, GDPR, and CCPA.

More information on how to build applications using AWS serverless technologies can be found at Serverless on AWS.

Want more AWS Security news? Follow us on Twitter.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Serverless re:Post, Security, Identity, & Compliance re:Post or contact AWS Support.

Abdul Javid

Abdul is a Senior Security Assurance Consultant and PCI DSS Qualified Security Assessor with AWS Security Assurance Services, and has more than 25 years of IT governance, operations, security, risk, and compliance experience. Abdul leverages his experience and knowledge to provide AWS customers with guidance on their compliance journey. Abdul earned an M.S. in Computer Science from IIT, Chicago and holds various industry-recognized certifications in security and program and risk management from prominent organizations like AWS, HITRUST, ISACA, PMI, PCI DSS, and ISC2.

Ted Tanner

Ted is a Principal Assurance Consultant and PCI DSS Qualified Security Assessor with AWS Security Assurance Services, and has more than 25 years of IT and security experience. He uses this experience to provide AWS customers with guidance on compliance and security, and on building and optimizing their cloud compliance programs. He is co-author of the Payment Card Industry Data Security Standard (PCI DSS) v3.2.1 on AWS Compliance Guide and the soon-to-be-released v4.0 edition.

Tristan Watty

Dr. Watty is a Senior Security Consultant within the Professional Services team of Amazon Web Services based in Queens, New York. He is a passionate Tech Enthusiast, Influencer, and Amazonian with 15+ years of professional and educational experience with a specialization in Security, Risk, and Compliance. His zeal lies in empowering customers to develop and put into action secure mechanisms that steer them towards achieving their security goals. Dr. Watty also created and hosts an AWS Security Show named “Security SideQuest!” that airs on the AWS Twitch Channel.

Padmakar Bhosale

Padmakar is a Sr. Technical Account Manager with over 25 years of experience in the Financial, Banking, and Cloud Services. He provides AWS customers with guidance and advice on Payment Services, Core Banking Ecosystem, Credit Union Banking Technologies, Resiliency on AWS Cloud, AWS Accounts & Network levels PCI Segmentations, and Optimization of the Customer’s Cloud Journey experience on AWS Cloud.

How to prevent SMS Pumping when using Amazon Pinpoint or SNS

Post Syndicated from Akshada Umesh Lalaye original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-prevent-sms-pumping-when-using-amazon-pinpoint-or-sns/

SMS fraud is, unfortunately, a common issue that all senders of SMS encounter as they adopt SMS as a communication channel. This post defines the most common types of fraud and provides concrete guidance on how to mitigate or eliminate each of them.

Introduction to SMS Pumping:

SMS Pumping, also known as an SMS Flood attack, or Artificially Inflated Traffic (AIT), occurs when fraudsters exploit a phone number input field to acquire a one-time passcode (OTP), an app download link, or any other content via SMS. In cases where these input forms lack sufficient security measures, attackers can artificially increase the volume of SMS traffic, thereby exploiting vulnerabilities in your application. The perpetrators dispatch SMS messages to a selection of numbers under the jurisdiction of a particular mobile network operator (MNO), ultimately receiving a portion of the resulting revenue. It is essential to understand how to detect these attacks and prevent them.

Common Evidence of SMS Pumping:

  • Dramatic Decrease in Conversion Rates: A common SMS use case is for identity verification through the use of One Time Passwords (OTP) but this could also be seen in other types of use cases where a clear and consistent conversion rate is seen. A drop in a normally stable conversion rate may be caused by an increase in volume that will never convert and can indicate an issue that requires investigation. Setting up an alert for anomalies in conversion rates is always a good practice.
  • SMS Requests or Deliveries from Unknown Countries: If your application normally sends SMS to a defined set of countries and you begin to receive requests for a different country, then this should be investigated.
  • Spike in Outgoing Messages: A significant and sudden increase in outgoing messages could indicate an issue that requires investigation.
  • Spike in Messages Sent to a Block of Adjacent Numbers: Fraudsters often deploy bots and programmatically loop through numbers in a sequence. You will probably notice frequent increases in messages to a group of nearby numbers, for example, +11111111110 and +11111111111.
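
To make the last signal concrete, here is a small sketch that groups destination numbers into runs of consecutive values. The function name `find_adjacent_runs` and the threshold of three numbers are illustrative assumptions; a sensible threshold depends on your normal traffic.

```python
def find_adjacent_runs(numbers, min_run=3):
    """Return runs of consecutive E.164 numbers with at least min_run members."""
    # E.164 numbers never start with 0 after the '+', so int() loses nothing.
    values = sorted({int(n.lstrip("+")) for n in numbers})
    runs, current = [], []
    for value in values:
        if current and value == current[-1] + 1:
            current.append(value)
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = [value]
    if len(current) >= min_run:
        runs.append(current)
    return [[f"+{v}" for v in run] for run in runs]


# Example destinations taken from recent send logs (sample values).
destinations = [
    "+11111111110",
    "+11111111111",
    "+11111111112",
    "+12065550142",
    "+11111111113",
]
suspicious = find_adjacent_runs(destinations)
print(suspicious)
```

A run on its own is not proof of fraud, but combined with a falling conversion rate it is a strong reason to investigate.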

How to Identify and Prevent SMS Pumping Attacks:

Now that we understand the common signs of SMS pumping, let’s discuss how to use AWS services to identify and confirm the fraud, and how to put measures in place to prevent it.

Identify:

Delivery Statistics (UTC)

If you are using Amazon Pinpoint, you can use the transactional messaging charts in the analytics section to understand your SMS patterns.

Transactional Messaging Charts

  • Spikes in Messages Sent to a Block of Adjacent Numbers: If you are using SNS you can use CloudWatch logs to analyse the destination numbers.

You can run a CloudWatch Logs Insights query on the following log groups:

sns/<region>/<Accountnumber>/DirectPublishToPhoneNumber
sns/<region>/<Accountnumber>/DirectPublishToPhoneNumber/failure

The following query returns the log entries that have a destination number like +11111111111:
fields @timestamp, @message, @logStream, @log
| filter delivery.destination like '+11111111111'
| limit 20

If you are using Amazon Pinpoint, you can enable event stream to analyse destination numbers.

If you have deployed the Digital User Engagement Events Database solution, you can use the sample Amazon Athena queries below, which display entries that have a destination number like +11111111111:

SELECT * FROM "due_eventdb"."sms_success" where destination_phone_number like '%11111111111%'
SELECT * FROM "due_eventdb"."sms_failure" where destination_phone_number like '%11111111111%'

How to Prevent SMS Pumping: 

      • Example: If you expect only users from India to sign up in your application, you can include rules such as “\+91[0-9]{10}”, which allows only Indian numbers as input.
      • Note: SNS and Pinpoint APIs are not natively integrated with WAF. However, you can connect your application to Amazon API Gateway, which you can integrate with WAF.
      • How to Create a Regex Pattern Set with WAF – The below Regex Pattern set will allow sending messages to Australia (+61) and India (+91) destination phone numbers
          1. Sign in to the AWS Management Console and navigate to AWS WAF console
          2. In the navigation pane, choose Regex pattern sets and then Create regex pattern set.
          3. Enter a name and description for the regex pattern set. You’ll use these to identify it when you want to use the set. For example, Allowed_SMS_Countries
          4. Select the Region where you want to store the regex pattern set
          5. In the Regular expressions text box, enter one regex pattern per line
          6. Review the settings for the regex pattern set, and choose Create regex pattern set
Regex pattern set details

      • Create a Web ACL with above Regex Pattern Set
          1. Sign in to the AWS Management Console and navigate to AWS WAF console
          2. In the navigation pane, choose Web ACLs and then Create web ACL
          3. Enter a Name, Description and CloudWatch metric name for Web ACL details
          4. Select Resource type as Regional resources
          5. Click Next

            Web ACL details

          6. Click on Add Rules > Add my own rules and rule groups
          7. Enter Rule name and select Regular rule

            Web ACL Rule Builder

          8. Select Inspect > Body, Content type as JSON, JSON match scope as Values, Content to inspect as Full JSON content
          9. Select Match type as Matches pattern from regex pattern set and select the Regex pattern set as “Allowed_SMS_Countries” created above
          10. Select Action as Allow
          11. Click Add Rule  

            Web ACL Rule builder statement

          12. Select Block for Default web ACL action for requests that don’t match any rules

            Web ACL Rules

          13. Set rule priority and Click Next

            Web ACL Rule priority

          14. Configure metrics and Click Next

            Web ACL metrics

          15. Review and Click Create web ACL

For more information, please refer to WebACL
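
The same country allow-list can also be enforced in application code before a request ever reaches the SMS API. The sketch below is one possible way to do that, not part of the WAF configuration. The India pattern is the one from the example above; the Australia pattern (`\+61` followed by nine digits) and the function name `is_allowed_destination` are assumptions made for illustration.

```python
import re

# One compiled pattern per allowed country.
ALLOWED_PATTERNS = [
    re.compile(r"\+91[0-9]{10}"),  # India, pattern from the example above
    re.compile(r"\+61[0-9]{9}"),   # Australia, assumed nine-digit national number
]


def is_allowed_destination(phone_number: str) -> bool:
    """Return True only if the whole string matches an allowed pattern."""
    return any(p.fullmatch(phone_number) for p in ALLOWED_PATTERNS)


for candidate in ["+919876543210", "+61412345678", "+12065550142"]:
    print(candidate, is_allowed_destination(candidate))
```

The sketch uses `fullmatch` so that a longer string that merely contains an allowed prefix is rejected.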

  • Rate Limit Requests
    • AWS WAF provides an option to rate limit per originating IP. You can define the maximum number of requests that satisfy your criteria and are allowed in a five-minute period, before the rule action setting limits further requests.
  • CAPTCHA
    • Implement CAPTCHA in your application request process to protect your application against common bot traffic
  • Turn off “Shared Routes”
  • Exponential Delay Verification Retries
    • Implement an exponentially increasing delay between multiple messages to the same phone number. This doesn’t completely eliminate the attack, but it helps slow it down.
  • Set CloudWatch Alarm
  • Validate Phone Numbers – You can use the Pinpoint Phone number validate API to check the values for CountryCodeIso2, CountryCodeNumeric, and PhoneType prior to sending SMS and then only send SMS to countries that match your criteria
    Sample API Response:

{
"NumberValidateResponse": {
"Carrier": "ExampleCorp Mobile",
"City": "Seattle",
"CleansedPhoneNumberE164": "+12065550142",
"CleansedPhoneNumberNational": "2065550142",
"Country": "United States",
"CountryCodeIso2": "US",
"CountryCodeNumeric": "1",
"OriginalPhoneNumber": "+12065550142",
"PhoneType": "MOBILE",
"PhoneTypeCode": 0,
"Timezone": "America/Los_Angeles",
"ZipCode": "98101"
}
}
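
As a hedged sketch of how the response above might gate a send, the function below checks CountryCodeIso2 and PhoneType against allow-lists. The allow-list contents and the name `should_send_sms` are assumptions for illustration; the response dictionary is expected to have the same shape as the sample.

```python
# Illustrative allow-lists; adjust them to the countries and phone types you serve.
ALLOWED_COUNTRIES = {"US"}
ALLOWED_PHONE_TYPES = {"MOBILE"}


def should_send_sms(validate_response: dict) -> bool:
    """Decide whether to send, based on a phone number validate response."""
    details = validate_response.get("NumberValidateResponse", {})
    return (
        details.get("CountryCodeIso2") in ALLOWED_COUNTRIES
        and details.get("PhoneType") in ALLOWED_PHONE_TYPES
    )


# Trimmed copy of the sample response shown above.
sample_response = {
    "NumberValidateResponse": {
        "CleansedPhoneNumberE164": "+12065550142",
        "CountryCodeIso2": "US",
        "CountryCodeNumeric": "1",
        "PhoneType": "MOBILE",
    }
}
print(should_send_sms(sample_response))
```

Treating a missing or unexpected field as a reason not to send keeps the check conservative.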

Conclusion:

This post covers the basics of SMS pumping attacks, the different mechanisms that can be used to detect them, and some potential ways to solve for or mitigate them using services and features like Pinpoint Validate API and WAF.

Further Reading:
Review the documentation of WAF with API Gateway here
Review the documentation of Phone number validate here
Review the Web Access Control lists here

 

Resources:
Amazon Pinpoint: https://aws.amazon.com/pinpoint/
Amazon API Gateway: https://aws.amazon.com/api-gateway/
Amazon Athena: https://aws.amazon.com/athena/

Automate marketing campaigns with real-time customer data using Amazon Pinpoint

Post Syndicated from Rushabh Lokhande original https://aws.amazon.com/blogs/messaging-and-targeting/automate-marketing-campaigns-with-real-time-customer-data-using-amazon-pinpoint/

Amazon Pinpoint offers marketers and developers one customizable tool to deliver customer communications across channels, segments, and campaigns at scale. Amazon Pinpoint makes it easy to run targeted campaigns and drive customer communications across different channels: email, SMS, push notifications, in-app messaging, or custom channels. Amazon Pinpoint campaigns enable you to define which users to target, determine which messages to send, schedule the best time to deliver the messages, and then track the results of your campaign.

In many cases, the customer data resides in a third-party system such as a CRM, customer data platform, point-of-sale system, database, or data warehouse. This customer data represents a valuable asset for your organization. Your marketing team needs to leverage each piece of this data to elevate the customer experience.

In this blog post, we demonstrate how you can use users’ clickstream data stored in a database to build user segments and launch campaigns using Amazon Pinpoint. We also showcase the full architecture of the data pipeline, including other AWS services such as Amazon RDS, AWS Database Migration Service (AWS DMS), Amazon Kinesis, and AWS Lambda.

Let us walk through our case study with an example: a customer has digital touch points, such as a website and a mobile app, that collect users’ clickstream and behavioral data and store it in a MySQL database. The marketing team wants to use the collected data to deliver a personalized experience with Amazon Pinpoint.

You can find below the detail of a specific use case covered by the proposed solution:

  • All the clickstream and customer data are stored in MySQL DB
  • Your marketing team wants to create personalized Amazon Pinpoint campaigns based on user status and experience. For example:
    • Activate campaigns for customers who are interested in a specific offering, based on their interest
    • Communicate in the user’s preferred language

Please note that this use case only showcases the capabilities of the proposed solution. The solution is not limited to it: you can use any collected customer dimension or attribute to create a campaign for a specific marketing use case.

In this post, we provide a guided journey on how marketers can collect, segment, and activate audience segments in real-time to increase their agility in managing campaigns.

Overview of solution

The use case covered in this post focuses on demonstrating the flexibility offered by Amazon Pinpoint in both the inbound (ingestion) and outbound (activation) streams of customer data. For the inbound stream, Amazon Pinpoint gives you a variety of ways to import your customer data, including:

  1. CSV/JSON import from the AWS console
  2. API operation to create a single or multiple endpoints
  3. Programmatically create and execute import jobs

We will focus on building a real-time inbound stream of customer data available within an Amazon RDS MySQL database. A similar approach can be implemented to ingest data from third-party systems.

For the outbound stream, activating customer data using Amazon Pinpoint can be achieved using the following two methods:

  1. Campaign: a campaign is a messaging initiative that engages a specific audience segment.
  2. Journey: a journey is a customized, multi-step engagement experience.

Customer data activation is not complete without specifying the target channel. A channel represents the platform through which you engage your audience segment with messages. For example, Amazon Pinpoint customers can optimize how they target notifications to prospective customers through LINE messages and email. They can deliver notifications with product information, such as sales and new products, to the appropriate audience.

Amazon Pinpoint supports the following channels:

  • Push notifications
  • Email
  • SMS
  • Voice
  • In-app messages

In addition to these channels, you can also extend the capabilities to meet your specific use case by creating custom channels. You can use custom channels to send messages to your customers through any service that has an API including third-party services. For example, you can use custom channels to send messages through third-party services such as WhatsApp or Facebook Messenger. We will focus on developing an Amazon Pinpoint connector using custom channel to target your customers on third-party services through API.
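
As a rough sketch of such a connector, the Lambda handler below turns the endpoints targeted by a campaign into payloads for a third-party API. It assumes the invocation event carries an `Endpoints` map keyed by endpoint ID, with `Address` and `Attributes` on each entry; check the Amazon Pinpoint custom channel documentation for the exact payload. The payload field names and the helper `build_outbound_payloads` are hypothetical, and the actual HTTP call is left out.

```python
import json


def build_outbound_payloads(event: dict) -> list:
    """Map each targeted endpoint in the event to a third-party message payload."""
    payloads = []
    for endpoint_id, endpoint in event.get("Endpoints", {}).items():
        payloads.append(
            {
                "endpoint_id": endpoint_id,
                "address": endpoint.get("Address"),
                "attributes": endpoint.get("Attributes", {}),
            }
        )
    return payloads


def lambda_handler(event, context):
    payloads = build_outbound_payloads(event)
    # A real connector would send each payload to the third-party API here.
    print(json.dumps(payloads))
    return {"prepared": len(payloads)}


# Sample event with one endpoint (assumed shape, sample values).
sample_event = {
    "Endpoints": {
        "endpoint-1": {
            "Address": "user1@example.com",
            "Attributes": {"language": ["english"]},
        }
    }
}
result = lambda_handler(sample_event, None)
```

Keeping the payload mapping in its own function makes it easy to test without invoking Lambda or the third-party service.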

Solution Architecture

The below diagram illustrates the proposed architecture to address the use case. Moving from left to right:

Fig 1: Architecture Diagram for the Solution

  1. Amazon RDS: This hosts the customer database, where one or more tables contain customer data.
  2. AWS Database Migration Service (AWS DMS): This acts as the glue between Amazon RDS MySQL and the downstream services by replicating any change that happens at the record level in the configured customer tables.
  3. Amazon Kinesis Data Streams: This is the destination endpoint for AWS DMS. It carries all the changed records to the next stage of the pipeline.
  4. AWS Lambda (inbound): The inbound Lambda function is triggered by Kinesis Data Streams, processes the mutated records, and ingests them into Amazon Pinpoint.
  5. Amazon Pinpoint: This acts as the centralized place to define customer segments and launch campaigns.
  6. AWS Lambda (outbound): This acts as the custom channel destination for the campaigns activated from Amazon Pinpoint.
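
To illustrate step 4, here is a minimal sketch of how an inbound function might turn Kinesis records written by AWS DMS into Amazon Pinpoint endpoint requests. It is not the code from the sample repository. It assumes each DMS message is JSON with `data` and `metadata` keys, that the table columns are userid, email, language, and favourites, and that endpoints use the `CUSTOM` channel type. The call that sends each request to Amazon Pinpoint is omitted.

```python
import base64
import json


def to_endpoint_request(row: dict) -> dict:
    """Build an endpoint request body from one replicated table row."""
    return {
        "ChannelType": "CUSTOM",  # assumed channel type for this sketch
        "Address": row["email"],
        "Attributes": {
            "language": [row["language"]],
            "favourites": [row["favourites"]],
        },
        "User": {"UserId": str(row["userid"])},
    }


def parse_kinesis_event(event: dict) -> list:
    """Decode Kinesis records and keep only data changes."""
    requests = []
    for record in event.get("Records", []):
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if payload.get("metadata", {}).get("record-type") != "data":
            continue  # skip control records such as table creation
        requests.append(to_endpoint_request(payload["data"]))
    return requests


def encode(message: dict) -> dict:
    """Wrap a message the way Kinesis delivers it to Lambda (base64 data)."""
    data = base64.b64encode(json.dumps(message).encode()).decode()
    return {"kinesis": {"data": data}}


sample_event = {
    "Records": [
        encode({"data": {"userid": 1, "email": "user1@example.com",
                         "language": "english", "favourites": "football"},
                "metadata": {"record-type": "data", "operation": "insert"}}),
        encode({"metadata": {"record-type": "control"}}),
    ]
}
endpoint_requests = parse_kinesis_event(sample_event)
print(endpoint_requests)
```

Attribute values are wrapped in lists because Amazon Pinpoint endpoint attributes are lists of strings.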

To illustrate how to set up this architecture, we’ll walk you through the following steps:

  1. Deploy an AWS CDK stack to provision the required AWS resources.
  2. Validate the deployment.
  3. Run a sample workflow that ingests customer data into the MySQL database and activates it through an Amazon Pinpoint campaign.
  4. Clean up your resources.

Prerequisites

Make sure that you complete the following steps as prerequisites:

The Solution

Launching your AWS CDK Stack

Step 1a: Open your device’s command line or Terminal.

Step 1b: Check out the Git repository to a local directory on your device:

git clone https://github.com/aws-samples/amazon-pinpoint-realtime-campaign-optimization-example.git

Step 2: Change directories to the new directory code location:

cd amazon-pinpoint-realtime-campaign-optimization-example

Step 3: Update your AWS account number and region:

  1. Edit config.py with the tool of your choice or from the command line
  2. Look for the section “Account Setup” and update your account number and region

    Fig 2: Configuring config.py for account-id and region

  3. Look for the section “VPC Parameters” and update your VPC and subnet info

    Fig 3: Configuring config.py for VPC and subnet information

Step 4: Verify if you are in the directory where app.py file is located:

ls -ltr app.py

Step 5: Create a virtual environment:

macOS/Linux:

python3 -m venv .env

Windows:

python -m venv .env

Step 6: Activate the virtual environment after the init process completes and the virtual environment is created:

macOS/Linux:

source .env/bin/activate

Windows:

.env\Scripts\activate.bat

Step 7: Install the required dependencies:

pip3 install -r requirements.txt

Step 8: Bootstrap the cdk app using the following command:

cdk bootstrap aws://<AWS_ACCOUNTID>/<AWS_REGION>

Replace the placeholders AWS_ACCOUNTID and AWS_REGION with your AWS account ID and the Region to deploy to.
This step provisions the initial resources, including an Amazon S3 bucket for storing files and IAM roles that grant permissions needed to perform deployments.

Fig 4: Bootstrapping CDK environment

Please note that if you have already bootstrapped the same account, you cannot bootstrap it again. In that case, skip this step or use a new AWS account.

Step 9: Make sure that your AWS profile is setup along with the region that you want to deploy as mentioned in the prerequisite. Synthesize the templates. AWS CDK apps use code to define the infrastructure, and when run they produce or “synthesize” a CloudFormation template for each stack defined in the application:

cdk synthesize

Step 10: Deploy the solution. By default, some actions that could potentially make security changes require approval. In this deployment, you’re creating an IAM role. The following command overrides the approval prompts, but if you would like to manually accept the prompts, then omit the --require-approval never flag:

cdk deploy "*" --require-approval never

While the AWS CDK deploys the CloudFormation stacks, you can follow the deployment progress in your terminal.

Fig 5: AWS CDK Deployment progress in terminal

Once the deployment is successful, you’ll see the successful status as follows:

Fig 6: AWS CDK Deployment completion success

Step 11: Log in to the AWS Console, go to CloudFormation, and see the output of the ApplicationStack:

Fig 7: AWS CloudFormation stack output

Note the values of the PinpointProjectId, PinpointProjectName, and RDSSecretName variables. We’ll use them in the next step to upload our artifacts.

Testing The Solution

In this section we will create a full data flow using the below steps:

  1. Ingest data in the customer_tb table within the Amazon RDS MySQL DB instance
  2. Validate that the task created by AWS Database Migration Service is replicating the changes to Amazon Kinesis Data Streams
  3. Validate that endpoints are created within Amazon Pinpoint
  4. Create Amazon Pinpoint Segment and Campaign and activate data to Webhook.site endpoint URL

Step 1: Connect to MySQL DB instance and create customer database

    1. Sign in to the AWS Management Console and open the AWS Cloud9 console at https://console.aws.amazon.com/cloud9 
    2. Click Create environment
      • Name: mysql-cloud9-01 (for example)
      • Click Next
      • Environment type: Create a new EC2 instance for environment (direct access)
      • Instance type: t2.micro
      • Timeout: 30 minutes
      • Platform: Amazon Linux 2
      • Network settings under VPC settings select the same VPC where the MySQL DB instance was created. (this is the same VPC and Subnet from step 3.3)
      • Click Next
      • Review and click Create environment
    3. Select the created AWS Cloud9 from the AWS Cloud9 console at https://console.aws.amazon.com/cloud9  and click Open in Cloud9. You will have access to AWS Cloud9 Linux shell.
    4. From the Linux shell, update the operating system:
      sudo yum update -y
    5. From the Linux shell, install the MySQL client:
      sudo yum install -y mysql
    6. To connect to the created MySQL RDS DB instance, use the below command in the AWS Cloud9 Linux shell:
      mysql -h <<host>> -P 3308 --user=<<username>> --password=<<password>>
      • To get values for dbInstanceIdentifier, username, and password
        • Navigate to the AWS Secrets Manager service
        • Open the secret with the name created by the CDK application
        • Select ‘Reveal secret value’ and copy the respective values and replace in your command
      • After you enter the password for the user, you should see output similar to the following:
        Welcome to the MariaDB monitor.  Commands end with ; or \g.
        Your MySQL connection id is 27
        Server version: 8.0.32 Source distribution
        Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
        Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
        MySQL [(none)]>
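The connection step above can also be scripted. The following is a minimal sketch, assuming the secret revealed in AWS Secrets Manager is a JSON document with `host`, `username`, and `password` keys (these key names are an assumption; check them against your own secret). The helper name `build_mysql_command` is hypothetical, and the port default simply mirrors the command shown above.

```python
import json
import shlex
from typing import List


def build_mysql_command(secret_string: str, port: int = 3308) -> List[str]:
    """Build the mysql client arguments from a secret's JSON text."""
    secret = json.loads(secret_string)
    return [
        "mysql",
        "-h", secret["host"],                 # assumed key name
        "-P", str(port),                      # port as used in the command above
        f"--user={secret['username']}",       # assumed key name
        f"--password={secret['password']}",   # assumed key name
    ]


if __name__ == "__main__":
    # Made-up values for illustration; paste the revealed secret JSON instead.
    example = json.dumps(
        {"host": "mydb.example.internal", "username": "admin", "password": "example-password"}
    )
    print(shlex.join(build_mysql_command(example)))
```

Printing the quoted command rather than running it keeps the sketch free of side effects; you can copy the output into the AWS Cloud9 shell.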

Step 2: Ingest data in the customer_tb table within the Amazon RDS MySQL DB instance

Once the connection to the MySQL DB instance is established, execute the following commands in the same AWS Cloud9 Linux shell.

  • Create database pinpoint-test-db:
    CREATE DATABASE `pinpoint-test-db`;
  • Create table customer_tb:
    USE `pinpoint-test-db`;
    CREATE TABLE `customer_tb` (`userid` int NOT NULL,
                                `email` varchar(150) DEFAULT NULL,
                                `language` varchar(45) DEFAULT NULL,
                                `favourites` varchar(250) DEFAULT NULL,
                                PRIMARY KEY (`userid`));
  • You can verify the schema using the below SQL command:
    DESCRIBE `pinpoint-test-db`.customer_tb;
    Fig 8: Verify schema for customer_tb table

    • Insert records in customer_tb table:
    Use `pinpoint-test-db`;
    insert into customer_tb values (1,'[email protected]','english','football');
    insert into customer_tb values (2,'[email protected]','english','basketball');
    insert into customer_tb values (3,'[email protected]','french','football');
    insert into customer_tb values (4,'[email protected]','french','football');
    insert into customer_tb values (5,'[email protected]','french','basketball');
    insert into customer_tb values (6,'[email protected]','french','football');
    insert into customer_tb values (7,'[email protected]','french',null);
    insert into customer_tb values (8,'[email protected]','english','football');
    insert into customer_tb values (9,'[email protected]','english','football');
    insert into customer_tb values (10,'[email protected]','english',null);
    • Verify records in customer_tb table:
    select * from `pinpoint-test-db`.`customer_tb`;
    Fig 9: Verify data for customer_tb table

    Step 3: Validate that the task created in AWS Database Migration Service (AWS DMS) is replicating the changes to Amazon Kinesis Data Streams

      1. Sign in to the AWS Management Console and open the AWS DMS console at https://console.aws.amazon.com/dms/v2 
      2. From the navigation panel, choose Database migration tasks.
      3. Click on the task created by the CDK code, named ‘dmsreplicationtask-*’
      4. Start the replication task

        Fig 10: Starting AWS DMS Replication Task

      5. Make sure that Status is Replication ongoing

        Fig 11: AWS DMS Replication statistics

      6. Navigate to Table Statistics and make sure that the number of Inserts is equal to 10 and Load state is Table completed*

        Fig 12: AWS DMS Replication statistics
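To make the replication step more concrete, here is a hedged sketch of how a Lambda function triggered by the stream might unpack the records that AWS DMS writes. It assumes the JSON message layout that DMS uses for Kinesis targets (a `data` object with the row and a `metadata` object with `record-type`, `operation`, and `table-name`). The function names `iter_customer_rows` and `encode_record` are hypothetical and are not taken from the CDK project.

```python
import base64
import json
from typing import Any, Dict, Iterator


def iter_customer_rows(event: Dict[str, Any], table: str = "customer_tb") -> Iterator[Dict[str, Any]]:
    """Yield row dictionaries for load/insert/update records of one table."""
    for record in event.get("Records", []):
        # Kinesis delivers each payload to Lambda as base64-encoded bytes.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        meta = payload.get("metadata", {})
        if meta.get("record-type") != "data":
            continue  # skip control records such as table creation
        if meta.get("table-name") != table:
            continue
        if meta.get("operation") not in ("load", "insert", "update"):
            continue  # deletes are ignored in this sketch
        yield payload["data"]


def encode_record(document: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a document the way a Kinesis-triggered Lambda event would (test helper)."""
    data = base64.b64encode(json.dumps(document).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


if __name__ == "__main__":
    sample = {
        "data": {"userid": 1, "language": "english", "favourites": "football"},
        "metadata": {"record-type": "data", "operation": "insert", "table-name": "customer_tb"},
    }
    print(list(iter_customer_rows({"Records": [encode_record(sample)]})))
```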

    Step 4: Validate that endpoints are created within Amazon Pinpoint

    1. Sign in to the AWS Management Console and open the Amazon Pinpoint console at https://console.aws.amazon.com/pinpoint/ 
    2. Click on Amazon Pinpoint Project Demo created by CDK stack “dev-pinpoint-project”
    3. From the left menu, click on Analytics and validate that the Active targetable endpoints are equal to 10 as shown below:
    Fig 13: Amazon Pinpoint endpoint summary
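For readers wondering how a database row becomes one of the endpoints counted above, the following sketch maps a `customer_tb` row to the request body shape accepted by the Amazon Pinpoint UpdateEndpoint API. The mapping itself (using the email as the address, the `CUSTOM` channel type, and the row columns as endpoint attributes) is an assumption for illustration and may differ from what the Lambda function in the CDK project does. The function name `row_to_endpoint_request` is hypothetical.

```python
from typing import Any, Dict, List


def row_to_endpoint_request(row: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one customer_tb row into an EndpointRequest-shaped dictionary."""
    attributes: Dict[str, List[str]] = {}
    # Endpoint attributes are lists of strings; NULL columns are simply left out.
    for column in ("language", "favourites"):
        if row.get(column):
            attributes[column] = [str(row[column])]
    return {
        "ChannelType": "CUSTOM",          # assumption: endpoints target the custom channel
        "Address": row["email"],          # assumption: the email column is the address
        "OptOut": "NONE",
        "Attributes": attributes,
        "User": {"UserId": str(row["userid"])},
    }


if __name__ == "__main__":
    row = {"userid": 7, "email": "user7@example.com", "language": "french", "favourites": None}
    request = row_to_endpoint_request(row)
    print(request)
    # With boto3, this dictionary could be passed as:
    # pinpoint.update_endpoint(ApplicationId=..., EndpointId=..., EndpointRequest=request)
```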

    Step 5: Create Amazon Pinpoint Segment and Campaign

    Step 5.1: Create Amazon Pinpoint Segment

    • Sign in to the AWS Management Console and open the Amazon Pinpoint console at https://console.aws.amazon.com/pinpoint/ 
    • Click on Amazon Pinpoint Project Demo created by CDK stack “dev-pinpoint-project”
    • From the left menu, click on Segments and click Create a segment
    • Create a segment using the below configurations:
      • Name: English Speakers
      • Under criteria:
        • Attribute: Language
        • Operator: Contains
        • Value: english
    Fig 14: Amazon Pinpoint segment summary

    • Click create segment
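The segment criteria above can be expressed as data and checked locally. The sketch below shows an approximate request body for the CreateSegment API using a `CONTAINS` attribute dimension, and applies the same rule to the ten sample rows inserted earlier. Whether the attribute is stored under `Attributes` or `UserAttributes`, and under which exact name, depends on how the endpoints were written, so treat those details as assumptions.

```python
from typing import Dict, List

# Language per userid, copied from the INSERT statements in Step 2.
SAMPLE_LANGUAGES: Dict[int, str] = {
    1: "english", 2: "english", 3: "french", 4: "french", 5: "french",
    6: "french", 7: "french", 8: "english", 9: "english", 10: "english",
}

# Approximate WriteSegmentRequest body; the attribute name is an assumption.
SEGMENT_REQUEST = {
    "Name": "English Speakers",
    "Dimensions": {
        "Attributes": {
            "language": {"AttributeType": "CONTAINS", "Values": ["english"]},
        }
    },
}


def contains_match(attribute_values: List[str], wanted: List[str]) -> bool:
    """Return True when any wanted value is a substring of any attribute value."""
    return any(w in value for value in attribute_values for w in wanted)


def matching_user_ids(languages: Dict[int, str]) -> List[int]:
    """Apply the segment's CONTAINS rule to the sample data."""
    wanted = SEGMENT_REQUEST["Dimensions"]["Attributes"]["language"]["Values"]
    return sorted(uid for uid, lang in languages.items() if contains_match([lang], wanted))


if __name__ == "__main__":
    print(matching_user_ids(SAMPLE_LANGUAGES))
```

With the sample data, five of the ten users are English speakers, so a campaign that targets this segment should process five endpoints.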

    Step 5.2: Create Amazon Pinpoint Campaign

    • From the left menu, click on Campaigns and click Create a campaign
    • Set the Campaign name to test campaign and select the Custom option for Channel as shown below:
    Fig 15: Amazon Pinpoint create campaign

    • Click Next
    • Select English Speakers from Segment as shown below and click Next:
    Fig 16: Amazon Pinpoint segment

    • Choose the Lambda function channel type and select the outbound Lambda function with the name pattern ApplicationStack-lambdaoutboundfunction* from the dropdown as shown below:
    Fig 17: Amazon Pinpoint message creation

    • Click Next
    • Choose the At a specific time option and select Immediately to send the campaign as shown below:
    Fig 18: Amazon Pinpoint campaign scheduling

    If you push more messages or records into Amazon RDS (as in Step 2), you will need to create a new campaign (as in Step 5.2) to process the new messages.

    • Click Next, review the configuration and click Launch campaign.
    • Navigate to dev-pinpoint-project and select the campaign created in previous step. You should see status as ‘Complete’
    Fig 19: Amazon Pinpoint campaign processing status

    • Navigate to the dev-pinpoint-project dashboard and select your campaign in the ‘Campaign metrics’ dashboard to see the statistics for the processing.
    Fig 20: Amazon Pinpoint campaign metrics
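To illustrate what happens when the campaign runs, here is a minimal sketch of a custom channel Lambda handler. It relies on the documented shape of the event that Amazon Pinpoint passes to a custom channel function (an `Endpoints` map keyed by endpoint ID, plus fields such as `CampaignId`). The webhook URL is a placeholder, and the helper names `post_json` and `handle_campaign_event` are hypothetical; the outbound function deployed by the CDK stack may be implemented differently.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

WEBHOOK_URL = "https://webhook.site/your-unique-id"  # placeholder, replace with your own URL


def post_json(url: str, body: Dict[str, Any]) -> int:
    """POST a JSON body and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handle_campaign_event(
    event: Dict[str, Any],
    send: Callable[[str, Dict[str, Any]], Any] = post_json,
    url: str = WEBHOOK_URL,
) -> int:
    """Forward every endpoint in the campaign event to the webhook; return the count."""
    sent = 0
    for endpoint_id, endpoint in event.get("Endpoints", {}).items():
        send(url, {
            "campaignId": event.get("CampaignId"),
            "endpointId": endpoint_id,
            "address": endpoint.get("Address"),
            "attributes": endpoint.get("Attributes", {}),
        })
        sent += 1
    return sent


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point invoked by the Amazon Pinpoint custom channel."""
    return {"sent": handle_campaign_event(event)}
```

Passing the `send` function as a parameter keeps the forwarding logic testable without network access.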

    Accomplishments

    This is a quick summary of what we accomplished:

    1. Created an Amazon RDS MySQL DB instance and defined the customer_tb table schema
    2. Created an Amazon Kinesis Data Stream
    3. Replicated database changes from the Amazon RDS MySQL DB to Amazon Kinesis Data Stream
    4. Created an AWS Lambda function triggered by Amazon Kinesis Data Stream to ingest database records into Amazon Pinpoint as user endpoints using the AWS SDK
    5. Created an Amazon Pinpoint Project, segment and campaign
    6. Created an AWS Lambda function as custom channel for Amazon Pinpoint campaign
    7. Tested end-to-end data flow from Amazon RDS MySQL DB instance to third party endpoint

    Next Steps

    You have now gained a good understanding of Amazon Pinpoint’s agnostic data flow, but there are still many areas left for exploration. This workshop hasn’t covered other communication channels such as email, SMS, push notifications, and outbound voice. You can enable the channels that are pertinent to your use case and send messages using campaigns or journeys.

    Clean Up

    Make sure that you clean up all of the AWS resources that you created in the AWS CDK stack deployment. You can delete these resources using the cdk destroy command as follows, or through the CloudFormation console.

    To destroy the resources using AWS CDK, follow these steps:

    • Follow Steps 1-5 from the ‘Launching your CDK Stack’ section.
    • Destroy the app by executing the following command:
    cdk destroy

    Summary

    In this post, you have gained a good understanding of Amazon Pinpoint’s flexible real-time data flow. By implementing the steps detailed in this blog post, you can achieve a seamless integration of your customer data from an Amazon RDS MySQL database to Amazon Pinpoint, where you can leverage segments and campaigns to activate data using custom channels to third-party services via API. The demonstrated use case focuses on Amazon RDS MySQL database as a data source. However, there are still many areas left for exploration. This post hasn’t covered integrating customer data from other types of data sources such as MongoDB, Microsoft SQL Server, Google Cloud, etc. Also, other communication channels such as email, SMS, push notifications, and outbound voice can be used in the activation layer. You can enable the channels that are pertinent to your use case and send messages using campaigns or journeys. This gives you a complete view of your customers across all touchpoints and can lead to more relevant marketing campaigns.

    About the Authors

    Bret Pontillo is a Senior Data Architect with AWS Professional Services Analytics Practice. He helps customers implement big data and analytics solutions. Outside of work, he enjoys spending time with family, traveling, and trying new food.
    Rushabh Lokhande is a Data & ML Engineer with AWS Professional Services Analytics Practice. He helps customers implement big data, machine learning, and analytics solutions. Outside of work, he enjoys spending time with family, reading, running, and golf.
    Ghandi Nader is a Senior Partner Solution Architect focusing on the Adtech and Martech industry. He helps customers and partners innovate and align with the market trends related to the industry. Outside of work, he enjoys spending time cycling and watching formula one.

How to implement multi-tenancy with Amazon Pinpoint

Post Syndicated from Tristan Nguyen original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-implement-multi-tenancy-with-amazon-pinpoint/

Navigating Multi-Tenancy in Amazon Pinpoint

Businesses are constantly evolving, often managing multiple product lines, customer segments, or even geographical locations. Furthermore, many business-to-business (B2B) companies that are Independent Software Vendors (ISVs) often need to manage their customers’ marketing automation environments. This complexity necessitates a robust customer engagement strategy that can adapt and scale efficiently. However, managing disparate systems for each tenant is not only cumbersome but also resource-intensive, leading to increased operational costs and potential data silos. A multi-tenancy setup in Amazon Pinpoint addresses these challenges head-on, allowing businesses to streamline their customer engagement efforts under a unified architecture.

The question is not just whether to adopt multi-tenancy, but how to implement it in a way that aligns with your unique business requirements. Amazon Pinpoint offers multiple approaches to achieve this. This blog explores three:

  • Single Pinpoint Project: Simple but demands careful permissions management.
  • Multiple Pinpoint Projects: Granular control but limited by soft project quotas.
  • Multiple Account & Multi Pinpoint Projects: Highly scalable but needs comprehensive monitoring.

We’ll delve into the pros, cons, and best use cases for each, as well as how to choose a multi-tenancy configuration based on your communication channel needs, guiding you to make an informed architectural decision.

In this blog, we’ll cut through the complexity, helping you align your Amazon Pinpoint architecture with your business goals. Let’s get started.

Single Account / Single Project (SA/SP)

Overview

In a Single Pinpoint Project setup, all customer engagement activities reside within one project and multi-tenancy within this context will leverage customer endpoint attributes. This streamlined approach allows for easy management, especially for those new to Amazon Pinpoint. A configuration example for this case is shown below:

Single Account / Single Project (SA/SP)

When you use one Pinpoint Project to manage information for multiple tenants, tenant information can be managed using custom user attributes of endpoints. Campaign information can also be managed for each tenant by tagging campaigns. The elements required to set up this configuration are shown below.

  • S3 buckets that hold customer data:
    • Prepare an S3 bucket to store customer information lists to be imported into Pinpoint. Amazon Pinpoint allows you to import CSV files in S3 as segments. In order to make settings for each tenant in Amazon Pinpoint, we will include tenant information as custom user attributes in the CSV file.
  • 1 Amazon Pinpoint Project:
    • Create 1 Amazon Pinpoint Project.
    • Settings for each channel to be distributed are also required.
    • Campaign information can be assigned to tenant information by using the tag function.
  • Amazon Kinesis:
  • Athena and S3 buckets to analyze event data:
    • Store Amazon Pinpoint event data in S3 and analyze it via Athena. Take advantage of this solution.
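As an illustration of carrying tenant information in the import file, the sketch below writes a CSV whose header follows the dotted attribute naming that Amazon Pinpoint uses for segment imports (`ChannelType`, `Address`, `User.UserId`, and custom user attributes under `User.UserAttributes.`). The attribute name `TenantId`, the input dictionary keys, and the function name `build_import_csv` are hypothetical choices for this example.

```python
import csv
import io
from typing import Dict, Iterable

# "TenantId" is a hypothetical custom user attribute used to separate tenants.
HEADER = ["ChannelType", "Address", "User.UserId", "User.UserAttributes.TenantId"]


def build_import_csv(customers: Iterable[Dict[str, str]]) -> str:
    """Render customers as CSV text ready to upload to the import S3 bucket."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for customer in customers:
        writer.writerow([
            customer["channel"],
            customer["address"],
            customer["user_id"],
            customer["tenant_id"],
        ])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_import_csv([
        {"channel": "EMAIL", "address": "a@example.com", "user_id": "u1", "tenant_id": "tenant-a"},
        {"channel": "EMAIL", "address": "b@example.com", "user_id": "u2", "tenant_id": "tenant-b"},
    ]))
```

Segments for a single tenant can then filter on the tenant attribute, which is the mechanism the SA/SP pattern relies on.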

One thing to keep in mind when adopting this configuration is that customer endpoint information exists in the same Pinpoint Project. It is possible to specify values that can be used to identify each tenant, such as custom attributes, and solve the problem with AWS Identity and Access Management (IAM) policies, but it is necessary to manage access rights and attributes on your own.

Also, to add an endpoint, you’ll need to specify its Channel and Address. Take note that one project cannot have the same channel and address for different endpoints. Given the above, this pattern is worth considering if the channel and address of endpoints do not overlap between tenants and you are able to build your own access permission control.
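Before committing to a single project, it can help to check whether tenants actually share any channel and address pairs. The following is a small sketch of such a check; the input shape (dictionaries with `tenant`, `channel`, and `address` keys) and the function name `find_address_collisions` are assumptions for illustration. Addresses are compared case-insensitively, which is a simplification that suits email addresses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_address_collisions(
    endpoints: Iterable[Dict[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Return channel/address pairs that are claimed by more than one tenant."""
    owners: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for endpoint in endpoints:
        key = (endpoint["channel"], endpoint["address"].strip().lower())
        owners[key].add(endpoint["tenant"])
    return {key: sorted(tenants) for key, tenants in owners.items() if len(tenants) > 1}


if __name__ == "__main__":
    sample = [
        {"tenant": "tenant-a", "channel": "EMAIL", "address": "shared@example.com"},
        {"tenant": "tenant-b", "channel": "EMAIL", "address": "Shared@example.com"},
        {"tenant": "tenant-a", "channel": "SMS", "address": "+15550100"},
    ]
    print(find_address_collisions(sample))
```

An empty result suggests the SA/SP pattern is feasible from an addressing point of view; any collisions point toward separate projects or accounts.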

Since fewer components are required compared to other patterns, the configuration is easier to start with. Some customers that want to build on top of the Pinpoint API and want to simplify configuration on the Pinpoint side as much as possible can also choose this option. However, this approach can become complex to manage as you onboard more tenants. The issue presents itself when you want to create detailed reporting per tenant in this configuration: you’ll need dedicated tags on each campaign and journey to operationalize granular reporting for your Amazon Pinpoint project.

Lastly, take note of service limits per Amazon Pinpoint project/AWS account to ensure your use case will be scalable should the need arise.

Single Account / Multiple Projects (SA/MP)

Overview

For this architecture, you still use a single AWS account to host your Amazon Pinpoint environment, but you create a separate project for each customer or tenant. A configuration example for this case is shown in the diagram below.

Single Account / Multiple Projects (SA/MP)

In this example, we will create multiple Amazon Pinpoint Projects. One major difference from the case of the Single Pinpoint Project is that it is possible to completely separate customer endpoint information. When importing customer data segments, it is possible to manage each tenant in a separate state simply by importing them from S3 into the target Pinpoint Project. This makes it easy to control permissions via IAM policies.

Also, with Amazon Pinpoint, email addresses, SMS numbers, message templates, and other sending resources provisioned in the account can be used in common across all projects, and event data for each project can be aggregated via Amazon Kinesis. By adopting such a configuration, you gain the benefits of separating endpoint information per project while still managing basic settings and operator workflows in one place.

The elements of an example starter solution architecture for this configuration are shown below.

  • S3 buckets that hold customer data:
    • Similar to SA/SP, prepare an S3 bucket to store a list of customer information to be imported into Pinpoint. CSV to be imported must be prepared for each project.
  • Amazon DynamoDB Table:
    • Prepare a DynamoDB (or other key-value database) table to manage Pinpoint project information. Tenant information can also be stored as metadata in the DynamoDB table.
  • AWS Lambda:
    • Create a Pinpoint Project using Lambda. Amazon Pinpoint allows you to create and configure projects using the Amazon Pinpoint API, the AWS SDK, or the AWS Command Line Interface (AWS CLI). Thus, it is possible to automate the creation of the Pinpoint project and associated campaigns/journeys. Tenant information is also registered in DynamoDB at the time of creation.
  • Multiple Amazon Pinpoint Projects:
    • This is a Project created by Lambda above. There will now be a Pinpoint Project for each tenant, and endpoint information will be completely separated. It is also easy to control access rights for each project by using the IAM function.
    • Message templates: templates can be created and shared across projects.
    • By using Amazon Pinpoint’s event stream settings, campaign, journey, app, and channel events can be streamed to Amazon Kinesis. Multiple Amazon Pinpoint projects can all stream to one Amazon Kinesis stream. When set up correctly, event data will be tagged with the relevant tenant information so that an analytics solution can decompose the stream later on.
  • Athena and S3 buckets to analyze event data:
    • Amazon Pinpoint event data is stored in Amazon S3 and analyzed via Amazon Athena. The analytics solution, Amazon Athena in this case, is responsible for filtering event data by tenant. Refer to this solution for more details.
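The project automation described in the list above could look roughly like the sketch below. It assumes a boto3 Pinpoint client (whose `create_app` call takes a `CreateApplicationRequest` and returns an `ApplicationResponse`) and a boto3 DynamoDB `Table` resource for the registry. The project naming scheme, the tag key, the registry attribute names, and the function name `provision_tenant` are hypothetical.

```python
from typing import Any


def provision_tenant(tenant_id: str, pinpoint: Any, registry_table: Any) -> str:
    """Create a Pinpoint project for a tenant and record it in the registry table.

    `pinpoint` is expected to behave like boto3.client("pinpoint") and
    `registry_table` like boto3.resource("dynamodb").Table("<your table>").
    """
    response = pinpoint.create_app(
        CreateApplicationRequest={
            "Name": f"tenant-{tenant_id}",      # hypothetical naming scheme
            "tags": {"TenantId": tenant_id},    # hypothetical tag key
        }
    )
    project_id = response["ApplicationResponse"]["Id"]
    # Store the tenant-to-project mapping so other components can look it up.
    registry_table.put_item(
        Item={"tenant_id": tenant_id, "pinpoint_project_id": project_id}
    )
    return project_id
```

Because the clients are passed in, the same function can run inside a Lambda handler with real boto3 objects or in a unit test with simple stand-ins.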

Note that Amazon Pinpoint has a soft limit of 100 projects per AWS account, which can be increased by raising a Support ticket. Other quotas also apply at the project and account level and should be taken into account.

From the above, it is necessary to note that there are restrictions on quotas per account when using the SA/MP and more initial configurations would be required to automate the process of project creation for individual tenants. However, when compared to SA/SP architecture,

Multiple Accounts & Multi Pinpoint Projects (MA/MP)

Overview

Before diving into the MA/MP approach, it’s crucial to understand the role of AWS Organizations in this configuration. AWS Organizations allows you to consolidate multiple AWS accounts into an organization to achieve centralized governance and billing. This feature is particularly useful in a MA/MP setup, as it enables streamlined management of multiple AWS accounts and Amazon Pinpoint projects from a single central management AWS account. For more information on AWS Organizations, you can visit the official AWS Organizations documentation.

In an MA/MP setup, we utilize separate AWS Accounts for each customer or tenant. A configuration example for this case is shown below.

In this example, we have created a Management account and prepared multiple AWS accounts under it. The Management account manages the AWS account ID and the Pinpoint project ID, and has a configuration created with Lambda. Customer data and event stream data are managed through the Management account, and information on each project is aggregated. A major benefit of this configuration is the ability to segregate the actions of individual tenants, preventing antipatterns such as noisy neighbours. It also frees you from quota restrictions that a single AWS account cannot accommodate. Additionally, Amazon Pinpoint has excellent CloudFormation coverage, so it is also possible to deploy highly reproducible architectures automatically.

The elements required to set up this configuration are shown below.

  • AWS Organizations:
    • Set up Organizations to manage multiple accounts. See Best Practices for setting up multiple accounts.
  • Management account:
    • Create an account to manage multiple account information. Here we will set the following elements. Use IAM roles and Service control policies (SCPs) when manipulating resources across accounts. This allows cross-account access. The required elements are the same as the SA/MP described above.
      • S3 buckets that hold customer data: With AWS, you can utilize S3 data across accounts. Set up cross-account settings and securely link customer data to each account.
      • DynamoDB table: Holds your AWS account ID, Pinpoint Project ID, and management information associated with them.
      • AWS Lambda: Create a Pinpoint project using Lambda.
      • Athena and S3 buckets to analyze event data: Event information from multiple accounts and Pinpoint projects is aggregated and analyzed.
  • AWS accounts and Pinpoint projects per tenant:
    • Depending on how tenants are separated, prepare an AWS account and Pinpoint Project. You can also consider automating account creation by using AWS CloudFormation.
    • There are cases where it is necessary to set the distribution channel email address, SMS number, etc. for each account. See the next section for details.
    • Amazon Kinesis is prepared for each account, but everything is stored in the same S3 bucket in the Management account for easier bird’s-eye view reporting.
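Once events from several accounts land in one place, they need to be attributed back to tenants. The sketch below counts events per tenant and event type, assuming each streamed event carries `awsAccountId`, `application.app_id`, and `event_type` fields as in Amazon Pinpoint event stream records. The registry mapping (for example, loaded from the DynamoDB table in the Management account) and the function name `count_events_by_tenant` are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple


def count_events_by_tenant(
    events: Iterable[Dict[str, Any]],
    registry: Dict[Tuple[str, str], str],
) -> Counter:
    """Count events per (tenant, event_type) using an (account ID, project ID) registry."""
    counts: Counter = Counter()
    for event in events:
        key = (event.get("awsAccountId"), event.get("application", {}).get("app_id"))
        tenant = registry.get(key, "unknown")  # unregistered sources are grouped together
        counts[(tenant, event.get("event_type"))] += 1
    return counts


if __name__ == "__main__":
    registry = {("111111111111", "app-a"): "tenant-a", ("222222222222", "app-b"): "tenant-b"}
    events = [
        {"awsAccountId": "111111111111", "application": {"app_id": "app-a"}, "event_type": "_email.send"},
        {"awsAccountId": "222222222222", "application": {"app_id": "app-b"}, "event_type": "_email.send"},
        {"awsAccountId": "111111111111", "application": {"app_id": "app-a"}, "event_type": "_email.hardbounce"},
    ]
    for key, value in sorted(count_events_by_tenant(events, registry).items()):
        print(key, value)
```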

One thing to keep in mind is that since accounts are separated, it becomes necessary to manage each one separately. For example, a newly created account will be placed in the sandbox state, and a request for production access via a support ticket is required for each account. Also, since reputation is tracked per account, it is necessary to monitor reputation for each account.

Navigating Channels in Amazon Pinpoint: Aligning Service Delivery with Architecture

Beyond choosing a Pinpoint architecture for multi-tenancy, it’s pivotal to decide which channels best deliver your services and how that decision is affected by your choice of multi-tenancy architecture. Below is a non-exhaustive list of capabilities in Amazon Pinpoint that will help with your multi-channel, multi-tenancy configurations, as well as potential blockers that you need to be aware of for each channel.

Email

Email is one of the most versatile channels, with integration with Amazon SES’s configuration sets and email suppression list capability, easily fitting into any of the three multi-tenancy models.

  • Configuration Sets: Using configuration sets, you can segregate your email sending activities using different IP pools, as well as different event destinations.
    • You can use configuration sets in both Amazon Pinpoint and Amazon SES. Configuration set rules that you configure in Amazon SES are also applied to email messages that you send using Amazon Pinpoint.
    • SA/SP and SA/MP: Email templates and sending IP addresses need to be tagged using configuration sets for each tenant in the Pinpoint project.
    • MA/MP: Email templates and sending IP addresses can use the account default, or follow granular tagging using configuration sets.
  • Email Suppression List: Suppression list is managed automatically at the account level. Alternatively, you can specify whether a specific configuration can override the account-level suppression list.
    • SA/SP and SA/MP:
      • All tenants will also follow the same account suppression list:
        • If any tenant sends to an email address that results in a hard bounce or complaint, all other tenants will also be unable to send emails to the same address.
        • You will have to manually override the account-level suppression list for each email address.
    • MA/MP:
      • If one of your tenants sends an email to an address that results in a hard bounce or complaint, only the AWS account that the tenant belongs to will respect the suppression list, i.e. tenants in other AWS accounts can still send email to that address.
  • Noisy Neighbour Threat: Broadly, this occurs when one tenant’s performance is degraded because of the activities of another tenant. Applied to email, the anti-pattern needs to be addressed because you don’t want one bad actor tenant to affect the entire environment’s email sending activity.
    • SA/SP and SA/MP:
      • Because email bounce and complaint rates are tracked at the account level, it is possible for your entire account’s email sending to be blocked due to high bounce/complaint incidences from one bad tenant.
      • To mitigate this, it’s best practice to set up dedicated configuration sets and alarms to alert when any individual tenant is exhibiting a high bounce or complaint rate.
    • MA/MP:
      • Offers the most segregation and ensures email identities/domains are only usable by one tenant/account.
  • Email Sending Quota:
    • Email daily sending quota and email sending rate live at the account level.
    • SA/SP and SA/MP:
      • You would need to anticipate the total daily sending quota and sending rate for all tenants in your AWS account and raise the service limits accordingly. Therefore, more planning will be involved to estimate the correct service limit threshold.
    • MA/MP:
      • You can raise service limits per individual tenant’s needs since each tenant will be on a separate AWS account.
      • It is best practice to have a business process in place for individual tenants to give advance notice of their email sending quota needs so that the quota can be raised accordingly for their AWS account.
  • For further discussion into sending emails in a multi-tenancy environment, refer to this AWS blog on Multi-Tenancy in SES.
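The per-tenant alerting mentioned under the noisy neighbour item can be prototyped with a few lines of code before wiring up real alarms. The sketch below computes bounce and complaint rates per tenant and flags those above a threshold. The event shape (dictionaries with `tenant` and `type` keys), the function name `flag_noisy_tenants`, and the default thresholds are illustrative assumptions, not values taken from this post; choose thresholds that match your own risk tolerance.

```python
from collections import defaultdict
from typing import Dict, Iterable


def flag_noisy_tenants(
    events: Iterable[Dict[str, str]],
    max_bounce_rate: float = 0.05,      # illustrative default
    max_complaint_rate: float = 0.001,  # illustrative default
) -> Dict[str, Dict[str, float]]:
    """Return bounce and complaint rates for tenants that exceed either threshold."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"send": 0, "bounce": 0, "complaint": 0}
    )
    for event in events:
        counters = totals[event["tenant"]]
        if event["type"] in counters:
            counters[event["type"]] += 1

    flagged: Dict[str, Dict[str, float]] = {}
    for tenant, counters in totals.items():
        if counters["send"] == 0:
            continue  # no sends yet, so rates are undefined
        bounce_rate = counters["bounce"] / counters["send"]
        complaint_rate = counters["complaint"] / counters["send"]
        if bounce_rate > max_bounce_rate or complaint_rate > max_complaint_rate:
            flagged[tenant] = {"bounce_rate": bounce_rate, "complaint_rate": complaint_rate}
    return flagged


if __name__ == "__main__":
    sample = (
        [{"tenant": "tenant-a", "type": "send"}] * 100
        + [{"tenant": "tenant-a", "type": "bounce"}] * 6
        + [{"tenant": "tenant-b", "type": "send"}] * 100
        + [{"tenant": "tenant-b", "type": "bounce"}] * 1
    )
    print(flag_noisy_tenants(sample))
```

In production, the same calculation would typically be driven by per-tenant configuration set metrics rather than an in-memory list.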

SMS

  • Origination Identity procurement: When opting for an MA/MP setup, remember that origination identities (OIDs), such as phone numbers, are bound to AWS accounts.
    • Since OIDs do not carry across accounts, you will need to repeat the procurement process for every new AWS account.
  • Number Pooling: This feature groups phone numbers or sender IDs. It’s particularly useful in a Single Project model to segment communications per tenant.
  • Configuration Sets: With the release of the V2 SMS and Voice API, you can now use configuration sets to manage your SMS opt-out lists, OIDs and event streaming destinations for a multi-tenant environment.
  • Noisy Neighbour Threat:
    • SA/SP and SA/MP:
      • Take note that if you do not specify an OID in your API call, Amazon Pinpoint will attempt to use the most suitable (in terms of throughput and deliverability) OID to send your SMS. This
      • Similar to email, you can leverage number pooling and configuration sets to segregate SMS sending activity within a single account. This helps protect your SMS OID reputation, which matters because requesting new OIDs can be costly and time-consuming.
    • MA/MP:
      • Offers the most segregation and ensures numbers are only usable by one tenant/account.
  • SMS Opt Outs: Similar to the email channel’s suppression list, opt-outs are managed per account and configuration sets. Therefore, in a MA/MP setup, a customer that has opted out from communication in one account can still receive communications from other accounts.
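One way to avoid tenants sharing an origination identity by accident is to always set it explicitly per tenant. The sketch below builds the keyword arguments for the SendTextMessage operation of the V2 SMS and Voice API (parameter names `DestinationPhoneNumber`, `OriginationIdentity`, `ConfigurationSetName`, and `MessageBody`) from a per-tenant lookup. The lookup structure and the function name `build_sms_request` are hypothetical, and the pool ID and configuration set names in the example are made up.

```python
from typing import Dict


def build_sms_request(
    tenant_id: str,
    destination: str,
    body: str,
    tenant_config: Dict[str, Dict[str, str]],
) -> Dict[str, str]:
    """Build SendTextMessage arguments that pin the tenant's own OID and configuration set."""
    try:
        config = tenant_config[tenant_id]
    except KeyError:
        # Refuse to send rather than let the service pick an OID for an unknown tenant.
        raise LookupError(f"no SMS configuration registered for tenant {tenant_id!r}") from None
    return {
        "DestinationPhoneNumber": destination,
        "OriginationIdentity": config["origination_identity"],  # e.g. a phone number or pool ID
        "ConfigurationSetName": config["configuration_set"],
        "MessageBody": body,
    }


if __name__ == "__main__":
    config = {
        "tenant-a": {"origination_identity": "pool-example-a", "configuration_set": "tenant-a-sms"},
    }
    print(build_sms_request("tenant-a", "+15555550100", "Hello from tenant A", config))
    # With boto3, the result could be passed as:
    # boto3.client("pinpoint-sms-voice-v2").send_text_message(**request)
```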

Push Notifications

Amazon Pinpoint integrates with various push services like FCM, APNS, Baidu Cloud Push, and ADM.

  • Project-level Authentication: Authentication information is set at the Pinpoint Project level, requiring separate management.
    • Therefore, you will not be able to use the SA/SP architecture for multiple tenants using different applications.
  • For more information, refer to the Mobile Push Guide

In-app Messages

  • Pinpoint Project Specific: Similar to push notifications, each Pinpoint Project can only house one in-app message application.
    • If you have multiple applications requiring in-app messages, you will not be able to employ the SA/SP architecture.
  • For more information, refer to the In-app Channel Documentation.

Custom Channels

  • Custom channels in Amazon Pinpoint allow you to send messages through any service that has an API, including third-party services. You can interact with APIs by using a webhook, or by calling an AWS Lambda function. If you are using custom channels extensively from Amazon Pinpoint, you’ll need to be aware of service limits in AWS Lambda, especially if you’re considering SA/SP or SA/MP architectures.

Conclusion

In this blog, we’ve untangled the intricacies of implementing multi-tenancy in Amazon Pinpoint. Our deep dive covered three architectural patterns:

  • Single Account/Single Project (SA/SP): A beginner-friendly approach offering simple management but requiring meticulous permissions handling to segregate sending activity between different tenants.
  • Single Account/Multiple Projects (SA/MP): Offers granular control over customer data with a slight increase in management complexity. However, this approach faces soft quotas and potential ‘Noisy Neighbor’ issues.
  • Multiple Accounts/Multiple Projects (MA/MP): Provides the most flexibility and isolation, albeit with increased management complexity.

Each approach comes with its own set of trade-offs related to ease of management/reporting, scalability, and control over customer data. Our discussion didn’t stop at architecture; we also examined how your multi-tenancy decisions will affect your channel configurations in Amazon Pinpoint. From email and SMS to push notifications, the architectural choices you make will have a direct impact on how efficiently you can manage these distribution channels. Armed with this information, you’re now better equipped to make informed decisions that align with your business objectives.

Call to Action

Your next step? Implement and architect your Amazon Pinpoint environment. Use the best practices and architectural guidelines outlined in this blog post as your north star. Going forward, the architectural blueprint you choose should be tailored to your specific needs—be it user count, company size, or distribution channels. Take into account not just the initial setup but also the long-term management aspects, including the respective service limits and quotas.

Relevant Links

About the Authors

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Tatsuya Nakamura

Nakamura Tatsuya is a Solutions Architect in charge of enterprise companies at AWS. He is mainly in charge of the trading company industry and the distribution/retail industry, also supporting the implementation of Amazon Pinpoint for Japanese customers. His career so far includes ERP implementation support and multiple new web service launches.

Amazon SES: Email Authentication and Getting Value out of Your DMARC Policy

Post Syndicated from Bruno Giorgini original https://aws.amazon.com/blogs/messaging-and-targeting/email-authenctication-dmarc-policy/

Amazon SES: Email Authentication and Getting Value out of Your DMARC Policy

Introduction

For enterprises of all sizes, email is a critical piece of infrastructure that supports large volumes of communication. To enhance the security and trustworthiness of email communication, many organizations turn to email sending providers (ESPs) like Amazon Simple Email Service (Amazon SES). These ESPs allow users to send authenticated emails from their domains, employing industry-standard protocols such as the Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM). Messages authenticated with SPF or DKIM will successfully pass your domain’s Domain-based Message Authentication, Reporting, and Conformance (DMARC) policy. This blog post will focus on the DMARC policy enforcement mechanism. The blog will explore some of the reasons why email may fail DMARC policy evaluation and propose solutions to fix any failures that you identify. For an introduction to DMARC and how to carefully choose your email sending domain identity, you can refer to Choosing the Right Domain for Optimal Deliverability with Amazon SES.

The relationship between DMARC compliance and email deliverability rates is crucial for organizations aiming to maintain a positive sender reputation and ensure successful email delivery. There are many advantages when organizations have this set up correctly, including:

  • Improved Email Deliverability
  • Reduction in Email Spoofing and Phishing
  • Positive Sender Reputation
  • Reduced Risk of Email Marked as Spam
  • Better Email Engagement Metrics
  • Enhanced Brand Reputation

With this foundation, let’s explore the intricacies of DMARC and how it can benefit your organization’s email communication.

What is DMARC?

DMARC is a mechanism for domain owners to advertise SPF and DKIM protection and to tell receivers how to act if those authentication methods fail. The domain’s DMARC policy protects your domain from third parties attempting to spoof the domain in the “From” header of emails. Malicious email messages that aim to send phishing attempts using your domain will be subject to DMARC policy evaluation, which may result in their quarantine or rejection by the email receiving organization. This stringent policy ensures that emails received by email recipients are genuinely from the claimed sending domain, thereby minimizing the risk of people falling victim to email-based scams. Domain owners publish DMARC policies as a TXT record in the domain’s _dmarc.<domain> DNS record. For example, if the domain used in the “From” header is example.com, then the domain’s DMARC policy would be located in a DNS TXT record named _dmarc.example.com. The DMARC policy can have one of three policy modes:

  • A typical DMARC deployment of an existing domain will start with publishing "p=none". A none policy means that the domain owner is in a monitoring phase; the domain owner is monitoring for messages that aren’t authenticated with SPF and DKIM and seeks to ensure all email is properly authenticated.
  • When the domain owner is comfortable that all legitimate use cases are properly authenticated with SPF and/or DKIM, they may change the DMARC policy to "p=quarantine". A quarantine policy means that messages which fail to produce a domain-aligned authenticated identifier via SPF or DKIM will be quarantined by the mail receiving organization. The mail receiving organization may filter these messages into Junk folders, or take another action that they feel best protects their recipients.
  • Finally, domain owners who are confident that all of the legitimate messages using their domain are authenticated with SPF or DKIM, may change the DMARC policy to "p=reject". A reject policy means that messages which fail to produce a domain-aligned authenticated identifier via SPF or DKIM will be rejected by the mail receiving organization.

The following are examples of a TXT record that contains a DMARC policy, depending on the desired policy (the ‘p’ tag):

    Name                 Type   Value
1   _dmarc.example.com   TXT    "v=DMARC1;p=reject;rua=mailto:[email protected]"
2   _dmarc.example.com   TXT    "v=DMARC1;p=quarantine;rua=mailto:[email protected]"
3   _dmarc.example.com   TXT    "v=DMARC1;p=none;rua=mailto:[email protected]"
Table 1 – Example DMARC policy
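To make the record format concrete, here is a minimal Python sketch that splits a DMARC TXT value into its tags. The function name `parse_dmarc_record` and the reporting address `dmarc-reports@example.com` are hypothetical and chosen only for illustration; a production parser would follow the full grammar in the DMARC specification.

```python
def parse_dmarc_record(txt: str) -> dict:
    """Split a DMARC TXT record value into a tag -> value mapping."""
    tags = {}
    for part in txt.strip().strip('"').split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[name.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record: missing v=DMARC1")
    return tags


# Hypothetical record, modeled on the rows in Table 1.
record = parse_dmarc_record(
    "v=DMARC1;p=quarantine;rua=mailto:dmarc-reports@example.com"
)
print(record["p"], record["rua"])
```

Reading the `p` tag this way is enough to tell which of the three policy modes a domain currently publishes.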

This policy tells email providers to apply the DMARC policy to messages that fail to produce a DKIM or SPF authenticated identifier that is aligned to the domain in the “From” header. Alignment means that one or both of the following occurs:

  • The messages pass the SPF policy for the MAIL FROM domain and the MAIL FROM domain is the same as the domain in the “From” header, or a subdomain of it. Reference Using a custom MAIL FROM domain to learn more about how to send SPF aligned messages with SES.
  • The messages have a DKIM signature that validates against a public key published in DNS at a location within the domain of the “From” header. Reference Authenticating Email with DKIM in Amazon SES to learn more about how to send DKIM aligned messages with SES.
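The alignment rule described above can be sketched in a few lines of Python. This is a simplification that follows the wording of this post (same domain, or a subdomain of the “From” domain); real receivers implementing relaxed alignment compare organizational domains using the Public Suffix List. The function names `is_aligned` and `dmarc_passes` are hypothetical.

```python
def is_aligned(authenticated_domain: str, from_domain: str) -> bool:
    """True if the authenticated domain equals the From domain or is a subdomain of it."""
    auth = authenticated_domain.lower().rstrip(".")
    frm = from_domain.lower().rstrip(".")
    return auth == frm or auth.endswith("." + frm)


def dmarc_passes(from_domain, spf_pass_domain=None, dkim_pass_domains=()):
    """DMARC passes when at least one *passing* identifier is aligned.

    spf_pass_domain: MAIL FROM domain that passed SPF, or None if SPF failed.
    dkim_pass_domains: signing domains of DKIM signatures that verified.
    """
    if spf_pass_domain and is_aligned(spf_pass_domain, from_domain):
        return True
    return any(is_aligned(d, from_domain) for d in dkim_pass_domains)


# SPF and DKIM both pass, but only for amazonses.com: no aligned identifier.
print(dmarc_passes("example.com", "amazonses.com", ["amazonses.com"]))
# A custom MAIL FROM subdomain gives an aligned SPF identifier.
print(dmarc_passes("example.com", "bounce.example.com"))
```

The first call illustrates why a message can pass SPF and DKIM and still fail DMARC: passing is not enough, the passing domain also has to line up with the “From” header.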

DMARC reporting

The rua tag in the domain’s DMARC policy indicates the location to which mail receiving organizations should send aggregate reports about messages that pass or fail SPF and DKIM alignment. Domain owners analyze these reports to discover messages which are using the domain in the “From” header but are not properly authenticated with SPF or DKIM. The domain owner will attempt to ensure that all legitimate messages are authenticated through analysis of the DMARC aggregate reports over time. Mail receiving organizations which support sending DMARC reports typically send these aggregated reports once per day, although these practices differ from provider to provider.

What does a typical DMARC deployment look like?

A DMARC deployment is the process of:

  1. Ensuring that all emails using the domain in the “From” header are authenticated with DKIM and SPF domain-aligned identifiers. Focus on DKIM as the primary means of authentication.
  2. Publishing a DMARC policy (none, quarantine, or reject) for the domain that reflects how the domain owner would like mail receiving organizations to handle unauthenticated email claiming to be from their domain.

New domains and subdomains

Deploying a DMARC policy is easy for organizations that have created a new domain or subdomain for the purpose of a new email sending use case on SES; for example email marketing, transactional emails, or one-time pass codes (OTP). These domains can start with the "p=reject" DMARC enforcement policy because the policy will not affect existing email sending programs. This strict enforcement ensures that there is no unauthenticated use of the domain and its subdomains.

Existing domains

For existing domains, a DMARC deployment is an iterative process because the domain may have a history of email sending by one or multiple email sending programs. It is important to gain a complete understanding of how the domain and its subdomains are being used for email sending before publishing a restrictive DMARC policy (p=quarantine or p=reject) because doing so would affect any unauthenticated email sending programs using the domain in the “From” header of messages. To get started with the DMARC implementation, these are a few actions to take:

  • Publish a p=none DMARC policy (sometimes referred to as monitoring mode), and set the rua tag to the location in which you would like to receive aggregate reports.
  • Analyze the aggregate reports. Mail receiving organizations will send reports which contain information to determine if the domain, and its subdomains, are being used for sending email, and how the messages are (or are not) being authenticated with a DKIM or SPF domain-aligned identifier. An easy to use analysis tool is the Dmarcian XML to Human Converter.
  • Avoid prematurely publishing a “p=quarantine” or “p=reject” policy. Doing so may result in blocked or reduced delivery of legitimate messages of existing email sending programs.

The image below illustrates how DMARC will be applied to an email received by the email receiving server and actions taken based on the enforcement policy:

Figure 1 – DMARC Flow

How do SPF and DKIM cause DMARC policies to pass

When you start sending emails using Amazon SES, messages that you send through Amazon SES automatically use a subdomain of amazonses.com as the default MAIL FROM domain. SPF evaluators will see that these messages pass the SPF policy evaluation because the default MAIL FROM domain has an SPF policy which includes the IP addresses of the SES infrastructure that sent the message. SPF authentication will result in an “SPF=PASS” and the authenticated identifier is the domain of the MAIL FROM address. The published SPF record applies to every message that is sent using SES regardless of whether you are using a shared or dedicated IP address. The amazonses.com SPF record lists all shared and dedicated IP addresses, so it is inclusive of all potential IP addresses that may be involved with sending email as the MAIL FROM domain. You can use ‘dig’ to look up the IP addresses that SES will use to send email:

dig txt amazonses.com | grep "v=spf1"
amazonses.com. 850 IN TXT "v=spf1 ip4:199.255.192.0/22 ip4:199.127.232.0/22 ip4:54.240.0.0/18 ip4:69.169.224.0/20 ip4:23.249.208.0/20 ip4:23.251.224.0/19 ip4:76.223.176.0/20 ip4:54.240.64.0/19 ip4:54.240.96.0/19 ip4:52.82.172.0/22 ip4:76.223.128.0/19 -all"
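As an illustration of what an SPF evaluator does with such a record, the following Python sketch checks whether a sending IP falls inside any `ip4:` range. It is deliberately partial: it ignores `include:`, `a`, `mx`, qualifiers, and the final `all` mechanism, and the function name `ip_in_spf_ip4` is hypothetical. The sample record is a shortened copy of the output above.

```python
import ipaddress


def ip_in_spf_ip4(spf_record: str, ip: str) -> bool:
    """Return True if ip matches one of the ip4: mechanisms in the record."""
    address = ipaddress.ip_address(ip)
    for term in spf_record.split():
        if term.startswith("ip4:"):
            # A bare address without a prefix length is treated as a /32.
            network = ipaddress.ip_network(term[len("ip4:"):], strict=False)
            if address in network:
                return True
    return False


# Shortened version of the amazonses.com record shown above.
SPF = "v=spf1 ip4:199.255.192.0/22 ip4:54.240.0.0/18 ip4:69.169.224.0/20 -all"

print(ip_in_spf_ip4(SPF, "54.240.1.1"))  # inside 54.240.0.0/18
print(ip_in_spf_ip4(SPF, "8.8.8.8"))     # not listed
```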

Custom MAIL FROM domains

It is best practice for customers to configure a custom MAIL FROM domain, and not use the default amazonses.com MAIL FROM domain. The custom MAIL FROM domain will always be a subdomain of the customer’s verified domain identity. Once you configure the MAIL FROM domain, messages sent using SES will continue to result in an “SPF=PASS” as it does with the default MAIL FROM domain. Additionally, DMARC authentication will result in “DMARC=PASS” because the MAIL FROM domain and the domain in the “From” header are in alignment. It’s important to understand that customers must use a custom MAIL FROM domain if they want “SPF=PASS” to result in a “DMARC=PASS”.

For example, an Amazon SES-verified example.com domain will have the custom MAIL FROM domain “bounce.example.com”. The configured SPF record will be:

dig txt bounce.example.com | grep "v=spf1"
"v=spf1 include:amazonses.com ~all"

Note: The chosen MAIL FROM domain can be any subdomain of your choice. If you have the same domain identity configured in multiple regions, then you should create region-specific custom MAIL FROM domains for each region (for example, bounce-us-east-1.example.com and bounce-eu-west-2.example.com) so that asynchronously bounced messages are delivered directly to the region from which the messages were sent.

DKIM results in DMARC pass

For customers that establish Amazon SES domain verification using DKIM signatures, DKIM authentication will result in a “DKIM=PASS”, and DMARC authentication will result in “DMARC=PASS” because the signing domain in the DKIM signature is aligned to the domain in the “From” header (the SES domain identity).

DKIM and SPF together

Email messages are fully authenticated when the messages pass both DKIM and SPF, and both DKIM and SPF authenticated identifiers are domain-aligned. If only DKIM is domain-aligned, then the messages will still pass the DMARC policy, even if the SPF “pass” is unaligned. Mail receivers will consider the full context of SPF and DKIM when determining how they will handle the disposition of the messages you send, so it is best to fully authenticate your messages whenever possible. Amazon SES takes the heavy lifting of the email authentication process off our customers: establishing SPF, DKIM and DMARC authentication has been reduced to a few clicks, which allows SES customers to get started easily and scale fast.

Why is DMARC failing?

There are scenarios when you may notice that messages fail DMARC, whether your messages are fully authenticated, or partially authenticated. The following are things that you should look out for:

Email Content Modification

Sometimes email content is modified during the delivery to the recipients’ mail servers. This modification could be as a result of a security device or anti-spam agent along the delivery path (for example: the message Subject may be modified with an “[EXTERNAL]” warning to recipients). The modified message invalidates the DKIM signature which causes a DKIM failure. Remember, the purpose of DKIM is to ensure that the content of an email has not been tampered with during the delivery process. If this happens, the DKIM authentication will fail with an authentication error similar to “DKIM-signature body hash not verified”.

Solutions:

  • If you control the full path that the email message will traverse from sender to recipient, ensure that no intermediary mail servers modify the email content in transit.
  • Ensure that you configure a custom MAIL FROM domain so that the messages have a domain-aligned SPF identifier.
  • Keep the DMARC policy in monitoring mode (p=none) until these issues are identified/solved.
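To see why even a small change in transit breaks DKIM, consider this Python sketch of the body hash that a DKIM signature carries. It assumes SHA-256 and a simplified take on the “simple” body canonicalization (trailing empty lines removed, body ending in one CRLF); header signing and the “relaxed” canonicalization are not covered. The function name and the sample message are hypothetical.

```python
import base64
import hashlib


def simple_body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the body after a simplified 'simple' canonicalization."""
    # Drop trailing empty lines so the body ends with exactly one CRLF.
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


original = b"Hello,\r\nYour order has shipped.\r\n"
# A gateway appends a footer while the message is in transit.
modified = original + b"-- \r\nScanned by gateway\r\n"

print(simple_body_hash(original) == simple_body_hash(modified))
```

The comparison prints False: the receiver recomputes the hash over the modified body, it no longer matches the value the sender signed, and DKIM fails even though the message is legitimate.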

Email Forwarding

There are multiple scenarios in which a message may be forwarded, and they may result in SPF, DKIM, or both failing to produce a domain-aligned authenticated identifier.

For SPF, it means that the forwarding mail server is not listed in the MAIL FROM domain’s SPF policy. It is best practice for a forwarding mail server to avoid SPF failures and assume responsibility for mail handling of the messages it forwards by rewriting the MAIL FROM address to be in a domain controlled by the forwarding server. Forwarding servers that do not rewrite the MAIL FROM address pose a risk of impersonation attacks and phishing. Do not add the IP addresses of forwarding servers to your MAIL FROM domain’s SPF policy unless you are in complete control of all sources of mail being forwarded through this infrastructure.

For DKIM, it means that the messages are being modified in some way that causes DKIM signature validation failure (see the Email Content Modification section above). A responsible forwarding server will rewrite the MAIL FROM domain so that the messages pass SPF with a non-aligned authenticated identifier. These servers will attempt to forward the message without alteration in order to preserve DKIM signatures, but that is sometimes challenging to do in practice. When the DKIM signature does not survive, the messages carry no domain-aligned authenticated identifier and will fail the DMARC policy.

Solution:

  • Email forwarding is an expected type of failure that you will see in the DMARC aggregate reports. The domain owner must weigh the risk of causing forwarded messages to be rejected against the risk of not publishing a reject DMARC policy. Reference 8.6. Interoperability Considerations. Forwarding servers that wish to forward messages that they know will result in a DMARC failure will commonly rewrite the “From” header address of the messages they forward so that the messages pass a DMARC policy for a domain that the forwarding server is responsible for. The way to identify forwarding servers that rewrite the “From” header in this situation is to publish “p=quarantine pct=0 t=y” in your domain’s DMARC policy before publishing “p=reject”.

Multiple email sending providers are sending using the same domain

There are situations where an organization will have multiple business units sending email using the same domain, and these business units may be using an email sending provider other than SES. If neither SPF nor DKIM is configured with domain-alignment for these email sending providers, you will see DMARC failures in the DMARC aggregate report.

Solution:

  • Analyze the DMARC aggregate reports to identify other email sending providers, track down the business units responsible for each email sending program, and follow the instructions offered by the email sending provider about how to configure SPF and DKIM to produce a domain-aligned authenticated identifier.

What does a DMARC aggregate report look like?

The following XML example shows the general format of a DMARC aggregate report that you will receive from participating email service providers.

<?xml version="1.0" encoding="UTF-8" ?> 
<feedback> 
  <report_metadata> 
    <org_name>email-service-provider-domain.com</org_name> 
    <email>[email protected]</email> 
    <extra_contact_info>https://email-service-provider-domain.com/</extra_contact_info> 
    <report_id>620501112281841510</report_id> 
    <date_range> 
      <begin>1685404800</begin> 
      <end>1685491199</end> 
    </date_range> 
  </report_metadata> 
  <policy_published> 
    <domain>example.com</domain>
    <adkim>r</adkim> 
    <aspf>r</aspf> 
    <p>none</p> 
    <sp>none</sp> 
    <pct>100</pct> 
  </policy_published> 
  <record> 
    <row> 
      <source_ip>192.0.2.10</source_ip>
      <count>1</count> 
      <policy_evaluated> 
        <disposition>none</disposition> 
        <dkim>pass</dkim> 
        <spf>fail</spf> 
      </policy_evaluated> 
    </row> 
    <identifiers> 
      <header_from>example.com</header_from>
    </identifiers> 
    <auth_results> 
      <dkim> 
        <domain>example.com</domain> 
        <result>pass</result> 
        <selector>gm5h7da67oqhnr3ccji35fdskt</selector> 
      </dkim> 
      <dkim> 
        <domain>amazonses.com</domain> 
        <result>pass</result> 
        <selector>224i4yxa5dv7c2xz3womw6peua</selector> 
      </dkim> 
      <spf> 
        <domain>amazonses.com</domain> 
        <result>pass</result> 
      </spf> 
    </auth_results> 
  </record> 
</feedback> 
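Reports like the one above are easy to summarize programmatically. The following Python sketch uses the standard library’s `xml.etree.ElementTree` to pull out the fields most useful during a monitoring phase. The function name `summarize_report` and the second record (source IP 198.51.100.7) are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_report(xml_text: str) -> list:
    """Return one dict per <record> with the source IP, count and DMARC results."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter("record"):
        rows.append({
            "source_ip": record.findtext("row/source_ip", "").strip(),
            "count": int(record.findtext("row/count", "0")),
            "dkim": record.findtext("row/policy_evaluated/dkim", "").strip(),
            "spf": record.findtext("row/policy_evaluated/spf", "").strip(),
            "header_from": record.findtext("identifiers/header_from", "").strip(),
        })
    return rows


SAMPLE = """
<feedback>
  <record>
    <row><source_ip>192.0.2.10</source_ip><count>1</count>
      <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
    </row>
    <identifiers><header_from>example.com</header_from></identifiers>
  </record>
  <record>
    <row><source_ip>198.51.100.7</source_ip><count>4</count>
      <policy_evaluated><disposition>none</disposition><dkim>fail</dkim><spf>fail</spf></policy_evaluated>
    </row>
    <identifiers><header_from>example.com</header_from></identifiers>
  </record>
</feedback>
"""

rows = summarize_report(SAMPLE)
# Sources with neither an aligned DKIM nor an aligned SPF result need follow-up.
failing = [r for r in rows if r["dkim"] != "pass" and r["spf"] != "pass"]
print(failing)
```

In the `policy_evaluated` element, `dkim` and `spf` reflect the aligned results, so sources where both are `fail` are the ones to track down before moving beyond p=none.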

 

How to address DMARC deployment for domains confirmed to be unused for email (dangling or otherwise)

Deploying DMARC for unused or dangling domains is a proactive step to prevent abuse or unauthorized use of your domain. Once you have confirmed that all subdomains being used for sending email have the desired DMARC policies, you can publish a ‘p=reject’ tag on the organizational domain, which will prevent unauthorized usage of unused subdomains without the need to publish DMARC policies for every conceivable subdomain. For more advanced subdomain policy scenarios, read the “tree walk” definitions in https://datatracker.ietf.org/doc/draft-ietf-dmarc-dmarcbis/

Conclusion:

In conclusion, DMARC is not only a technology but also a commitment to email security, integrity, and trust. By embracing DMARC best practices, organizations can protect their users, maintain a positive brand reputation, and ensure seamless email deliverability. Every message from SES passes SPF and DKIM for “amazonses.com”, but the authenticated identifiers are not always in alignment with the domain in the “From” header which carries the DMARC policy. If email authentication is not fully configured, your messages are susceptible to delivery issues like spam filtering, or being rejected or blocked by the recipient ESP. As a best practice, you can configure both DKIM and SPF to attain optimum deliverability while sending email with SES.

 

About the Authors

Bruno Giorgini is a Senior Solutions Architect specializing in Pinpoint and SES. With over two decades of experience in the IT industry, Bruno has been dedicated to assisting customers of all sizes in achieving their objectives. When he is not crafting innovative solutions for clients, Bruno enjoys spending quality time with his wife and son, exploring the scenic hiking trails around the SF Bay Area.
Jesse Thompson is an Email Deliverability Manager with the Amazon Simple Email Service team. His background is in enterprise IT development and operations, with a focus on email abuse mitigation and encouragement of authenticity practices with open standard protocols. Jesse’s favorite activity outside of technology is recreational curling.
Sesan Komaiya is a Solutions Architect at Amazon Web Services. He works with a variety of customers, helping them with cloud adoption, cost optimization and emerging technologies. Sesan has over 15 years of experience in enterprise IT and has been at AWS for 5 years. In his free time, Sesan enjoys watching sports such as soccer, tennis and motorsport. He has 2 kids who also keep him busy at home.
Mudassar Bashir is a Solutions Architect at Amazon Web Services. He has over ten years of experience in enterprise software engineering. His interests include web applications, containerization, and serverless technologies. He works with different customers, helping them with cloud adoption strategies.
Priya Singh is a Cloud Support Engineer at AWS and a subject matter expert in Amazon Simple Email Service. She has 6 years of diverse experience in supporting enterprise customers across different industries. Along with Amazon SES, she is a CloudFront enthusiast. She loves helping customers solve issues related to CloudFront and SES in their environments.

 

Handling Bounces and Complaints

Post Syndicated from Tyler Holmes original https://aws.amazon.com/blogs/messaging-and-targeting/handling-bounces-and-complaints/

As you may have seen in Jeff Barr’s blog post or in an announcement, Amazon Simple Email Service (Amazon SES) now provides bounce and complaint notifications via Amazon Simple Notification Service (Amazon SNS). You can refer to the Amazon SES Developer Guide or Jeff’s post to learn how to set up this feature. In this post, we will show you how you might manage your email list using the information you get in the Amazon SNS notifications.

Background

Amazon SES assigns a unique message ID to each email that you successfully submit to send. When Amazon SES receives a bounce or complaint message from an ISP, we forward the feedback message to you. The format of bounce and complaint messages varies between ISPs, but Amazon SES interprets these messages and, if you choose to set up Amazon SNS topics for them, categorizes them into JSON objects.

Scenario

Let’s assume you use Amazon SES to send monthly product announcements to a list of email addresses. You store the list in a database and send one email per recipient through Amazon SES. You review bounces and complaints once each day, manually interpret the bounce messages in the incoming email, and update the list. You would like to automate this process using Amazon SNS notifications with a scheduled task.

Solution

To implement this solution, we will use separate Amazon SNS topics for bounces and complaints to isolate the notification channels from each other and manage them separately. Also, since the bounce and complaint handler will not run 24/7, we need these notifications to persist until the application processes them. Amazon SNS integrates with Amazon Simple Queue Service (Amazon SQS), which is a durable messaging technology that allows us to persist these notifications. We will configure each Amazon SNS topic to publish to separate SQS queues. When our application runs, it will process queued notifications and update the email list. We have provided sample C# code below.

Configuration

Set up the following AWS components to handle bounce notifications:

  1. Create an Amazon SQS queue named ses-bounces-queue.
  2. Create an Amazon SNS topic named ses-bounces-topic.
  3. Configure the Amazon SNS topic to publish to the SQS queue.
  4. Configure Amazon SES to publish bounce notifications using ses-bounces-topic to ses-bounces-queue.

Set up the following AWS components to handle complaint notifications:

  1. Create an Amazon SQS queue named ses-complaints-queue.
  2. Create an Amazon SNS topic named ses-complaints-topic.
  3. Configure the Amazon SNS topic to publish to the SQS queue.
  4. Configure Amazon SES to publish complaint notifications using ses-complaints-topic to ses-complaints-queue.

Ensure that IAM policies are in place so that Amazon SNS has access to publish to the appropriate SQS queues.
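As a rough illustration of that last point, the access policy attached to each SQS queue usually allows the SNS service to send messages, restricted to the one topic that feeds the queue. The Python sketch below only builds the policy document; it makes no AWS calls. The account ID, region, and the helper name `sns_to_sqs_policy` are placeholders, and you should check the resulting policy against your own security requirements.

```python
import json


def sns_to_sqs_policy(queue_arn: str, topic_arn: str) -> str:
    """Build a queue access policy letting one SNS topic send to one SQS queue."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSnsTopicToSendMessage",
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only messages published through this topic are accepted.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder ARNs using the queue and topic names from the steps above.
bounce_policy = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:ses-bounces-queue",
    "arn:aws:sns:us-east-1:123456789012:ses-bounces-topic",
)
print(bounce_policy)
```

The same helper can be called a second time with the complaint queue and topic ARNs, which keeps the two notification channels isolated from each other.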

Bounce Processing

Amazon SES categorizes your bounces into types, the two main ones being permanent and transient. A permanent bounce indicates that you should never send to that recipient again. A transient bounce indicates that the recipient’s ISP is not accepting messages for that particular recipient at that time and you can retry delivery in the future. The amount of time you should wait before resending to the address that generated the transient bounce depends on the transient bounce type. Certain transient bounces require manual intervention before the message can be delivered (e.g., message too large or content error). If the bounce type is undetermined, you should manually review the bounce and act accordingly.

You will need to define some classes to simplify bounce notification parsing from JSON into .NET objects. We will use the open-source JSON.NET library.

using System;
using System.Collections.Generic;

/// <summary>Represents the bounce or complaint notification stored in Amazon SQS.</summary>
class AmazonSqsNotification
{
    public string Type { get; set; }
    public string Message { get; set; }
}

/// <summary>Represents an Amazon SES bounce notification.</summary>
class AmazonSesBounceNotification
{
    public string NotificationType { get; set; }
    public AmazonSesBounce Bounce { get; set; }
}
/// <summary>Represents meta data for the bounce notification from Amazon SES.</summary>
class AmazonSesBounce
{
    public string BounceType { get; set; }
    public string BounceSubType { get; set; }
    public DateTime Timestamp { get; set; }
    public List<AmazonSesBouncedRecipient> BouncedRecipients { get; set; }
}
/// <summary>Represents the email address of recipients that bounced
/// when sending from Amazon SES.</summary>
class AmazonSesBouncedRecipient
{
    public string EmailAddress { get; set; }
}

Sample code to handle bounces:

/// <summary>Process bounces received from Amazon SES via Amazon SQS.</summary>
/// <param name="response">The response from the Amazon SQS bounces queue 
/// to a ReceiveMessage request. This object contains the Amazon SES  
/// bounce notification.</param> 
private static void ProcessQueuedBounce(ReceiveMessageResponse response)
{
    int messages = response.ReceiveMessageResult.Message.Count;
 
    if (messages > 0)
    {
        foreach (var m in response.ReceiveMessageResult.Message)
        {
            // First, convert the Amazon SNS message into a JSON object.
            var notification = Newtonsoft.Json.JsonConvert.DeserializeObject<AmazonSqsNotification>(m.Body);
 
            // Now access the Amazon SES bounce notification.
            var bounce = Newtonsoft.Json.JsonConvert.DeserializeObject<AmazonSesBounceNotification>(notification.Message);
 
            switch (bounce.Bounce.BounceType)
            {
                case "Transient":
                    // Per our sample organizational policy, we will remove all recipients 
                    // that generate an AttachmentRejected bounce from our mailing list.
                    // Other bounces will be reviewed manually.
                    switch (bounce.Bounce.BounceSubType)
                    {
                        case "AttachmentRejected":
                            foreach (var recipient in bounce.Bounce.BouncedRecipients)
                            {
                                RemoveFromMailingList(recipient.EmailAddress);
                            }
                            break;
                        default:
                            ManuallyReviewBounce(bounce);
                            break;
                    }
                    break;
                default:
                    // Remove all recipients that generated a permanent bounce 
                    // or an unknown bounce.
                    foreach (var recipient in bounce.Bounce.BouncedRecipients)
                    {
                        RemoveFromMailingList(recipient.EmailAddress);
                    }
                    break;
            }
        }
    }
}
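For readers who want to see the nested JSON that the classes above map onto, here is a compact Python sketch of the same sample policy. It assumes the SQS message body is the SNS envelope, whose `Message` field holds the SES notification as a JSON string, and it uses the SES field names `bounce`, `bounceType`, `bounceSubType`, `bouncedRecipients` and `emailAddress`. The function name `recipients_to_remove` and the sample address are hypothetical.

```python
import json


def recipients_to_remove(sqs_body: str) -> list:
    """Return addresses to drop from the list; an empty list means manual review."""
    envelope = json.loads(sqs_body)                  # SNS envelope stored in SQS
    notification = json.loads(envelope["Message"])   # SES notification (JSON string)
    bounce = notification["bounce"]
    emails = [r["emailAddress"] for r in bounce["bouncedRecipients"]]

    if bounce["bounceType"] == "Transient":
        # Sample policy: only AttachmentRejected transient bounces are removed.
        if bounce.get("bounceSubType") == "AttachmentRejected":
            return emails
        return []
    # Permanent or unknown bounce types: remove every bounced recipient.
    return emails


def make_body(bounce_type: str, sub_type: str) -> str:
    """Build a hypothetical SQS message body for the examples below."""
    ses = {
        "notificationType": "Bounce",
        "bounce": {
            "bounceType": bounce_type,
            "bounceSubType": sub_type,
            "bouncedRecipients": [{"emailAddress": "recipient@example.com"}],
        },
    }
    return json.dumps({"Type": "Notification", "Message": json.dumps(ses)})


print(recipients_to_remove(make_body("Permanent", "General")))
print(recipients_to_remove(make_body("Transient", "MailboxFull")))
```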

Complaint Processing

A complaint indicates the recipient does not want the email that you sent them. When we receive a complaint, we want to remove the recipient addresses from our list. Again, define some objects to simplify parsing complaint notifications from JSON to .NET objects.

/// <summary>Represents an Amazon SES complaint notification.</summary>
class AmazonSesComplaintNotification
{
    public string NotificationType { get; set; }
    public AmazonSesComplaint Complaint { get; set; }
}
/// <summary>Represents the email address of individual recipients that complained 
/// to Amazon SES.</summary>
class AmazonSesComplainedRecipient
{
    public string EmailAddress { get; set; }
}
/// <summary>Represents meta data for the complaint notification from Amazon SES.</summary>
class AmazonSesComplaint
{
    public List<AmazonSesComplainedRecipient> ComplainedRecipients { get; set; }
    public DateTime Timestamp { get; set; }
    public string MessageId { get; set; }
}

Sample code to handle complaints:

/// <summary>Process complaints received from Amazon SES via Amazon SQS.</summary>
/// <param name="response">The response from the Amazon SQS complaint queue 
/// to a ReceiveMessage request. This object contains the Amazon SES 
/// complaint notification.</param>
private static void ProcessQueuedComplaint(ReceiveMessageResponse response)
{
    int messages = response.ReceiveMessageResult.Message.Count;
 
    if (messages > 0)
    {
        foreach (var message in response.ReceiveMessageResult.Message)
        {
            // First, convert the Amazon SNS message into a JSON object.
            var notification = Newtonsoft.Json.JsonConvert.DeserializeObject<AmazonSqsNotification>(message.Body);
 
            // Now access the Amazon SES complaint notification.
            var complaint = Newtonsoft.Json.JsonConvert.DeserializeObject<AmazonSesComplaintNotification>(notification.Message);
 
            foreach (var recipient in complaint.Complaint.ComplainedRecipients)
            {
                // Remove the email address that complained from our mailing list.
                RemoveFromMailingList(recipient.EmailAddress);
            }
        }
    }
}

Final Thoughts

We hope that you now have the basic information on how to use bounce and complaint notifications. For more information, please review our API reference and Developer Guide; they describe all actions, error codes and restrictions that apply to Amazon SES.

If you have comments or feedback about this feature, please post them on the Amazon SES forums. We actively monitor the forum and frequently engage with customers. Happy sending with Amazon SES!

How to secure your email account and improve email sender reputation

Post Syndicated from bajavani original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-secure-your-email-account-and-improve-email-sender-reputation/


Introduction

Amazon Simple Email Service (Amazon SES) is a cost-effective, flexible, and scalable email service that enables customers to send email from within any application. You can send email using the SES SMTP interface or via HTTP requests to the SES API. All requests to send email must be authenticated using either SMTP or IAM credentials. When these credentials end up in the hands of a malicious actor, customers need to act fast to secure their SES account.

Compromised credentials with permission to send email via SES allow the malicious actor to use SES to send spam and/or phishing emails, which can lead to high bounce and/or complaint rates for the SES account. High bounce and/or complaint rates can result in sending being paused for the SES account.

How to identify if your SES email sending account is compromised

Start by checking the reputation metrics for the SES account from the Reputation metrics menu in the SES Console.
A sudden increase or spike in the bounce or complaint metrics should be investigated further. You can start by checking the feedback forwarding destination, which is where SES sends bounces and complaints. Feedback on bounces and complaints contains the From and To email addresses as well as the subject. Use these attributes to determine if unintended emails are being sent; for example, if the bounce and/or complaint recipients are not known to you, that is an indication of compromise. To find out what your feedback forwarding destination is, please see Feedback forwarding mechanism.

If SNS notifications are already enabled, check the subscribed endpoint for the bounce and/or complaint notifications to review the notifications for unintended email sending. SNS notifications provide additional information, such as the IAM identity being used to send the emails as well as the source IP address the emails are being sent from.

If the review of the bounces or complaints leads to the conclusion that the email sending is unintended, immediately follow the steps below to secure your account.

Steps to secure your account:

You can follow the below steps in order to secure your SES account:

  1. To prevent any more unintended emails from being sent, we recommend that you immediately pause sending for the SES account until the root cause has been identified and steps have been taken to secure the account. You can use the command below to pause email sending for your account:

    aws ses update-account-sending-enabled --no-enabled --region sending_region

    Note: Replace sending_region with the region you use to send email.

  2. Rotate the credentials for the IAM identity being used to send the unintended emails. If the IAM identity was originally created from the SES Console as SMTP credentials, it is recommended to delete the IAM identity and create new SMTP credentials from the SES Console.
  3. Limit the scope of the SMTP/IAM identity so that it can send email only from the specific IP addresses your email sending originates from.

See controlling access to Amazon SES.

Below is an example of an IAM policy that allows email sending from the IP addresses 1.2.3.4 and 5.6.7.8 only.

————————-

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RestrictIP",
      "Effect": "Allow",
      "Action": "ses:SendRawEmail",
      "Resource": "*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [
            "1.2.3.4/32",
            "5.6.7.8/32"
          ]
        }
      }
    }
  ]
}

———————————

When you send an email from an IP address other than those listed in the policy, the email sending request fails with the following error:

———-

554 Access denied: User `arn:aws:iam::123456789012:user/iam-user-name' is not authorized to perform `ses:SendRawEmail' on resource `arn:aws:ses:eu-west-1:123456789012:identity/example.com'

———-

4.  Once these steps have been taken, the sending for the account can be enabled again, using the command below:

aws ses update-account-sending-enabled --enabled --region sending_region
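If you maintain the list of sending IP addresses in code, the IP restriction from step 3 can be generated rather than written by hand. The sketch below builds the same policy document shown earlier from a list of addresses. The function names `toCidr` and `buildSesIpRestrictedPolicy` are illustrative, and the sketch assumes IPv4 addresses only: a bare address is treated as a single host and given a /32 suffix.

```typescript
interface PolicyStatement {
  Sid: string;
  Effect: "Allow" | "Deny";
  Action: string;
  Resource: string;
  Condition: { IpAddress: { "aws:SourceIp": string[] } };
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Assumption: IPv4 only. A bare address becomes a single-host /32 CIDR block;
// anything that already contains a prefix length is passed through unchanged.
function toCidr(address: string): string {
  return address.indexOf("/") === -1 ? `${address}/32` : address;
}

// Builds a policy that allows ses:SendRawEmail only from the given addresses.
function buildSesIpRestrictedPolicy(addresses: string[]): PolicyDocument {
  if (addresses.length === 0) {
    throw new Error("At least one source IP address is required");
  }
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "RestrictIP",
        Effect: "Allow",
        Action: "ses:SendRawEmail",
        Resource: "*",
        Condition: { IpAddress: { "aws:SourceIp": addresses.map(toCidr) } },
      },
    ],
  };
}

// Serializes to the same JSON structure as the example policy in step 3.
const restrictedPolicyJson = JSON.stringify(
  buildSesIpRestrictedPolicy(["1.2.3.4", "5.6.7.8"]),
  null,
  2,
);
```

The sketch only produces the policy document; attaching it to the IAM identity is still done through IAM as described in step 3.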

Conclusion

By following the steps above, you can secure a compromised SES email sending account. Rotating credentials and restricting them to known IP addresses also helps prevent the same issue from happening in the future.

Deploy Amazon QuickSight dashboard for Amazon Pinpoint engagement events.

Post Syndicated from Pavlos Ioannou Katidis original https://aws.amazon.com/blogs/messaging-and-targeting/deploy-amazon-quicksight-dashboard-for-amazon-pinpoint-engagement-events/

Abstract

Business intelligence (BI) dashboards provide a graphical representation of metrics and key performance indicators (KPIs) to monitor the health of your business. By leveraging BI dashboards to analyze the performance of your customer communications, you gain valuable insights into how they are engaging with your messages and can make data-driven decisions to improve your marketing and communication strategies.

In this blog post we introduce a solution that automates the deployment of an Amazon QuickSight dashboard that enables marketers to analyze their long-term Amazon Pinpoint customer engagement data. These dashboards can be customized further depending on the use case. This solution alleviates the need to create data pipelines for storage and analysis of Amazon Pinpoint’s engagement data, while offering a greater variety of widgets and views across email, SMS, campaigns, journeys and transactional messages compared to Amazon Pinpoint’s native dashboards.

Amazon Pinpoint is a flexible, scalable marketing communications service that connects you with customers over email, SMS, push notifications, or voice. The service offers ready-to-use dashboards to view key performance indicators (KPIs) for the various messaging channels, and it provides 90 days of events for analysis. However, the raw events used to populate Amazon Pinpoint’s dashboards can be streamed using Amazon Kinesis Data Firehose to a destination of your choice. This blog will walk you through leveraging this feature to create a data lake to store and analyze data beyond the initial 90 days.

Amazon QuickSight is a cloud-scale business intelligence (BI) service that you can use to deliver easy-to-understand insights in an interactive visual environment.

The solution leverages the AWS Cloud Development Kit (CDK) to deploy the needed infrastructure and dashboards.

Use Case(s)

The Amazon QuickSight dashboards deployed through this solution are designed to serve several use cases. Here are just a few examples:

  • View email and SMS costs per Campaign and Journey.
  • Deep dive into engagement insights and performance (e.g., SMS events, email events, campaign events, journey events).
  • Schedule reports to various business stakeholders.
  • Track individual email & SMS statuses to specific endpoints.
  • Analyze open and click rates based on the message send time.

These are only some of the use cases for these dashboards. Because all the data points are available in Amazon QuickSight, you can create your own views and widgets based on your specific requirements.

Solution Overview

This solution builds upon the Digital User Engagement (DUE) Event Database AWS Solution. It creates a long-term Amazon Pinpoint event data lake and builds a QuickSight dashboard to visualize and analyze this data. It leverages several other AWS services to tie it all together: AWS CloudTrail to record any new campaign, journey and segment updates, AWS Lambda to process the AWS CloudTrail data, Amazon DynamoDB to store the campaign, journey and segment metadata, and Amazon Athena to build views using SQL for Amazon QuickSight. This solution can be segmented into three logical portions: 1) Pinpoint campaign/journey/segment lookup tables. 2) Amazon Athena views. 3) Amazon QuickSight resources.

The AWS Cloud Development Kit (CDK) is used to deploy this solution to your account. AWS CDK is an open-source software development framework for defining cloud infrastructure as code with modern programming languages and deploying it through AWS CloudFormation.

Pinpoint campaign/journey/segment lookup tables

architecture-diagram

  1. An AWS CloudFormation Lambda-backed custom resource adds the current Pinpoint campaign, journey and segment metadata to Amazon DynamoDB lookup tables. An AWS CloudFormation custom resource is managed by a Lambda function that runs only upon the deployment, update and deletion of the AWS CloudFormation stack.
  2. AWS CloudTrail logs record API actions to an S3 bucket every 5 minutes.
  3. When an AWS CloudTrail log is written to the S3 bucket an AWS Lambda function is invoked and checks for Amazon Pinpoint campaigns/journeys/segments management events such as create, update and delete.
  4. For every Amazon Pinpoint action the AWS Lambda function finds, it queries Amazon Pinpoint to get the respective resource details.
  5. The AWS Lambda function will create or update records in the Amazon DynamoDB table to reflect the changes.
  6. This solution also deploys an Amazon Athena DynamoDB connector. Amazon Athena uses this to query the Amazon DynamoDB lookup tables to enrich the data in the Amazon Pinpoint event data lake.
  7. The Amazon Athena to Amazon DynamoDB connector requires an Amazon S3 spill bucket for any data that exceeds the AWS Lambda function limits.
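To make steps 3 to 5 more concrete, the sketch below shows one way a function could pick the relevant management events out of the records in an AWS CloudTrail log file. It is an illustration under stated assumptions, not the code deployed by this solution: it assumes the records carry the event source `pinpoint.amazonaws.com` and that the event names follow the Pinpoint API operation names (for example `CreateCampaign` or `UpdateSegment`). The names `extractPinpointChanges` and `PinpointChange` are hypothetical.

```typescript
// Subset of the CloudTrail record fields used in this sketch.
interface CloudTrailRecord {
  eventSource: string;
  eventName: string;
  eventTime: string;
  errorCode?: string;
}

type ResourceType = "campaign" | "journey" | "segment";
type ChangeType = "create" | "update" | "delete";

// Hypothetical shape describing one change to apply to the lookup tables.
interface PinpointChange {
  resource: ResourceType;
  change: ChangeType;
  eventTime: string;
}

// Assumption: Amazon Pinpoint API calls are logged under this event source.
const PINPOINT_EVENT_SOURCE = "pinpoint.amazonaws.com";

function extractPinpointChanges(records: CloudTrailRecord[]): PinpointChange[] {
  const pattern = /^(Create|Update|Delete)(Campaign|Journey|Segment)$/;
  const changes: PinpointChange[] = [];
  for (const record of records) {
    if (record.eventSource !== PINPOINT_EVENT_SOURCE) continue;
    // Skip API calls that failed; they did not change any resource.
    if (record.errorCode !== undefined) continue;
    const match = pattern.exec(record.eventName);
    if (match === null) continue;
    changes.push({
      change: String(match[1]).toLowerCase() as ChangeType,
      resource: String(match[2]).toLowerCase() as ResourceType,
      eventTime: record.eventTime,
    });
  }
  return changes;
}

// Example: only the successful Pinpoint campaign and segment events are kept.
const sampleChanges = extractPinpointChanges([
  { eventSource: "pinpoint.amazonaws.com", eventName: "CreateCampaign", eventTime: "2023-08-01T10:00:00Z" },
  { eventSource: "s3.amazonaws.com", eventName: "PutObject", eventTime: "2023-08-01T10:01:00Z" },
  { eventSource: "pinpoint.amazonaws.com", eventName: "GetCampaign", eventTime: "2023-08-01T10:02:00Z" },
  { eventSource: "pinpoint.amazonaws.com", eventName: "DeleteSegment", eventTime: "2023-08-01T10:03:00Z" },
]);
```

In the deployed solution, each change found this way would then trigger a lookup of the resource details in Amazon Pinpoint and a write to the Amazon DynamoDB table, as described in steps 4 and 5.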

Amazon Athena views

Amazon Athena views are crucial for querying and organizing the data. These views allow QuickSight to interact with the Pinpoint event data lake through standard SQL queries and views. Here’s how they’re set up:

The application creates several named queries (called saved queries in the Amazon Athena console). Each named query uses a SQL statement to create a database view containing a subset of the data from the Pinpoint event data lake (or joins data from a previous view with the Amazon DynamoDB tables created above). The views are also created using an AWS Lambda-backed custom resource.

Amazon QuickSight resources

quicksight-resources-diagram

  1. This solution creates several Amazon QuickSight resources to support the deployed dashboard. These include data sources, datasets, refresh schedules, and an analysis. The refresh schedule determines the frequency that Amazon QuickSight queries the Amazon Athena views to update the datasets.
  2. Amazon Athena retrieves live data from the DUE event database data lake and the Athena DynamoDB Connector whenever the Amazon QuickSight refresh schedule runs.

Prerequisites

  • Deploy the Digital User Engagement (DUE) Event Database solution before continuing
    • After you have deployed this solution, gather the following data from the stack’s Resources section.
      • DUES3DataLake: You will need the bucket name
      • PinpointProject: You will need the project Id
      • PinpointEventDatabase: This is the name of the Glue Database. You will only need this if you used something other than the default of due_eventdb

Note: If you are installing the DUE event database for the first time as part of these instructions, your dashboard will not have any data to show until new events start to come in from your Amazon Pinpoint project.

Once you have the DUE event database installed, you are ready to begin your deployment.

Implementation steps

Step 1 – Ensure that Amazon Athena is setup to store query results

Amazon Athena uses workgroups to separate users, teams, applications, or workloads, to set limits on the amount of data each query or the entire workgroup can process, and to track costs. There is a default workgroup called “primary”. However, before you can use this workgroup, it needs to be configured with an Amazon S3 bucket for storing the query results.

  1. If you do not have an existing S3 bucket you can use for the output, create a new Amazon S3 bucket.
  2. Navigate to the Amazon Athena console and from the menu select workgroups > primary > Edit > Query result configuration
    1. Select the Amazon S3 bucket and any specific directory for the Athena query result location

Note: If you choose to use a workgroup other than the default “primary” workgroup, please take note of the workgroup name to be used later.

Step 2 – Enable Amazon QuickSight

Amazon QuickSight offers two types of data sets: Direct Query data sets, which provide real-time access to data sources, and SPICE (Super-fast, Parallel, In-memory Calculation Engine) data sets, which are pre-aggregated and cached for faster performance and scalability and can be refreshed on a schedule.

This solution uses SPICE datasets set to incrementally refresh on a cycle of your choice (Daily or Hourly). If you have already set up Amazon QuickSight, please navigate to Amazon QuickSight in the AWS Console and skip to step 3.

  1. Navigate to Amazon QuickSight on the AWS console
  2. Set up an Amazon QuickSight account by clicking the “Sign up for QuickSight” button.
    1. You will need to set up an Enterprise account for this solution.
    2. To complete the process for the Amazon QuickSight account setup follow the instructions at this link
  3. Ensure you have the Admin Role
    1. Choose the profile icon in the top right corner, select Manage QuickSight and click on Manage Users
    2. Subscription details should display on the screen.
  4. Ensure you have enough SPICE capacity for the datasets
    1. Choose the profile icon, and then select Manage QuickSight
    2. Click on SPICE Capacity
  5. Make sure you have enough SPICE capacity for all three datasets
    1. If you are still in the free tier, you should have enough for initial testing.
    2. You will need about 2 GB of capacity for every 1,000,000 Pinpoint events that will be ingested into SPICE
    3. Note: If you do not have enough SPICE capacity, deployment will fail
  6. Please note the Amazon QuickSight username. You can find this by clicking the profile icon. Example username: Admin/user-name
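The capacity guideline from step 5 (about 2 GB per 1,000,000 Pinpoint events) can be turned into a quick estimate before you deploy. The sketch below is a rough planning aid only: the function name `estimateSpiceCapacityGb` is illustrative, and it assumes that every event in the retention window is ingested into SPICE and that event volume is constant from month to month.

```typescript
// Guideline from this post: about 2 GB of SPICE capacity per 1,000,000 events.
const GB_PER_MILLION_EVENTS = 2;

// Rough estimate of the SPICE capacity needed for a given retention window.
// Assumes a constant monthly event volume (an assumption of this sketch).
function estimateSpiceCapacityGb(monthlyEvents: number, monthsRetained: number): number {
  if (monthlyEvents < 0 || monthsRetained < 0) {
    throw new Error("Event volume and retention must not be negative");
  }
  const totalEvents = monthlyEvents * monthsRetained;
  return (totalEvents / 1000000) * GB_PER_MILLION_EVENTS;
}

// Example: 5 million events per month kept for 6 months.
const estimatedCapacityGb = estimateSpiceCapacityGb(5000000, 6); // 60
```

Compare the result with the available capacity shown under SPICE Capacity in Manage QuickSight, and treat it as an upper-level approximation rather than an exact figure.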

Step 3 – Collect the Amazon QuickSight Service Role name in IAM

For Amazon Athena, Amazon S3, and Athena Query Federation connections, Amazon QuickSight uses the following IAM “consumer” role by default: aws-quicksight-s3-consumers-role-v0

If the “consumer” role is not present, then QuickSight uses the following “service” role instead: aws-quicksight-service-role-v0

The version number at the end of the role could be different in your account. Please validate your role name with the following steps.

  1. Navigate to the Identity and Access Management (IAM) console
  2. Go to Roles and search QuickSight
  3. If the consumer role exists, please note its full name
  4. If you only find the service role, please note its full name

Note: For more details on these service roles, please see the QuickSight User Guide

Step 4 – Prepare the CDK Application

Deploying this solution requires no previous experience with the AWS CDK toolkit. If you would like to familiarize yourself with CDK, the AWS CDK Workshop is a great place to start.

  1. Set up your integrated development environment (IDE)
    1. Option 1 (recommended for first time CDK users): Use AWS Cloud9 – a cloud-based IDE that lets you write, run, and debug your code with just a browser
      1. Navigate to Cloud9 in the AWS console and click the Create Environment button
      2. Provide a descriptive name to your environment (e.g. PinpointAnalysis)
      3. Leave the rest of the values as their default values and click Create
      4. Open the Cloud9 IDE
        1. Node, TypeScript, and CDK should come pre-installed. Test this by running the following commands in your terminal.
          1. node --version
          2. tsc --version
          3. cdk --version
          4. If dependencies are not installed, follow the Step 1 instructions from this article
        2. Using AWS Cloud9 will incur a nominal charge if you are no longer Free Tier eligible. However, using AWS Cloud9 will simplify setup if you do not already have a local environment with AWS CDK and the AWS CLI installed
    2. Option 2: local IDE such as VS Code
      1. Set up CDK locally using this documentation
      2. Install Node, TypeScript and the AWS CLI
        1. Once the CLI is installed, configure your AWS credentials
          1. aws configure
  2. Clone the Pinpoint Dashboard Solution from your terminal by running the command below:
    1. git clone https://github.com/aws-samples/digital-user-engagement-events-dashboards.git
  3. Install the required npm packages from package.json by running the commands below:
    1. cd digital-user-engagement-events-dashboards
    2. npm install

Open the file at digital-user-engagement-events-dashboards/bin/pinpoint-bi-analysis.ts for editing in your IDE.

Edit the following code block for your solution with the information you have gathered in the previous steps. Please reference Table 1 for a description of each editable field.

const resourcePrefix = "pinpoint_analytics_";

...

new MainApp(app, "PinpointAnalytics", {
  env: {
    region: "us-east-1",
  },
  
  //Attributes to change
  dueDbBucketName: "{bucket-name}",
  pinpointProjectId: "{pinpoint-project-id}",
  qsUserName: "{quicksight-username}",

  //Default settings
  athenaWorkGroupName: "primary",
  dataLakeDbName: "due_eventdb",
  dateRangeNumberOfMonths: 6,
  qsUserRegion: "us-east-1", 
  qsDefaultServiceRole: "aws-quicksight-service-role-v0", 
  spiceRefreshInterval: "HOURLY",

  //Constants
  athena_util: athena_util,
  qs_util: qs_util,
});

Table 1: Editable attributes

| Attribute | Definition | Example |
| --- | --- | --- |
| resourcePrefix | The prefix for all created Athena and QuickSight resources | pinpoint_analytics_ |
| region | Where new resources will be deployed. This must be the same region that the DUE event database solution was deployed in | us-east-1 |
| dueDbBucketName | The name of the DUE event database S3 bucket | due-database-xxxxxxxxxxus-east-1 |
| qsUserName | The name of your QuickSight user | Admin/my-user |
| athenaWorkGroupName | The Athena workgroup that was previously configured | primary |
| dataLakeDbName | The Glue database created during the DUE event database solution. By default the database name is “due_eventdb” | due_eventdb |
| dateRangeNumberOfMonths | The number of months of data the Athena views will contain. QuickSight SPICE datasets will contain this many months of data initially and on full refresh. The QuickSight dataset will add new data incrementally without deleting historical data. | 6 |
| qsUserRegion | The region where your QuickSight user exists. By default, new users will be created in us-east-1. You can check your user location with the AWS CLI: aws quicksight list-users --aws-account-id {account-id} --namespace default and look for the region in the arn | us-east-1 |
| qsDefaultServiceRole | The service role collected during Step 3. | aws-quicksight-service-role-v0 |
| spiceRefreshInterval | Options include HOURLY, DAILY. This is how often the SPICE 7-day incremental window will be refreshed | DAILY |
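Before deploying, it can help to check that none of the placeholder values are left in place. The following sketch is a hypothetical helper that is not part of the solution's repository: the `DashboardSettings` interface simply mirrors the editable attributes described above, and `findSettingProblems` returns a list of human-readable problems.

```typescript
// Mirrors the editable attributes described above (illustrative only).
interface DashboardSettings {
  dueDbBucketName: string;
  pinpointProjectId: string;
  qsUserName: string;
  athenaWorkGroupName: string;
  dataLakeDbName: string;
  dateRangeNumberOfMonths: number;
  qsUserRegion: string;
  qsDefaultServiceRole: string;
  spiceRefreshInterval: string;
}

// Returns a list of problems; an empty list means no obvious mistakes were found.
function findSettingProblems(settings: DashboardSettings): string[] {
  const problems: string[] = [];
  const textValues: [string, string][] = [
    ["dueDbBucketName", settings.dueDbBucketName],
    ["pinpointProjectId", settings.pinpointProjectId],
    ["qsUserName", settings.qsUserName],
    ["athenaWorkGroupName", settings.athenaWorkGroupName],
    ["dataLakeDbName", settings.dataLakeDbName],
    ["qsUserRegion", settings.qsUserRegion],
    ["qsDefaultServiceRole", settings.qsDefaultServiceRole],
  ];
  for (const [field, value] of textValues) {
    // Values such as "{bucket-name}" are the unedited placeholders from the template.
    if (value.trim() === "" || /^\{.*\}$/.test(value)) {
      problems.push(`${field} is empty or still contains a placeholder`);
    }
  }
  if (settings.spiceRefreshInterval !== "HOURLY" && settings.spiceRefreshInterval !== "DAILY") {
    problems.push("spiceRefreshInterval must be HOURLY or DAILY");
  }
  if (!Number.isInteger(settings.dateRangeNumberOfMonths) || settings.dateRangeNumberOfMonths < 1) {
    problems.push("dateRangeNumberOfMonths must be a positive whole number");
  }
  return problems;
}

// Example: the unedited template values produce three placeholder problems.
const templateProblems = findSettingProblems({
  dueDbBucketName: "{bucket-name}",
  pinpointProjectId: "{pinpoint-project-id}",
  qsUserName: "{quicksight-username}",
  athenaWorkGroupName: "primary",
  dataLakeDbName: "due_eventdb",
  dateRangeNumberOfMonths: 6,
  qsUserRegion: "us-east-1",
  qsDefaultServiceRole: "aws-quicksight-service-role-v0",
  spiceRefreshInterval: "HOURLY",
});
```

This check only catches obvious omissions; it does not verify that the bucket, project, or QuickSight user actually exist in your account.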

Step 5 – Deploy

  1. CDK requires you to bootstrap in each region of an account. This creates an S3 bucket for deployment. You only need to bootstrap once per account/region
    1. cdk bootstrap
  2. Deploy the application
    1. cdk deploy

Step 6 – Explore

Once your solution deploys, look for the Outputs provided by the CDK CLI. You will find a link to your new Amazon QuickSight Analysis, or Dashboard, as well as a few other key resources. Also, explore the resources sections of the deployed stacks in AWS CloudFormation for a complete list of deployed resources. In AWS CloudFormation, you should see two stacks: the main stack, called PinpointAnalytics, and a nested stack.

Pricing

The total cost to run this solution will depend on several factors. To help explore what the costs might look like for you, please look at the following examples.

All costs outlined below will assume the following:

  • 1 Amazon QuickSight author
  • 100 Amazon QuickSight analysis reader sessions
  • 100k write API actions for all services in AWS account
  • A total of 1k Amazon Pinpoint campaigns, journeys, and segments resulting in 1k Amazon DynamoDB records
  • 5 million monthly Amazon Pinpoint events – email send, email delivered, etc.

Base Costs:

  • 1 Amazon QuickSight author – can edit all Amazon QuickSight resources
    • $24 – There is a 30-day trial for 4 authors in the free tier
  • 100 Amazon QuickSight analysis reader sessions OR 6 readers with unlimited access – max $5 per month per reader
    • $30
  • Total Monthly Costs: $54 / month

Variable Costs:

Even with the assumptions listed above, the costs will vary depending on the chosen data retention window as well as the refresh schedule.

  • SPICE data storage costs.
    • Total size of storage will depend on how many months you choose to display in the dashboard
    • For the above assumptions, the SPICE datasets will cost roughly $3.25 for each month stored in the datasets.
  • Amazon Athena data volume costs
    • With Athena you are charged for the total number of bytes scanned in a query. The solution implements incremental data refreshes in SPICE. Amazon QuickSight will only query and update the most recent 7 days of data during each refresh cycle. This can be adjusted as needed.

Scenario 1 – 6-month data analysis with daily refresh:

  • Fixed costs: $57
  • SPICE datasets: $19.50
  • Athena Scans: $1.25
  • Total Costs: $77.75 / Month

Scenario 2 – 12-month data analysis with daily refresh:

  • Fixed costs: $57
  • SPICE datasets: $39
  • Athena Scans: $1.25
  • Total Costs: $97.25 / Month

Scenario 3 – 12-month data analysis with hourly refresh:

  • Fixed costs: $57
  • SPICE datasets: $39
  • Athena Scans: $27.50
  • Total Costs: $123.50 / Month

Note: Several services were not mentioned in the above scenarios (e.g., DynamoDB, CloudTrail, Lambda, etc.). The limited usage of these services results in a combined cost of less than a few US dollars per month. Even at a greater scale, the costs from these services will not increase in any significant way.
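The three scenarios above follow the same arithmetic: the fixed costs, plus roughly $3.25 of SPICE storage for each month of data kept in the datasets, plus the Athena scan costs for the chosen refresh schedule. The sketch below reproduces that calculation so you can plug in your own retention window. The function name `estimateMonthlyCost` is illustrative, and the $3.25 figure only holds for the assumptions listed earlier (about 5 million events per month).

```typescript
// SPICE storage cost per month of data retained, from the assumptions above.
const SPICE_COST_PER_MONTH_STORED = 3.25;

// Total estimated monthly cost in US dollars for one scenario.
function estimateMonthlyCost(
  fixedCosts: number,
  monthsStored: number,
  athenaScanCosts: number,
): number {
  const spiceCosts = monthsStored * SPICE_COST_PER_MONTH_STORED;
  return fixedCosts + spiceCosts + athenaScanCosts;
}

// The three scenarios from this post, using the figures listed in each one.
const scenario1 = estimateMonthlyCost(57, 6, 1.25); // 77.75
const scenario2 = estimateMonthlyCost(57, 12, 1.25); // 97.25
const scenario3 = estimateMonthlyCost(57, 12, 27.5); // 123.5
```

The Athena scan cost is passed in rather than calculated, because it depends on how many bytes each refresh scans in your own data lake.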

Clean up

  • Delete the CDK stack by running the following from your command line
    • cdk destroy
  • Delete QuickSight account
  • Delete Athena views
    • Go to Glue > Data Catalog > Databases > Your Database Name
    • Delete all Athena views that are no longer needed. Views created by this solution start with the resourcePrefix specified in the bin/pinpoint-bi-analysis.ts file
  • Delete S3 buckets
    • DynamoDB CloudWatch log bucket
    • Dynamo Athena Connector Spill bucket
    • Athena workgroup output bucket
  • Delete DynamoDB tables
    • This solution creates two DynamoDB lookup tables prefixed with the Stack name

Conclusion

In this blog, you have deployed a solution that visualizes Amazon Pinpoint’s email and SMS engagement data using Amazon QuickSight. This solution provides you with an Amazon QuickSight functional dashboard as well as a foundation to design and build new Amazon QuickSight dashboards that meet your bespoke requirements. Parts of the solution, such as the Amazon Athena views, can be ingested with other business intelligence tools that your business might already be using.

Next steps

This solution can be expanded to include Amazon Pinpoint engagement events from other channels such as push notifications, Amazon Connect outbound calls, in-app and custom events. This will require certain updates on the Amazon Athena views and consequently on the Amazon QuickSight dashboards. Furthermore, the Amazon DynamoDB tables store only campaign, journey and segment meta-data. You can extend this part of the solution to include message template meta-data, which will help to analyze performance per message template.

Considerations / Troubleshooting

  • A QuickSight Standard account can be upgraded to an Enterprise account. Enterprise accounts cannot be downgraded to a Standard account.
  • SPICE capacity is allocated separately for each AWS Region. Default SPICE capacity is automatically allocated to your home AWS Region. For each AWS account, SPICE capacity is shared by all the people using QuickSight in a single AWS Region. The other AWS Regions have no SPICE capacity unless you choose to purchase some.
  • The QuickSight Analysis event rates are calculated at the Pinpoint message_id and endpoint_id grain – the click rate will be the same whether a user clicks an email link once or multiple times
  • All timestamps are in UTC. To display data in another timezone, edit the event_timestamp_timezone calculated field in every dataset
  • Data inside Amazon QuickSight will refresh depending on the schedule set during deployment. Current options include hourly and daily refreshes.
  • AWS CloudTrail allows up to 5 trails per Region in each AWS account.
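To illustrate what calculating rates at the message_id and endpoint_id grain means, the sketch below counts each message and endpoint pair once, no matter how many click events it produced. It is an illustration rather than the dashboard's actual calculation: the event type names `_email.delivered` and `_email.click` follow Amazon Pinpoint's email event types, while using delivered messages as the denominator and the function name `clickRate` are assumptions of this sketch.

```typescript
// Subset of a Pinpoint engagement event used in this sketch.
interface EngagementEvent {
  eventType: string;
  messageId: string;
  endpointId: string;
}

// Click rate at the (message_id, endpoint_id) grain: repeated clicks on the
// same message by the same endpoint count once.
// Assumption: delivered messages are used as the denominator.
function clickRate(events: EngagementEvent[]): number {
  const delivered = new Set<string>();
  const clicked = new Set<string>();
  for (const engagement of events) {
    const key = `${engagement.messageId}|${engagement.endpointId}`;
    if (engagement.eventType === "_email.delivered") delivered.add(key);
    if (engagement.eventType === "_email.click") clicked.add(key);
  }
  return delivered.size === 0 ? 0 : clicked.size / delivered.size;
}

// Example: two delivered emails; one recipient clicks three times.
const sampleClickRate = clickRate([
  { eventType: "_email.delivered", messageId: "m1", endpointId: "e1" },
  { eventType: "_email.delivered", messageId: "m2", endpointId: "e2" },
  { eventType: "_email.click", messageId: "m1", endpointId: "e1" },
  { eventType: "_email.click", messageId: "m1", endpointId: "e1" },
  { eventType: "_email.click", messageId: "m1", endpointId: "e1" },
]); // 0.5
```

Counting raw click events instead would give 3 clicks for 2 deliveries in this example, which is why the grain matters when you build your own views.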

About the Authors

Spencer Harrison

Spencer was a 2023 WWPS Solution Architect intern at Amazon Web Services. He will graduate with his Masters of Information Systems Management from Brigham Young University in the spring of 2024. After graduation he is aspiring to find opportunities as a solution architect, cloud engineer, or DevOps engineer. Outside of work, Spencer loves going outdoors to wake surf, downhill ski, and play pickle ball.

Daniel Wells

With over 20 years of IT experience, Daniel has held many architecture and director positions supporting a wide variety of technologies. He currently works as an AWS Solutions Architect supporting Education Technology companies striving to make a difference for learners and educators worldwide. Daniel’s interests outside of work include music, family, health, education and anything that allows him to express himself creatively.

Pavlos Ioannou Katidis

Pavlos Ioannou Katidis is an Amazon Pinpoint and Amazon Simple Email Service Senior Specialist Solutions Architect at AWS. He enjoys diving deep into customers’ technical issues and helping design communication solutions. In his spare time, he enjoys playing tennis, watching crime TV series, playing FPS PC games, and coding personal projects.