Tag Archives: *Post Types

Introducing Our Final AWS Heroes of 2025

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/introducing-our-final-aws-heroes-of-2025/

With AWS re:Invent approaching, we’re celebrating three exceptional AWS Heroes whose diverse journeys and commitment to knowledge sharing are empowering builders worldwide. From advancing women in tech and rural communities to bridging academic and industry expertise and pioneering enterprise AI solutions, these leaders exemplify the innovative spirit that drives our community forward. Their stories showcase how technical excellence, combined with passionate advocacy and mentorship, strengthens the global AWS community.

Dimple Vaghela – Ahmedabad, India

Community Hero Dimple Vaghela leads both the AWS User Group Ahmedabad and AWS User Group Vadodara, where she drives cloud education and technical growth across the region. Her impact spans organizing numerous AWS meetups, workshops, and AWS Community Days that have helped thousands of learners advance their cloud careers. Dimple launched the “Cloud for Her” project to empower girls from rural areas in technology careers and serves as co-organizer of the Women in Tech India User Group. Her exceptional leadership and community contributions were recognized at AWS re:Invent 2024 with the AWS User Group Leader Award in the Ownership category, while she continues building a more inclusive cloud community through speaking, mentoring, and organizing impactful tech events.

Rola Dali – Montreal, Canada

Community Hero Rola Dali is a senior Data, ML, and AI expert specializing in AWS cloud, bringing unique perspective from her PhD in neuroscience and bioinformatics with expertise in human genomics. As co-organizer of the AWS Montreal User Group and a former AWS Community Builder, her commitment to the cloud community earned her the prestigious Golden Jacket recognition in 2024. She actively shapes the tech community by architecting AWS solutions, sharing knowledge through blogs and lectures, and mentoring women entering tech, academics transitioning to industry, and students starting their careers.

Vivek Velso – Toronto, Canada

Machine Learning Hero Vivek Velso is a seasoned technology leader with over 27 years of IT industry experience, specializing in helping organizations modernize their cloud infrastructure for generative AI workloads. His deep AWS expertise earned him the prestigious Golden Jacket award for completing all AWS certifications, and he actively contributes to the AWS Subject Matter Expert (SME) program for multiple certification exams. A former AWS Community Builder and AWS Ambassador, he continues to share his knowledge through more than 100 technical blogs, articles, conference engagements, and AWS livestreams, helping the community confidently embrace cloud innovation.

Learn More

Visit the AWS Heroes webpage if you’d like to learn more about the AWS Heroes program, or to connect with a Hero near you.

Taylor

A Guide to Sending International SMS with US Toll-Free Numbers and AWS End User Messaging

Post Syndicated from Brett Ezell original https://aws.amazon.com/blogs/messaging-and-targeting/a-guide-to-sending-international-sms-with-us-toll-free-numbers-and-aws-end-user-messaging/

AWS End User Messaging now supports international SMS capabilities for US Toll-Free Numbers (TFNs). This new feature allows businesses to use a single US TFN to send SMS messages to over 150 countries, simplifying global outreach. It primarily benefits customers who need to send one-way transactional alerts—like one-time passwords (OTPs) or shipping notifications—and businesses that want to rapidly prototype and test their messaging strategy in new international markets without the overhead of procuring country-specific numbers.

This guide will walk you through the pros and cons of this feature and show you how to enable it and when to use it versus traditional, country-specific sending methods.

What Are International US Toll-Free Numbers?

An International US Toll-Free Number is a standard US TFN that has been enabled with the capability to send SMS messages to destinations outside of the United States. This feature is backward compatible, meaning you can enable it on any new or existing US TFNs in your account.

How to Enable International Sending

There are three primary ways to enable this feature for your US Toll-Free Numbers:

  • Enable international sending when registering a new number in the console.
  • Enable international sending for an existing number in the console.
  • Enable international sending for an existing number via the AWS CLI.

1. Enable When Registering a New US Toll-Free Number (Console)

  • From the AWS End User Messaging console, navigate to Manage SMS
  • From the AWS End User Messaging console, navigate to Configurations > Phone numbers > and select Request originator
  • Step 1: Select country, select the United States (US) as your destination country
  • Under Step 2: Define use case, configure the various options listed for your intended Messaging use case, and select Yes to enable International sending, prior to clicking Next
  • For Step 3: Select originator type, select Toll-free, validate your Resource policy choices, select Next
  • In Step 4: Review and request: Verify the information you entered is correct and select Request. Please note: US Toll-Free Number registration requests can take approximately 15 business days to be approved.

For more information, see Request a phone number in AWS End User Messaging SMS

2. Enable for an Existing US Toll-Free Number (Console or CLI)

If you have already acquired a TFN, you can enable the international sending feature at any time.

Using the AWS Management Console:

  • Navigate to Configurations > Phone numbers > and select an existing Toll-free number
  • Locate the International sending tab and choose Edit settings
  • Check the Enable international sending capability box in your phone number details
    • Save Changes

Using the AWS CLI

The update-phone-number command allows you to modify a phone number’s capabilities, while the describe-phone-numbers command allows you to verify its status.

1. To Enable International Sending:

Use the --international-sending-enabled flag

aws pinpoint-sms-voice-v2 update-phone-number \
    --phone-number-id "phone-a1b2c3d4e5f67890" \
    --international-sending-enabled \
    --region us-east-1

Note: Replace "phone-a1b2c3d4e5f67890" with your actual phone number’s ID

2. To Disable International Sending:

Use the --no-international-sending-enabled flag

aws pinpoint-sms-voice-v2 update-phone-number \
    --phone-number-id "phone-a1b2c3d4e5f67890" \
    --no-international-sending-enabled \
    --region us-east-1

Expected Response (for update-phone-number):

A successful command returns the full JSON object for the phone number. Confirm the change by checking that the InternationalSendingEnabled value is true

{
    "PhoneNumberArn": "arn:aws:sms-voice:us-east-1:111122223333:phone-number/phone-a1b2c3d4e5f67890",
    "PhoneNumberId": "phone-a1b2c3d4e5f67890",
    "PhoneNumber": "+18005550199",
    "Status": "ACTIVE",
    "IsoCountryCode": "US",
    "MessageType": "TRANSACTIONAL",
    "NumberCapabilities": [
        "SMS"
    ],
    "NumberType": "TOLL_FREE",
    "MonthlyLeasingPrice": "2.00",
    "TwoWayEnabled": true,
    "InternationalSendingEnabled": true,
    "CreatedTimestamp": "2025-08-15T10:30:00.123Z"
}

3. To Verify the Current Status:

Use the describe-phone-numbers command with your Phone Number ID to check its current configuration at any time.

aws pinpoint-sms-voice-v2 describe-phone-numbers \
    --phone-number-ids "phone-a1b2c3d4e5f67890" \
    --region us-east-1

Benefits and Limitations

This feature offers a powerful new way to reach a global audience, but it’s important to understand where it shines and what its limitations are.

Benefits (Advantages)

  • Global Reach with a Single Number: Send SMS to over 150 countries using a single, existing US TFN.
  • Simplified Management: Avoid the operational overhead and cost of purchasing and managing a fleet of country-specific phone numbers.
  • Rapid Prototyping and Testing: Quickly test messaging campaigns in new international markets before committing to the best practice approach of acquiring dedicated in-country numbers.
  • Cost Optimization for One-Way Alerts: Provides a cost-effective method for sending high-volume, one-way transactional messages like OTPs, appointment reminders, and shipping notifications globally.

Limitations & Technical Considerations

  • Two-Way SMS is Limited to the US and Canada: Reliable, two-way SMS conversations are only supported for recipients in the United States and Canada.
  • One-Way Only for All Other Countries: For all other destinations, this is a one-way only.
  • Best-Effort Deliverability: Sending outside of the US and Canada is on a “best-effort” basis. The phone number that appears on the recipient’s device may be replaced with a local number or Sender ID, which is why two-way messaging will not work for these destinations. For more details on maximizing delivery, please read A Guide to Optimizing SMS Delivery and Best Practices.
  • Managed Opt-Out is Not Guaranteed Internationally: The automatic STOP reply functionality does not work for destinations outside of the US and Canada. For international recipients, you must provide an alternative opt-out method.
  • Standard Throughput (3 MPS): International TFNs have a default throughput of 3 Message Parts Per Second (MPS). For high-volume, high-throughput campaigns, dedicated country-specific numbers (like short codes) are the recommended best practice.

Understanding the Cost

The pricing for this feature is straightforward:

  • No Additional Monthly Fees: There is no extra charge to enable the international sending capability on your US TFN. You only pay the standard monthly lease for the number itself.
  • Pay-Per-Use Messaging: You are billed for each outbound SMS message at the standard, per-message rate for the destination country.

For a complete and up-to-date list of prices by country, please visit the AWS End User Messaging Pricing page.

When to Use This vs. Country-Specific Numbers

Choosing the right tool depends on your use case. Here’s a simple comparison:

Considerations and Next Steps

Once you have enabled your international sending over US Toll-Free Numbers, you can enhance your messaging strategy by considering resilience, monitoring, and scalability. The following resources provide best practices for enhancing your sending.

Conclusion

International SMS for US Toll-Free Numbers is a powerful strategic tool for businesses looking to simplify their global messaging. It excels at enabling rapid testing in new markets and efficiently delivering one-way transactional alerts across the globe from a single number.

However, it is not a replacement for the best practice of using dedicated, in-country phone numbers when reliable two-way conversations and guaranteed branding are critical to your campaign’s success. By understanding its benefits and limitations, you can strategically use this feature to get going quickly while planning a long-term move towards country-specific codes for your most important markets.

Amazon Nova Multimodal Embeddings: State-of-the-art embedding model for agentic RAG and semantic search

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-nova-multimodal-embeddings-now-available-in-amazon-bedrock/

Today, we’re introducing Amazon Nova Multimodal Embeddings, a state-of-the-art multimodal embedding model for agentic retrieval-augmented generation (RAG) and semantic search applications, available in Amazon Bedrock. It is the first unified embedding model that supports text, documents, images, video, and audio through a single model to enable crossmodal retrieval with leading accuracy.

Embedding models convert textual, visual, and audio inputs into numerical representations called embeddings. These embeddings capture the meaning of the input in a way that AI systems can compare, search, and analyze, powering use cases such as semantic search and RAG.

Organizations are increasingly seeking solutions to unlock insights from the growing volume of unstructured data that is spread across text, image, document, video, and audio content. For example, an organization might have product images, brochures that contain infographics and text, and user-uploaded video clips. Embedding models are able to unlock value from unstructured data, however traditional models are typically specialized to handle one content type. This limitation drives customers to either build complex crossmodal embedding solutions or restrict themselves to use cases focused on a single content type. The problem also applies to mixed-modality content types such as documents with interleaved text and images or video with visual, audio, and textual elements where existing models struggle to capture crossmodal relationships effectively.

Nova Multimodal Embeddings supports a unified semantic space for text, documents, images, video, and audio for use cases such as crossmodal search across mixed-modality content, searching with a reference image, and retrieving visual documents.

Evaluating Amazon Nova Multimodal Embeddings performance
We evaluated the model on a broad range of benchmarks, and it delivers leading accuracy out-of-the-box as described in the following table.

Amazon Nova Embeddings benchmarks

Nova Multimodal Embeddings supports a context length of up to 8K tokens, text in up to 200 languages, and accepts inputs via synchronous and asynchronous APIs. Additionally, it supports segmentation (also known as “chunking”) to partition long-form text, video, or audio content into manageable segments, generating embeddings for each portion. Lastly, the model offers four output embedding dimensions, trained using Matryoshka Representation Learning (MRL) that enables low-latency end-to-end retrieval with minimal accuracy changes.

Let’s see how the new model can be used in practice.

Using Amazon Nova Multimodal Embeddings
Getting started with Nova Multimodal Embeddings follows the same pattern as other models in Amazon Bedrock. The model accepts text, documents, images, video, or audio as input and returns numerical embeddings that you can use for semantic search, similarity comparison, or RAG.

Here’s a practical example using the AWS SDK for Python (Boto3) that shows how to create embeddings from different content types and store them for later retrieval. For simplicity, I’ll use Amazon S3 Vectors, a cost-optimized storage with native support for storing and querying vectors at any scale, to store and search the embeddings.

Let’s start with the fundamentals: converting text into embeddings. This example shows how to transform a simple text description into a numerical representation that captures its semantic meaning. These embeddings can later be compared with embeddings from documents, images, videos, or audio to find related content.

To make the code easy to follow, I’ll show a section of the script at a time. The full script is included at the end of this walkthrough.

import json
import base64
import time
import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

Now we’ll process visual content using the same embedding space using a photo.jpg file in the same folder as the script. This demonstrates the power of multimodality: Nova Multimodal Embeddings is able to capture both textual and visual context into a single embedding that provides enhanced understanding of the document.

Nova Multimodal Embeddings can generate embeddings that are optimized for how they are being used. When indexing for a search or retrieval use case, embeddingPurpose can be set to GENERIC_INDEX. For the query step, embeddingPurpose can be set depending on the type of item to be retrieved. For example, when retrieving documents, embeddingPurpose can be set to DOCUMENT_RETRIEVAL.

# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")

To process video content, I use the asynchronous API. That’s a requirement for videos that are larger than 25MB when encoded as Base64. First, I upload a local video to an S3 bucket in the same AWS Region.

aws s3 cp presentation.mp4 s3://my-video-bucket/videos/

This example shows how to extract embeddings from both visual and audio components of a video file. The segmentation feature breaks longer videos into manageable chunks, making it practical to search through hours of content efficiently.

# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"
S3_EMBEDDING_DESTINATION_URI = "s3://my-embedding-destination-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")

    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")

    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")

With our embeddings generated, we need a place to store and search them efficiently. This example demonstrates setting up a vector store using Amazon S3 Vectors, which provides the infrastructure needed for similarity search at scale. Think of this as creating a searchable index where semantically similar content naturally clusters together. When adding an embedding to the index, I use the metadata to specify the original format and the content being indexed.

# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )

    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")

This final example demonstrates the capability of searching across different content types with a single query, finding the most similar content regardless of whether it originated from text, images, videos, or audio. The distance scores help you understand how closely related the results are to your original query.

# Text to query
query_text = "foundation models"  

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()

Crossmodal search is one of the key advantages of multimodal embeddings. With crossmodal search, you can query with text and find relevant images. You can also search for videos using text descriptions, find audio clips that match certain topics, or discover documents based on their visual and textual content. For your reference, the full script with all previous examples merged together is here:

import json
import base64
import time
import boto3

MODEL_ID = "amazon.nova-2-multimodal-embeddings-v1:0"
EMBEDDING_DIMENSION = 3072

# Initialize Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

print(f"Generating text embedding with {MODEL_ID} ...")

# Text to embed
text = "Amazon Nova is a multimodal foundation model"

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "text": {"truncationMode": "END", "value": text},
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
# Read and encode image
print(f"Generating image embedding with {MODEL_ID} ...")

with open("photo.jpg", "rb") as f:
    image_bytes = base64.b64encode(f.read()).decode("utf-8")

# Create embedding
request_body = {
    "taskType": "SINGLE_EMBEDDING",
    "singleEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "image": {
            "format": "jpeg",
            "source": {"bytes": image_bytes}
        },
    },
}

response = bedrock_runtime.invoke_model(
    body=json.dumps(request_body),
    modelId=MODEL_ID,
    contentType="application/json",
)

# Extract embedding
response_body = json.loads(response["body"].read())
embedding = response_body["embeddings"][0]["embedding"]

print(f"Generated embedding with {len(embedding)} dimensions")
# Initialize Amazon S3 client
s3 = boto3.client("s3", region_name="us-east-1")

print(f"Generating video embedding with {MODEL_ID} ...")

# Amazon S3 URIs
S3_VIDEO_URI = "s3://my-video-bucket/videos/presentation.mp4"

# Amazon S3 output bucket and location
S3_EMBEDDING_DESTINATION_URI = "s3://my-video-bucket/embeddings-output/"

# Create async embedding job for video with audio
model_input = {
    "taskType": "SEGMENTED_EMBEDDING",
    "segmentedEmbeddingParams": {
        "embeddingPurpose": "GENERIC_INDEX",
        "embeddingDimension": EMBEDDING_DIMENSION,
        "video": {
            "format": "mp4",
            "embeddingMode": "AUDIO_VIDEO_COMBINED",
            "source": {
                "s3Location": {"uri": S3_VIDEO_URI}
            },
            "segmentationConfig": {
                "durationSeconds": 15  # Segment into 15-second chunks
            },
        },
    },
}

response = bedrock_runtime.start_async_invoke(
    modelId=MODEL_ID,
    modelInput=model_input,
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": S3_EMBEDDING_DESTINATION_URI
        }
    },
)

invocation_arn = response["invocationArn"]
print(f"Async job started: {invocation_arn}")

# Poll until job completes
print("\nPolling for job completion...")
while True:
    job = bedrock_runtime.get_async_invoke(invocationArn=invocation_arn)
    status = job["status"]
    print(f"Status: {status}")

    if status != "InProgress":
        break
    time.sleep(15)

# Check if job completed successfully
if status == "Completed":
    output_s3_uri = job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"]
    print(f"\nSuccess! Embeddings at: {output_s3_uri}")

    # Parse S3 URI to get bucket and prefix
    s3_uri_parts = output_s3_uri[5:].split("/", 1)  # Remove "s3://" prefix
    bucket = s3_uri_parts[0]
    prefix = s3_uri_parts[1] if len(s3_uri_parts) > 1 else ""

    # AUDIO_VIDEO_COMBINED mode outputs to embedding-audio-video.jsonl
    # The output_s3_uri already includes the job ID, so just append the filename
    embeddings_key = f"{prefix}/embedding-audio-video.jsonl".lstrip("/")

    print(f"Reading embeddings from: s3://{bucket}/{embeddings_key}")

    # Read and parse JSONL file
    response = s3.get_object(Bucket=bucket, Key=embeddings_key)
    content = response['Body'].read().decode('utf-8')

    embeddings = []
    for line in content.strip().split('\n'):
        if line:
            embeddings.append(json.loads(line))

    print(f"\nFound {len(embeddings)} video segments:")
    for i, segment in enumerate(embeddings):
        print(f"  Segment {i}: {segment.get('startTime', 0):.1f}s - {segment.get('endTime', 0):.1f}s")
        print(f"    Embedding dimension: {len(segment.get('embedding', []))}")
else:
    print(f"\nJob failed: {job.get('failureMessage', 'Unknown error')}")
# Initialize Amazon S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Configuration
VECTOR_BUCKET = "my-vector-store"
INDEX_NAME = "embeddings"

# Create vector bucket and index (if they don't exist)
try:
    s3vectors.get_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Vector bucket {VECTOR_BUCKET} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    print(f"Created vector bucket: {VECTOR_BUCKET}")

try:
    s3vectors.get_index(vectorBucketName=VECTOR_BUCKET, indexName=INDEX_NAME)
    print(f"Vector index {INDEX_NAME} already exists")
except s3vectors.exceptions.NotFoundException:
    s3vectors.create_index(
        vectorBucketName=VECTOR_BUCKET,
        indexName=INDEX_NAME,
        dimension=EMBEDDING_DIMENSION,
        dataType="float32",
        distanceMetric="cosine"
    )
    print(f"Created index: {INDEX_NAME}")

texts = [
    "Machine learning on AWS",
    "Amazon Bedrock provides foundation models",
    "S3 Vectors enables semantic search"
]

print(f"\nGenerating embeddings for {len(texts)} texts...")

# Generate embeddings using Amazon Nova for each text
vectors = []
for text in texts:
    response = bedrock_runtime.invoke_model(
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingPurpose": "GENERIC_INDEX",
                "embeddingDimension": EMBEDDING_DIMENSION,
                "text": {"truncationMode": "END", "value": text}
            }
        }),
        modelId=MODEL_ID,
        accept="application/json",
        contentType="application/json"
    )

    response_body = json.loads(response["body"].read())
    embedding = response_body["embeddings"][0]["embedding"]

    vectors.append({
        "key": f"text:{text[:50]}",  # Unique identifier
        "data": {"float32": embedding},
        "metadata": {"type": "text", "content": text}
    })
    print(f"  ✓ Generated embedding for: {text}")

# Add all vectors to store in a single call
s3vectors.put_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    vectors=vectors
)

print(f"\nSuccessfully added {len(vectors)} vectors to the store in one put_vectors call!")
# Text to query
query_text = "foundation models"  

print(f"\nGenerating embeddings for query '{query_text}' ...")

# Generate embeddings
response = bedrock_runtime.invoke_model(
    body=json.dumps({
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "GENERIC_RETRIEVAL",
            "embeddingDimension": EMBEDDING_DIMENSION,
            "text": {"truncationMode": "END", "value": query_text}
        }
    }),
    modelId=MODEL_ID,
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response["body"].read())
query_embedding = response_body["embeddings"][0]["embedding"]

print(f"Searching for similar embeddings...\n")

# Search for top 5 most similar vectors
response = s3vectors.query_vectors(
    vectorBucketName=VECTOR_BUCKET,
    indexName=INDEX_NAME,
    queryVector={"float32": query_embedding},
    topK=5,
    returnDistance=True,
    returnMetadata=True
)

# Display results
print(f"Found {len(response['vectors'])} results:\n")
for i, result in enumerate(response["vectors"], 1):
    print(f"{i}. {result['key']}")
    print(f"   Distance: {result['distance']:.4f}")
    if result.get("metadata"):
        print(f"   Metadata: {result['metadata']}")
    print()

For production applications, embeddings can be stored in any vector database. Amazon OpenSearch Service offers native integration with Nova Multimodal Embeddings at launch, making it straightforward to build scalable search applications. As shown in the examples before, Amazon S3 Vectors provides a simple way to store and query embeddings with your application data.

Things to know
Nova Multimodal Embeddings offers four output dimension options: 3,072, 1,024, 384, and 256. Larger dimensions provide more detailed representations but require more storage and computation. Smaller dimensions offer a practical balance between retrieval performance and resource efficiency. This flexibility helps you optimize for your specific application and cost requirements.

The model handles substantial context lengths. For text inputs, it can process up to 8,192 tokens at once. Video and audio inputs support segments of up to 30 seconds, and the model can segment longer files. This segmentation capability is particularly useful when working with large media files—the model splits them into manageable pieces and creates embeddings for each segment.

The model includes responsible AI features built into Amazon Bedrock. Content submitted for embedding goes through Amazon Bedrock content safety filters, and the model includes fairness measures to reduce bias.

As described in the code examples, the model can be invoked through both synchronous and asynchronous APIs. The synchronous API works well for real-time applications where you need immediate responses, such as processing user queries in a search interface. The asynchronous API handles latency insensitive workloads more efficiently, making it suitable for processing large content such as videos.

Availability and pricing
Amazon Nova Multimodal Embeddings is available today in Amazon Bedrock in the US East (N. Virginia) AWS Region. For detailed pricing information, visit the Amazon Bedrock pricing page.

To learn more, see the Amazon Nova User Guide for comprehensive documentation and the Amazon Nova model cookbook on GitHub for practical code examples.

If you’re using an AI–powered assistant for software development such as Amazon Q Developer or Kiro, you can set up the AWS API MCP Server to help the AI assistants interact with AWS services and resources and the AWS Knowledge MCP Server to provide up-to-date documentation, code samples, knowledge about the regional availability of AWS APIs and CloudFormation resources.

Start building multimodal AI-powered applications with Nova Multimodal Embeddings today, and share your feedback through AWS re:Post for Amazon Bedrock or your usual AWS Support contacts.

Danilo

Automate the Creation & Rotation of Amazon Simple Email Service SMTP Credentials

Post Syndicated from Zip Zieper original https://aws.amazon.com/blogs/messaging-and-targeting/automate-the-creation-rotation-of-amazon-simple-email-service-smtp-credentials/

[Amazon Simple Email Service] provides a secure email solution that scales with your business needs. Unfortunately, all email systems, including Amazon SES, remain the primary target for spammers and bad actors due to email’s widespread use and accessibility.

While SES offers powerful features for application-based email sending, its SMTP credentials require careful management to prevent unauthorized access. Compromised credentials enable bad actors to send malicious emails through legitimate domains, which can bypass security filters and damage sender reputation.

To protect your SES implementation, you must encrypt SMTP credentials during storage and transmission. Additionally, implementing role-based access controls helps restrict credential access to authorized personnel only. Regular credential rotation at fixed intervals, typically every 90 days, minimizes potential security breaches. Automating this rotation process eliminates human error and ensures consistent security practices across your organization.

Problem Statement

Imagine you are the administrator for a large financial institution. You recently began using Amazon SES to send email from two dozen on-premises servers. Your email servers authenticate with SES using SMTP credentials to access the SES SMTP interface. Your organization’s security policies mandate regular credential rotation, including the ability to rotate them on-demand. How can you automate SMTP credential rotation such that you can meet your organization’s security policies?

This blog post will present two solutions that automate the secure management and automatic rotation of SMTP credentials for Amazon SES. Each will help enhance email security, comply with regulations, and minimize operational overhead.

Both solutions provide SES customers who use SMTP with additional tools to improve email security, ensure compliance, and reduce operational overhead. You can deploy the option that best suites your needs by following the guidance in this blog post.

If your environment supports automated rotation, AWS Systems Manager Documents (SSM Documents) can help by providing pre-defined or custom automation workflows for securely managing secrets rotation, deploy Option 1.

If your environment does not support automated rotation, you can still implement an auditable, managed rotation solution by storing your secrets in AWS Systems Manager Parameter Store by deploying Option 2.

As a pay-per-use platform, the underlying AWS services used in either deployment option will only charge you for the resources that you actually consume. You can leverage the AWS Pricing Calculator to estimate the run-time costs for your specific workload. Alternatively, you can work directly with your AWS account team to understand the pricing for these solutions.

Getting SES SMTP Credentials

To send emails through the Amazon SES SMTP interface, email servers must first authenticate with SES using dedicated SES SMTP credentials. Typically, a systems administrator logs into the AWS SES console, clicks the Create SMTP Credentials button, and navigates to the AWS Identity and Access Management] (IAM) console. There, the administrator creates an IAM user with permissions for SES. The administrator then uses the IAM user’s secret access key to generate the SES SMTP password, which they use to configure their email servers or SMTP-enabled applications for use with SES.

Multiple SMTP Credentials

The SES SMTP interface authenticates requests using an SMTP credential derived from an IAM user’s access key ID and secret access key. Since temporary access keys cannot be used to derive SES SMTP credentials, you must deploy and regularly rotate a long-lived key.

While the manual process of creating SES SMTP credentials works for a small number of credentials, it becomes cumbersome for customers with numerous email servers or strict password rotation policies. These customers may find the automated credential rotation mechanisms described in the following solutions better suited to their production needs.

Option 1 – Fully Automated Credential Rotation:

The fully automated version of this solution uses a custom Lambda function to create an SMTP password, which is stored in AWS Secrets Manager. AWS Secrets Manager’s built-in rotation feature then triggers the rotation of SES SMTP credentials. AWS Systems Manager Documents utilize AWS Systems Manager Agents to automatically make the changes to the authentication configuration on email servers.

The key advantages of using AWS Systems Manager to make the email server configuration changes include:

  • Ability to deploy changes to on-premises and Amazon EC2 hosts, allowing rotation of secrets across a hybrid estate.
    Customization of the document to specific email software configurations.
    Targeting the secret (SMTP credential) rotation document on all email servers based on tags.

Let’s dive deep into Option 1 – Fully Automated Credential Rotation.

Option 1 - Fully Automated Credential Rotation

How Option 1 works:

Refer to the image above for the workflow:

  1. AWS Secrets Manager initiates a rotation request, either on a schedule or via an authorized user’s request, triggering the “rotation Lambda” to rotate the SES SMTP credentials.
  2. The SES Secret Rotation Function Lambda (see figure x above):
    • a. Creates a new IAM secret access key for the designated SES IAM user, derives the new SES SMTP password, and stores it in AWS Secrets Manager.
    • b. Connects to SES to verify the new SMTP password can authenticate.
    • c. Initiates an AWS Systems Manager Run Command to update the new SMTP password on target email servers using a pre-configured Systems Manager Document.
    • d. (and e.) Monitors the status of the Systems Manager Document execution until all updates complete successfully
    • f. Deletes the old IAM access and secret access keys.

With this fully automated solution, SES SMTP credentials can be rotated on a schedule or triggered manually, with no impact to email service uptime.

Deploying the Fully Automated Solution in Your AWS Account (Option 1)

Prerequisites for the Fully Automated Solution

  1. AWS Account Access, typically with admin-level permission to allow for the deployment.
  2. Your preferred IDE with AWS CLI version 2 and named profile setup.
    • Alternatively, you can use the AWS CLI from the AWS CloudShell in your browser.
  3. Clone the Github repository (for this solution, you only need the README.md and sesautomaticrotation.yaml files found in /ses-credential-rotation/automatic-rotation).
    • git clone -b ses-credential-rotation https://github.com/aws-samples/serverless-mail.git
    • Note – We follow the principles of least privilege in this solution. The CloudFormation templates we’ve supplied require you to specify an identity, or configuration-set resource to use in the SES sending operation. You can find guidance on defining these values at Actions, resources, and condition keys for Amazon SES. Additionally, we’ve limited the IAM User to the ses:SendRawEmail action, which you can adjust as appropriate).
  4. Console access to your AWS SES account that is properly configured to send emails via at least one verified identity.
  5. Target email server(s) properly configured to send email via SES using SES SMTP authentication.
    • The AWS Systems Manager agent(s) must be correctly installed and configured on your target email server(s) as detailed in Setting up AWS Systems Manager.
    • The target email servers must be decorated with the tag (“SSMServerTag“) and value (“SSMServerTagValue“). These values allow the Systems Manager Document to identify them.
      • We use the tag “EmailServer” and the value “True” in our example, but you can use any tag and value that you wish).
  6. An email address (or list) to receive SMTP rotation notifications.
  7. Console access to your AWS Secrets Manager.
  8. Console access to your AWS Systems Manager.

Deployment Steps

  1. Clone the GitHub repository to your IDE
    • If using AWS CloudShell, ensure you are in the same region as your AWS Systems and Secrets Manager
    • run: git clone https://github.com/aws-samples/serverless-mail.git
    • Navigate to the directory ses-credential-rotation/automatic-rotation
  2. Follow the steps in README.md to
    • Create a S3 bucket to deploy the CloudFormation Template.
    • Package the Lambda functions and upload them to Amazon S3.
    • Deploy the Cloud Formation Template.
    • Update the appropriate AWS Systems Manager sample document created by the CloudFormation Template to reflect your email server environments. These can be found in the AWS Systems Manager console under Documents > Owned by me
      • The ExampleWindowsIISSMTPSESpasswordrotator sample provides an example for Microsoft Windows hosts using the runPowerShellScript action to update the server’s SMTP credentials.
      • The ExamplePostfixSESpasswordrotator sample provides an example for Linux hosts using the runShellScript action to update the server’s SMTP credentials.

Testing Option 1 – Fully Automated Credential Rotation

To test the Fully Automated Credential Rotation solution, have Secrets Manager perform an immediate rotation by following these steps:

  1. Open AWS Secrets Manager console
  2. Locate the secret SESSendSecret
  3. Select the Rotation tab
  4. Click the “Rotate Secret immediately” button.

You can track the progress of the rotation by locating the logs of the Lambda that is deployed to manage the rotation.

  1. In the AWS console, go to CloudFormationStack’s Resources tab
  2. Find the LogicalID = SESSecretRotationFunction
  3. Click the PhysicalID link to open the Lambda
  4. Under the Monitor Tab, select the “View CloudWatch logs” button in the top right
  5. The logs should show the rotation flow through 4 stages below (more details of each stage are available here):
    1. create_secret
    2. set_secret
    3. test_secret
    4. finish_secret

Option 2 – Partially Automated Credential Rotation:

The partially automated version uses a custom AWS Lambda function to create an SMTP password, which is stored in AWS Systems Manager Parameter Store. This solution simplifies credential rotation, where manual changes must be conducted by support staff. By wrapping the manual change process with AWS Step Functions, you can ensure a robust and auditable process to regularly rotate the SES SMTP credential.

How Option 2 works:

  1. The credential rotation AWS Step Function creates a new SES SMTP credential and updates it in AWS Systems Manager Parameter Store.
  2. It retrieves a list of servers from an Amazon DynamoDB table and launches a manual confirmation AWS Step Function execution for each server to initiate and track the manual step.
  3. The manual confirmation AWS Step Function emails the designated address, requesting support staff to arrange the rotation. The email includes a link specific to that server.
  4. The person completing the manual change confirms back to the AWS Step Function via the link that the rotation is complete.
  5. Once the rotation on a server is confirmed, the manual confirmation AWS Step Function for that server is marked as complete.
  6. After all server rotations are complete, the credential rotation AWS Step Function continues, disabling the old SES SMTP credential and deleting it after a few days.

AWS Step Function executions can last up to 365 days, providing sufficient time for the manual rotation and confirmation.

The screenshot below shows a graphical representation of the credential rotation AWS Step Function execution status, providing a real-time view of the rotation progress.

SMTP credential rotation AWS Step Function

You can also track the status of individual servers via the manual rotation step function execution list.

SMTP manual rotation step function execution list

The partially automated solution for rotating Amazon SES SMTP credentials is illustrated and detailed below:

Option 2 - partially automated solution

Refer to the image above for the option 2 workflow:

  1. EventBridge Scheduler Trigger: An EventBridge scheduler rule triggers a custom Starter Function Lambda (SF Lambda) on the last day of every 3rd month (this can be adjusted to suit your needs in the CloudFormation template).
  2. Credential Rotation Step Function: The Starter Function Lambda triggers the Credential Rotation AWS Step Function, providing a clearly defined name to facilitate auditing (“password-rotation-dd-mm-yy“).
  3. Credential Rotation Step Function Actions:
    1. Creates a new IAM (Identity and Access Management) secret access key for the SES IAM user.
    2. Triggers the SMTP Credential Generator Lambda to derive the SES SMTP password from the newly created IAM secret access key (using the algorithm provided in the SES documentation.
    3. Stores the new SES SMTP credential in AWS Systems Manager Parameter Store.
    4. Reads a list of servers that are utilizing this credential from a DynamoDB table.
  4. Manual Confirmation Step Function:
    1. For each server, a manual confirmation AWS Step Function is triggered, sending a message on the Amazon Simple Notification Service (SNS) topic.
    2. The SNS notification prompts the server administrator via email to manually rotate the SMTP credentials on the on-premises email server.
    3. The server administrator uses a link in the email to confirm the credential has been rotated and tested on the server.
    4. The link triggers the Confirmation Lambda exposed via API Gateway, which marks the ManualConfirmation step function as complete.
  5. Credential Rotation Completion: The CredentialRotation step function waits until all manual confirmation step functions have completed before proceeding.
  6. Old IAM Access Key Deletion: Once confirmation has been received for all servers, the step function deletes the old IAM access key.

Deployment

To deploy the partially automated solution in your AWS account, you will need the following prerequisites:

Prerequisites for the Partially Automated Solution

  1. AWS Account Access, typically with admin-level permission to allow for the deployment.
  2. Your preferred IDE with AWS CLI version 2 and named profile setup. Alternatively, you can use the AWS CLI from the AWS CloudShell in your browser.
  3. SES enabled, configured, and properly sending emails.
  4. External email server(s) currently configured to use SES with SMTP.
  5. Administrator email address to receive notifications.
  6. AWS Secrets Manager and AWS Systems Manager set up.
  7. AWS Systems Manager agent(s) correctly installed and configured on your target email servers as detailed in Setting up AWS Systems Manager.
  8. Amazon EC2 instance with Postfix configured to send emails through SES
  9. Target email servers must be decorated with a tag (“SSMServerTag“) and value (“SSMServerTagValue“) that will be used to identify them by the Systems Manager Document (we used “server” and “email”)
  10. AWS Parameter Store and AWS Step Functions.

Once you have the prerequisites in place, follow the instructions in the GitHub project.

Conclusion

Implementing an automated credential rotation process for Amazon SES SMTP enhances security and compliance, streamlines operations, and reduces the risk of downtime and human error. By leveraging AWS Secrets Manager and AWS Systems Manager (option 1) or AWS Systems Manager Parameter Store and Step Functions (option 2), organizations can centralize SES SMTP credential management, maintain an audit trail, and quickly update email application servers with new SMTP credentials.

Need additional guidance?

Instant Well-Architected CDK Resources with Solutions Constructs Factories

Post Syndicated from Biff Gaut original https://aws.amazon.com/blogs/devops/instant-well-architected-cdk-resources-with-solutions-constructs-factories/

For several years, AWS Solutions Constructs have helped thousands of AWS Cloud Development Kit (CDK) users accelerate their creation of well-architected workloads by providing small, composable patterns linking two or more AWS services, such as an Amazon S3 bucket triggering an AWS Lambda function. Over this time, customers with use cases that don’t match an existing Solutions Construct have expressed a desire to create individual well-architected resources directly. Solutions Constructs Factories allow clients to create well-architected individual resources using the same internal code that Solutions Constructs use when composing larger patterns. While deploying a single AWS resource using the AWS CDK is often a trivial task, deploying that resource following all the best practices requires more knowledge and effort. For instance, a properly configured S3 bucket should include versioning, encryption, access logging, bucket policies allowing only TLS calls, and lifecycle policies. The AWS Solutions Constructs S3BucketFactory()method implements a fully well-architected CDK S3 bucket with all the best practices configured, including an additional bucket to hold S3 Access Logs. At the same time, every aspect of each resource can be overridden to satisfy any unique requirements for your use case. Best of all, the resources created are all standard CDK L2 constructs that integrate with any CDK stack.

Solutions Constructs Factories are a new feature of AWS Solutions Constructs, a library of common architectural patterns built on top of the AWS CDK. These multi-service patterns allow you to deploy multiple resources with a single CDK Construct. Solutions Constructs follow best practices by default – both for the configuration of the individual resources as well as their interaction. While each Solutions Construct implements a very small architectural pattern, they are designed so that multiple constructs can be combined by sharing a common resource. For instance, a Solutions Construct that implements an S3 bucket invoking a Lambda function can be deployed along with a second Solutions Construct that deploys a Lambda function that writes to an Amazon SQS queue by sharing the same Lambda function between the two constructs. There are currently over 70 Solutions Constructs available, covering Amazon API Gateway, Amazon Simple Storage Service (S3), Amazon SNS, Amazon Simple Queue Service (SQS), AWS Lambda, AWS Secrets Manager, IoT, Amazon ElasticCache, AWS Step Functions, AWS Fargate, AWS WAF, Application Load Balancers, Amazon Kinesis, Amazon Data Firehose, Amazon Sagemaker, AWS Elemental Mediastore, and more.

The AWS CDK provides a programming model above the static AWS CloudFormation template, representing all AWS resources with instantiated objects in a high-level programming language. When you instantiate CDK objects in your Typescript (or other language) code, the CDK “compiles” those objects into a JSON template, then deploys that template with CloudFormation. The use of programming languages such as Typescript or Python rather than the declarative YAML or JSON allows much more flexibility in defining your infrastructure. If you are not yet familiar with the AWS CDK you should check it out.

Visual representation of how AWS Solutions Constructs build abstractions upon the AWS CDK, which is then compiled into static CloudFormation templates. Solutions Constructs Factories are a feature found within AWS Solutions Constructs

Infrastructure as Code Abstraction Layers

How Constructs Factories Work

This demo will guide you through building a small CDK stack that uses one of the new Solutions Constructs Factories. The app will use a factory to create an S3 bucket that fully implements all the recommended best practices with a single function call.

First, create a Typescript CDK app and install the Solutions Constructs libraries:

md factories-blog
cd factories-blog
cdk init -l=typescript
npm install @aws-solutions-constructs/aws-constructs-factories

Next, open lib/factories-blog-stack.ts, delete the comments and add the indicated new lines of code:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
// Add this import statement:
import { ConstructsFactories } from '@aws-solutions-constructs/aws-constructs-factories';

export class FactoriesBlogStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Add these two lines
    const factories = new ConstructsFactories(this, 'constructs-factories');
    const response = factories.s3BucketFactory('default-bucket', {});
  }
}

Now build and deploy the app:

npm run build
cdk deploy

A key consideration for Constructs Factories is that the resources they create integrate seamlessly with the rest of your CDK stack. That’s why factories are implemented as methods that return standard AWS CDK L2 constructs rather than L3 constructs. Instantiating the ConstructsFactories object gives you access to the factory methods, but on its own adds nothing to your stack. In this case you called s3BucketFactory(), passing two arguments: a string id upon which all resource names will be based, and an empty S3BucketFactoryProps object. The method added 4 resources to your CDK stack. You can explore everything this app just deployed either in the console or through the synthesized template in ./cdk.out. Here’s a synopsis of what you’ll find:

  • An S3 bucket for storage (the goal of the call), with the following configuration:
    • AES-256 server side encryption
    • All public ACLs and policies blocked
    • Versioning enabled
    • A bucket policy blocking all access not via aws:SecureTransport (TSL)
    • Access logging enabled to a separate bucket
    • A lifecycle rule transitioning non-current versions to Glacier after 90 days
  • A second S3 bucket to contain the access logs, with a similar configuration:
    • AES-256 server side encryption
    • All public ACLs and policies blocked
    • Versioning enabled
    • A bucket policy blocking all access not via aws:SecureTransport (TSL)

A single call has deployed two resources with all best practices configured by default! As most workloads have some unique requirements, all of these resources can be customized to integrate into the overall stack. Diving a little deeper, recall the empty S3BucketFactoryProps object you passed to the factory call. The S3BucketFactoryProps argument allows you to send any properties to override the default settings for the main bucket and the logging bucket. For instance, if you need the bucket to send notifications to EventBridge you can enable that through the properties:

const response = factories.s3BucketFactory('customized-bucket', {
  bucketProps: {
    eventBridgeEnabled:true
  }
});

Second, the factory call returned an S3FactoryResponse object, which exposes standard CDK L2 Bucket constructs for both the main and logging buckets (you can also access the new BucketPolicies through these Bucket constructs).

const arn = response.s3Bucket.bucketArn;

Available Factories

The first release of Solutions Constructs Factories focuses on services where configuring a resource according to best practices requires the launch of additional resources, or other additional complexity. We currently support 3 different resources:

S3 Buckets

The s3BucketFactory() function will create an S3 bucket with:

  • TLS access required
  • Access logging enabled (to an additional log bucket by default)
  • Versioning enabled
  • All public ACLs and policies blocked
  • AWS managed server side encryption
  • Lifecycle policies that transition non-current versions to S3 Glacier after 90 days

Step Functions State Machines

The stateMachineFactory() function will create a Step Functions state machine with:

  • CloudWatch logs names prefixed with /aws/vendedlogs/ and unique to your stack, avoiding any issues with resource policy size without risk of name collisions in your account.
  • CloudWatch alarms for:
    • 1 or more failed executions
    • 1 or more throttled executions
    • 1 or more aborted executions

SQS Queues

The sqsQueueFactory() function will create an SQS queue with:

  • KMS managed encryption
  • A DLQ configured to accept messages that can’t be processed
  • A resource policy ensuring that only the queue owner can perform operations on the queue

Conclusion

A single call to an AWS Solutions Constructs Factory deploys a resource with all best practice settings, as well as any additional infrastructure required to make the resource part of a well-architected application. Since the factories return standard AWS CDK L2 constructs, you can use them in any new or existing CDK stack. The first Solutions Constructs Factories were launched earlier this year and more will be added soon. If you have factories you’d like to see implemented, enter an Issue on the Solutions Constructs github page. The next time you launch any of these three resources in a stack, try using a Constructs factory – you may never directly instantiate a CDK construct again.

New: Zone Groups for Availability Zones in AWS Regions

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/new-zone-groups-for-availability-zones-in-aws-regions/

This blog post is written by Pranav Chachra, Principal Product Manager, AWS.

In 2019, AWS introduced Zone Groups for AWS Local Zones. Today, we’re announcing that we are working on extending the Zone Group construct to Availability Zones (AZs).

Zone Groups were launched to help users of AWS Local Zones identify related groups of Local Zones that reside in the same geography. For example, the two interconnected Local Zones in Los Angeles (us-west-2-lax-1a and us-west-2-lax-1b) make up the us-west-2-lax-1 Zone Group. These Zone Groups are used for opting in to the Local Zones, as shown in the following image.

In October 2024, we will extend the Zone Group construct to AZs with a consistent naming format across all AWS Regions. This update will help you differentiate group of Local Zones and AZs based on their unique GroupName, improving manageability and clarity. For example, the Zone Group of AZs in the US West (Oregon) Region will now be referred to as us-west-2-zg-1, where us-west-2 indicates the Region, and zg-1 indicates it is the group of AZs in the Region. This new identifier (such as us-west-2-zg-1) will replace the current naming (such as us-west-2) for the GroupName available through the DescribeAvailabilityZones API. The names for the Local Zones Groups (such as us-west-2-lax-1) will remain the same as before.

For example, when you list your AZs, you see the current naming (such as us-west-2) for the GroupName of AZs in the US West (Oregon) Region:

The new identifier (such as us-west-2-zg-1) will replace the current naming for the GroupName of AZs, as shown in the following image.

At AWS we remain committed to responding to customer feedback and we continuously improve our services based upon that feedback. If you have questions or need further assistance, contact AWS Support on the community forums and through AWS Support.

Implementing Identity-Aware Sessions with Amazon Q Developer

Post Syndicated from Ryan Yanchuleff original https://aws.amazon.com/blogs/devops/implementing-identity-aware-sessions-with-amazon-q-developer/

“Be yourself; everyone else is already taken.”
-Oscar Wilde

In the real world as in the world of technology and authentication, the ability to understand who we are is important on many levels. In this blog post, we’ll look at how the ability to uniquely identify ourselves in the AWS console can lead to a better overall experience, particularly when using Amazon Q Developer. We explore the features that become available to us when Q Developer can uniquely identify our sessions in the console and match them with our subscriptions and resources. We’ll look at how we can accomplish this goal using identity-aware sessions, a capability now available with AWS IAM Identity Center. Finally, we’ll walk through the steps necessary to enable it in your AWS Organization today.

Amazon Q Developer is a generative AI-powered assistant for software development. Accessible from multiple contexts including the IDE, the command line, and the AWS Management Console, the service offers two different pricing tiers: free and Pro. In this post, we’ll explore how to use Q Developer Pro in the AWS Console with identity-aware sessions. We’ll also explore the recently introduced ability to chat about AWS account resources within the Q Developer Chat window in the AWS Console to inquire about resources in an AWS account when identity-aware sessions are enabled.

Connecting your corporate source of identities to IAM Identity Center creates a shared record of your workforce and users’ group associations. This allows AWS applications to interact with one another efficiently on behalf of your users because they all reference this shared record of attributes. As a result, users have a consistent, continuous experience across AWS applications. Once your source of identities is connected to IAM Identity Center, your identity provider administrator can decide which users and groups will be synchronized with Identity Center. Your Amazon Q Developer administrator sees these synchronized users and groups only within Amazon Q Developer and can assign Q Developer Pro subscriptions to them.

User Identity in the AWS Console

To access the AWS Console, you must first obtain an IAM session – most commonly by using Identity Center Access Portal, IAM federation, or IAM (or root) users. Users can also use IAM Identity Center or a third party federated login mechanism. In this post, we’ll be using Microsoft Entra ID, but many other providers are available. Of all these options, however, only logging in with IAM Identity Center provides us with enough context to uniquely identity the user automatically by default. Identity-aware sessions will make this work.

Using Q Developer with a direct IAM Identity Center connection

Figure 1: Logging into an AWS account via the IAM Identity Center enables Q Developer to match the user with an active Pro subscription.

To meet customers where they are and allow them to build on their existing configurations, IAM Identity Center includes a mechanism that allows users to obtain an identity-aware session to access Q in the Console, regardless of how they originally logged in to the Console.

Let’s look at a real-world example to explore how this might work. Let’s assume our organization is currently using Microsoft Entra ID alongside AWS Organizations to federate our users into AWS accounts. This grants them access to the AWS console for accounts in our AWS Organization and enables our users to be assigned IAM roles and permissions. While secure, this access method does not allow Q Developer to easily associate the user with their Entra ID identity and to match them to a Q Developer subscription.

Using Q Developer with a 3P Identity Provider and IAM Identity Center

Figure 2: Using Entra ID, the user is federated into the AWS account and assumes an IAM role without further context in the console. Q Developer can obtain that context by authenticating the user with identity-aware sessions. This process is first attempted manually before prompting the user for credentials

To provide identity-aware sessions to these users, we can enable IAM Identity Center for the Organization and integrate it with our Entra ID instance. This allows us to sync our users and groups from Entra ID and assign them to subscriptions in our AWS Applications such as Amazon Q Developer.

We then go one step further and enable identity-aware sessions for our Identity Center instance. Identity-aware sessions allow Amazon Q to access user’s unique identifier in Identity Center so that it can then look up a user’s subscription and chat history. When the user opens the Console Chat, Q Developer checks whether the current IAM session already includes a valid identity-aware context. If this context is not available, Q will then verify the account is part of an Organization and has an IAM Identity Center instance with identity-aware sessions enabled. If so, it will prompt the user to authenticate with IAM Identity Center. Otherwise, the chat will throw an error.

With a valid Q Developer Pro subscription now verified, the user’s interactions with the Q Chat window will include personalization such as access to prior chat history, the ability to chat about AWS account resources, and higher request limits for multiple capabilities included with Q Developer Pro. This will persist with the user for the duration of their AWS Console session.

Configuring Identity-Aware Sessions

Identity-aware sessions are only available for instances of IAM Identity Center deployed at the AWS Organization level. (Account-level instances of IAM Identity Center do not support this feature). Once IAM Identity Center is configured, the option to enable Identity-aware sessions needs to be manually selected. (NOTE: This is a one-way door option which, once enabled, cannot be disabled. For more information about prerequisites and considerations for this feature, you can review the documentation here.)

To begin, verify that you have enabled AWS Organizations across your accounts. Once you have completed this, you are ready to enable IAM Identity Center and enable identity-aware sessions. The steps below should be completed by a member of your infrastructure administration team.

For customers who already have an Organization-based instance of IAM Identity Center configured, skip to Step 4 below. For those organizations who would like to read more about IAM Identity Center before completing the following steps, you can find details in the documentation available here.

Walkthrough

  1. From within the management account or security account configured in your AWS Configuration, access the AWS Console and navigate to the AWS IAM Identity Center in the region where you wish to deploy your organization’s Identity Center instance.
  2. Choose the “Enable” option where you will be presented with an option to setup Identity Center at the Organization level or as a single account instance. Choose the “Enable with AWS Organizations” to have access to identity-aware sessions.

Choose the IAM Identity Center Context - Organization vs Account

  1. After Identity Center has been enabled, navigate to the “Settings” page from the left-hand navigation menu. Note that under the “Details” section, the “Identity-aware sessions” option is currently marked as “Disabled”.

IAM Identity Center General Settings - Identity-Aware Sessions disabled by default

  1. Choose the “Enable” option from the Details section or select it from the blue prompt below the Details section.

Identity-Aware Info Prompt allows users to enable

  1. Choose “Enable” from the popup box that appears to confirm your choice.

Identity-Aware Confirmation Prompt

  1. Once IAM Identity Center is enabled and Identity-aware sessions are enabled, you can then proceed by either creating a user manually in Identity Center to log in with, or by connecting your Identity Center instance to a third-party provider like Entra ID, Ping, or Okta. For more information on how to complete this process, please see the documentation for the various third-party providers available.
  2. If you don’t have Q Developer enabled, you will want to do so now. From within the AWS Console, using the search bar navigate to the Amazon Q Developer service. As a best practice, we recommend configuring Q Developer in your management account.

Amazon Q Developer Home Screen - Control panel to add subscriptions

  1. Begin by clicking the “Subscribe to Amazon Q” button to enable Q Developer in your account. You will see a green check denoting that Q has successfully been paired with IAM Identity Center.

Amazon Q Developer Paired with IAM Identity Center - Info Notice

  1. Choose “Subscribe” to enable Q Developer Pro.

Amazon Q Developer Pro Subscription Info Panel

  1. Enable Q Developer Pro in the popup prompt

Amazon Q Developer Pro Confirmation

  1. From here, you can then assign users and groups from the Q Developer prompt or you may assign them from within the IAM Identity Center using the Amazon Q Application Profile.

IAM Identity Center - Q Developer Profile for Managed Applications

  1. Once your users and groups have been assigned, they are now able to begin using Q Developer in both the AWS account console and their individual IDE’s.

Why Use Q Developer Pro?

In this final section, we’ll explore the benefits of using Amazon Q Developer Pro. There are three main areas of benefit:

Chat History

Q Developer Pro can store your chat history and restore it from previous sessions each time you begin. This enables you to develop a context within the chat about things that are relevant to your interests and in turn inform the feedback you receive from Q going forward.

Chat about your AWS account resources

Q Developer Pro can leverage your IAM permissions to make requests regarding resources and costs associated with your account (assuming you have the appropriate policies). This enables you to inquire about certain resources deployed in a given region, or ask questions about cost such as the overall EC2 spend in a given period of time.

Sample Q Developer Chat Question

Figure 4: From the Q Chat panel, you can inquire about resources deployed in your account. (This capability requires you to have the necessary permissions to view information about the requested resource.)

Personalization

Identity-aware sessions also enable you to benefit from custom settings in your Q Chat. For example, you can enable cross-region access for your Q Chat sessions which enable you to ask questions about resources in the current region but also all other regions in your account.

Conclusion

As a new feature of IAM Identity Center, identity-aware sessions enable an AWS Console user to access their Q Developer Pro subscription in the Q Chat panel. This provides them with richer conversations with Q Developer about their accounts and maintains those conversations over time with stored chat history. Enabling this feature involves no additional cost and only a single setting change in a configured IAM Identity Center organization instance. Once made, users will be able to benefit from the full feature set of Amazon Q Developer regardless of how they log into the account.

About the author

Ryan Yanchuleff, Sr. Specialist SA – WWSO Generative AI

Ryan is a Senior Specialist Solutions Architect for the Worldwide Specialist Organization focused on all things Generative AI. He has a passion for helping startups build new solutions to realize their ideas and has built more than a few startups himself. When he’s not playing with technology, he enjoys spending time with his wife and two kids and working on his next home renovation project.

How to use SES Mail Manager SMTP Relay action to deliver inbound email to Google Workspace and Microsoft 365

Post Syndicated from Zip Zieper original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-use-ses-mail-manager-smtp-relay-action-to-deliver-inbound-email-to-google-workspace-and-microsoft-365/

Introduction

Customers often ask us if the Amazon Simple Email Service (SES) inbound capabilities they use with applications hosted on AWS infrastructure can also be used to process and automate employee email hosted on public services like Google Workspace and Microsoft 365. The answer has typically been “yes, but with some limitations”, as until now, SES inbound has been somewhat constrained by the fact that it didn’t support relaying messages for an existing domain. This limitation makes it very difficult to fully manage email flows across hybrid email environments.

Such conversations led the SES team to create Amazon Simple Email Service (SES) Mail Manager which offers a set of capabilities that simplify managing large volumes of email communications within an organization. Mail Manager’s rules set conditions and actions can optimize routing for improved delivery and communication flow, both for incoming and outgoing emails. Mail Manager’s email security features can be augmented by optional add-ons from industry-leading, vetted third-party providers. Flexible archiving features help organizations meet stringent compliance and record-keeping requirements.

In this blog, we position Mail Manager as a central ingress gateway for a fictitious company, Nutrition.co, that is based on real-world AWS customers. We discuss the customer challenges and explain how to configure Mail Manager’s SMTP Relay action to intercept, archive then deliver emails destined for employees’ Google Workspace hosted Gmail and Microsoft 365 hosted Exchange Online mailboxes. Similar mail flows can be used to process, automate and archive emails destined for their AWS hosted apps.

You can learn more about all of Mail Manager’s capabilities here.

Customer background and use case

Our fictitious company, Nutrition.co, is an online retail business with multiple employee departments, including administration, marketing, sales and fulfillment. The company has acquired several smaller rivals that use both Google Workspace and Microsoft 365 to host their employee inboxes, and plan to consolidate all users onto the same domain ( such as [email protected] and [email protected]). They also host several applications on Amazon Web Services (AWS) that use Amazon SES’ inbound capability to receive emails using a subdomain *customer-support*.nutrition.co, such as orders@*customer-support*.nutrition.co and returns@*customer-support*.nutrition.co.

Nutrition.co is looking for a solution to unify all their email domain routing, security and archiving processes onto one centralized management system to simplify their email infrastructure. They want an approach that provides more flexibility to control which addresses and domains are used for apps and automation as well as employee mail. They also want to enhance email compliance and governance with a flexible solution for screening, processing and archiving inbound emails to both employees and applications, before delivering those emails to recipient inboxes on Google Workspace and Microsoft 365 and applications hosted on AWS.

The SES Mail Manger based central ingress and egress gateway architecture we propose will allow Nutrition.co to manage their peer-to-peer and application-driven emails in one place, Amazon SES. It will simplify email security and management, and make it easy to unlock new cloud-enabled email use cases. The architecture can be modified to acommodate a wide variety of email infrastructure, including fully cloud hosted, on-premises, and hybrid mailbox hosting environments.

What is an Inbound SMTP Gateway?

An Inbound SMTP Gateway is an SMTP server that accepts inbound email via an Open Ingress Point, and then delivers those messages to another email environment’s inbound SMTP server. In the diagram below, Mail Manger is configured as an inbound SMTP Gateway:

Figure 1: Diagram of the inbound gateway mail flow to a mailbox hosting environment

Figure 1: Diagram of the inbound gateway mail flow to a mailbox hosting environment

“Inbound email” refers to email traffic flows where the originator of the message can be either a trusted (for example: the UK division of Nutrition.co) or an untrusted (for example: a Nutrition.co customer or vendor) entity. To send an email, the originating email system looks up the recipient domain’s MX record in the global DNS system to determine the address for the recipient’s inbound mail server. Once a connection is established on port 25, the originating server delivers the email message using the SMTP protocol typically using STARTTLS for transport level encryption. Inbound messages are typically authenticated using the SPF, DKIM, and DMARC industry standard protocols, which help ensure the messages are coming from the legitimate sender’s domain.

An Inbound SMTP gateway can act on messages, for example to process and/or archive, before passing them along to the end recipient’s email server. To learn more about archiving emails in transit, visit this blog.

Configuring Mail Manager as an Inbound SMTP Gateway

Before we can configure Mail Manager as an Inbound Gateway for Nutrition.co’s Google Workspace and Microsoft 365 hosted mailboxes, we need to “allow-list” Mail Manager in Nutrition.co’s Google Workspace and Microsoft 365 settings. Allow-listing in this context refers to configuring the hosted mailbox environments such that Mail Manager is not identified as the source of messages, but rather as an SMTP relay.

This configuration is necessary because the messages being relayed through Mail Manager originate from both trusted and untrusted senders. This mail flow will contain both wanted and, potentially, unwanted messages. Mail Manager is the intermediary, not the source of potentially unwanted email passing through Mail Manager’s Open Ingress Point before being relayed to the destination mailbox environment.

If Mail Manager is not allow-listed, inbound email that is relayed thru Mail Manager’s Open Ingress Point will fail SPF checks because the IP addresses of the intermediary server are not authorized by the domain’s SPF policy. Since DMARC relies on SPF, messages from intermediary mail servers will fail the domain’s DMARC policy if they are not signed with a domain-aligned DKIM signature.

Mailbox hosting environments and their anti-spam algorithms rely on SPF, DKIM and DMARC for authenticating different inbound mail flow configurations before making an assessment about the message’s disposition. Properly authenticated messages, if not otherwise identified as unwanted by recipients and their security administrator, are delivered to Inboxes. Messages that are not authenticated are more likely to be treated as spam. Messages from intermediary servers can sometimes be mistaken as spoofed or unwanted messages.

By allow-listing the egress IP addresses of the Mail Manager servers, Nutrition.co’s Google Workspace and Microsoft 365 hosting environments will be able to assess the correct SPF result when receiving inbound email from Mail Manager.

Note: Do not include Mail Manager’s IP addresses in the domain’s SPF policy, These IP addresses are shared by other Mail Manager customers so including them in the domain’s SPF policy can introduce a security risk.

Note: It is also possible to use DKIM and ARC for allow-listing mail streams, but Gmail and Exchange Online both support IP allow-listing.

Note: Nutrition.co’s Google Workspace and Microsoft 365 hosting environments may still make a spam assessment about the messages under the context that Mail Manager is not the original sender, but this is not common.

Figure 2: Diagram of the SES Mail Manager architecture to accept inbound email via an open Ingress endpoint and configured with a Rule set condition to relay messages with the SMTP Relay action.

Figure 2: Diagram of the SES Mail Manager architecture to accept inbound email via an open Ingress endpoint and configured with a Rule set condition to relay messages with the SMTP Relay action.

In the diagram above, the interaction points are as follows:

1. Email senders look in DNS to discover the MX record for example.com.
2. The value of the domain’s MX record is the A record of the Mail Manager Ingress endpoint. The Ingress endpoint is configured as an ‘open’ Ingress endpoint so that it can receive inbound email without requiring SMTP Auth
3. The Ingress endpoint traffic policy is configured to allow and deny traffic
4. The Rule Set conditions determine which messages are to be relayed
5. The SMTP Relay action relays messages for recipients that are SES verified identities

Configuring Mail Manager as an Inbound SMTP Gateway

Prerequisites

  • Access to the administrative console for Nutrition.co’s Google Workspace and Microsoft 365 hosted mailboxes
  • Access to the DNS zone hosting the MX records for the Nutrition.co’s domains

Step 1: Allow-list the regional Mail Manager IP addresses in Nutrition.co’s Google Workspace and Microsoft 365, and create the Mail Manager relay action(s) in AWS SES console.

  • If you do not configure the allow-list Nutrition.co’s Google Workspace and Microsoft 365 hosted, it may cause those mailbox providers to reject as spam or send to junk the emails replayed from your Mail Manager environment.

Step 1-a: Follow the instructions to allow-list Mail Manager to relay email to Nutrition.co’s Google Workspace and Microsoft 365 environments.

Step1-b: Create an SMTP relay for your mailbox hosting environment

* See Creating an SMTP relay in the SES console

Figure 3: Screenshot of an SMTP Relay rule action configured for Microsoft 365 Exchange Online inbound receiving

Figure 3: Screenshot of an SMTP Relay rule action configured for Microsoft 365 Exchange Online inbound receiving

Figure 4: Screenshot of an SMTP Relay rule action configured for Google Workspaces Gmail inbound receiving

Figure 4: Screenshot of an SMTP Relay rule action configured for Google Workspaces Gmail inbound receiving

Because Nutrition.co hosts email in both Google Workspace and Microsoft 365, we must create SMTP Relay actions for both.

Step 2: In SES console, verify Nutrition.co’s email domain, which is nutrition.co

SES needs to prove that Nutrition.co owns the domain of each of the recipient addresses before it will begin relaying inbound email. If you cannot verify ownership of the recipient email destinations, SES will not relay messages.

Follow the instructions to verify Nutrition.co’s SES domain identity for the recipient email addresses within Nutrition.co’s Google Workspace and Microsoft 365 environments. (*note that subdomains such as customer-support.nutrition.co inherit verification from the parent domain*).

Figure 5: Screenshot of a successfully verified domain in the SES console.

Figure 5: Screenshot of a successfully verified domain in the SES console.

Step 3: Configure Mail Manager with an Open Ingress Point and Rule Set Action to relay inbound email to the mailbox hosting environment.

Step 3-a: See Create a Traffic Policy to accept inbound email from the internet.

  • Default action: Allow
    (Optional) Add Policy statements, depending on your requirements. Choose the action to be taken when the filter conditions are met: Deny

    • While Nutrition.co does not want to apply additional security via the SMTP Relay gateway, Mail Manager supports both native capabilities and optional add-on subscriptions to 3rd party tools from vetted industry leaders such as Spamhaus and Abusix.
Figure 6: Screenshot of a traffic policy for accepting all email from the internet

Figure 6: Screenshot of a traffic policy for accepting all email from the internet

Step 3-b: Follow the instructions for creating rule sets and rules in the SES console.

  • Select the SMTP Relay that you created in Step 1-b and enable the **Preserve Mail From** option.
    • The ‘Preserve Mail From’ setting is necessary so that the mailbox provider can be configured to make the correct assessment of the message’s SPF policy evaluation, assuming that the allow-list configuration Step 1 is complete.
  • Add any conditions and exceptions for each rule, depending on your needs.
    • You may want to create a condition for the SMTP Relay rule so that only messages destined for recipients within your domain are relayed to the appropriate SMTP Relay action, and choose a different action for the recipients who are not hosted in your environment, such as the Archive action.
    • If you have both Google Workspace and Microsoft 365 configured as SMTP Relay destinations, you may combine the SMTP Relay actions in a single rule if the conditions are the same, or create them as separate rules if the conditions need to be different
Figure 7: A Mail Manager rule configured with an SMTP Relay action for Google Workspaces and another SMTP Relay actions for Microsoft 365

Figure 7: A Mail Manager rule configured with an SMTP Relay action for Google Workspaces and another SMTP Relay actions for Microsoft 365

Step 3-c: Follow the documentation for Creating an Ingress Point.

The Mail Manager Ingress point needs to be ‘Open“ for this use case because internet mail senders need to connect to port 25 and send without SMTP authentication for inbound mail flows.

  • Type: Open
    Traffic policy: Choose the traffic policy that you created step 3-a
    Rule set: Choose the rule set that you created in step 3-b

After saving the ingress endpoint settings, you should see something similar in the console.

Figure 8: Screenshot of an ‘open’ Mail Manager Ingress endpoint configured with a rule set and traffic policy

Figure 8: Screenshot of an ‘open’ Mail Manager Ingress endpoint configured with a rule set and traffic policy

Step 4. Verify your configuration and change your domain’s MX record

Once you have finished configuring Mail Manager with an Inbound Gateway configuration you will have:

  • An Open ingress point that does not require authentication and has an open traffic policy to allow messages from the internet.
  • A Rule set with SMTP Relay actions that will relay inbound messages to Google Workspace and/or Microsoft 365.

Step 4-a: Test your configuration

  • Ingress point: You can test that the Ingress endpoint receives email by using an SMTP capable client application, such as “openssl s_client” from a host that allows for outbound port 25 connections to the A Record of your Open Ingress Point (many ISPs and cloud infrastructure providers block port 25 by default to stop the proliferation of spam on the internet). If you get a “250 OK” response from the SMTP transaction, the Ingress point is configured correctly.
  • Rule set: You can test your Rule set by sending a message to your Ingress endpoint that has a recipient destination that is both a verified domain, and a domain that is hosted by your mailbox environment. You may want to add the Archive and/or Save to S3 rule actions to occur prior to SMTP Relay. This enables you to view message headers and diagnose issues that may occur during the SMTP relay to the mailbox hosting environments.
  • Final delivery: You can test the entire mail flow by looking at the received messages in your mailbox hosting environment.
    • How to look at received messages in a mailbox hosting environment
      • Google Workspace – From within the Gmail interface, find the message and open the message menu options.
      • Figure 9: Screenshot of Gmail’s interface for selecting message options
      • Choose “Show original”.
      • Screenshot of Gmail’s “Original message” summary showing SPF and DKIM passing and aligned with gmail.com, which was the source of the original message
      • Screenshot of the Gmail ‘Show original“ message headers. The Mail From address (also appears as the Return-path header, and envelope-from value in other headers) is preserved within the @gmail.com domain, and Gmail’s assessment of SPF correctly attributed the message as originating from 209.85.216.51 even though the message was relayed through 206.55.129.47. Since the 209.x.x.x address is in the SPF policy for gmail.com, the message passes SPF due to the allow-list configuration
      • (The Screenshot above shows the Gmail ‘Show original“ message headers. The Mail From address (also appears as the Return-path header, and envelope-from value in other headers) is preserved within the @gmail.com domain, and Gmail’s assessment of SPF correctly attributed the message as originating from 209.85.216.51 even though the message was relayed through 206.55.129.47. Since the 209.x.x.x address is in the SPF policy for gmail.com, the message passes SPF due to the allow-list configuration)
      • Microsoft 365 – From within the Outlook on the Web interface, find the message and open the message menu options.
      • Screenshot of Outlook on the Web’s interface for selecting message options
      • Choose “View message details”. You will see the message headers similar to the Gmail example above.

Step 4-b: Change the MX record for your domain.

Note: We recommend using a new subdomain so that you can test this mail flow configuration for a period of time prior to changing the MX record for the primary domain that is actively being used by end users and applications.

Once you have finished testing, you can change the MX record for the domain. The value of the MX record should be the **A Record** of the Open Ingress point along with the priority value.

Figure 13: A screenshot of an MX record configured in Amazon Route 53

Figure 13: A screenshot of an MX record configured in Amazon Route 53

Conclusion

In this blog post, we’ve explored how to leverage SES Mail Manager’s SMTP Relay action to simplify the handling of inbound email for organizations that use a mix of email hosting environments, specifically Google Workspace and Microsoft 365. By configuring Mail Manager as an inbound SMTP gateway, our fictitious customer, Nutrition.co was able to centralize the management of their email flows, enhance security through features like traffic policies and rule sets, and ensure compliance through flexible archiving.

The key steps involved setting up allow-listing in the Google Workspace and Microsoft 365 environments, creating SMTP relay configurations in Mail Manager, and updating Nutrition.co domain’s MX record to point to the Mail Manager ingress endpoint. This allowed Nutrition.co to seamlessly route inbound emails destined for both their cloud-hosted employee mailboxes and on-premises applications, processing and archiving the messages before final delivery.

The flexibility of Mail Manager’s SMTP Relay action makes it a powerful tool for organizations looking to unify their email infrastructure, especially in hybrid environments. By acting as a centralized ingress and egress gateway, Mail Manager can help streamline email management, improve security, and unlock new cloud-enabled email use cases. As email continues to be a critical communication channel, solutions like Mail Manager will become increasingly important for businesses looking to maximize the value of their email ecosystem.

Please visit AWS Re:Post to ask and find answers to questions about SES Mail Manager. Talk with your AWS account team if you are interested in exploring Mail Manager in more depth.

Additional blogs related to Mail Manager:

About the Authors

Jesse Thompson
Jesse Thompson is an Email Deliverability Manager with the Amazon Simple Email Service team. His background is in enterprise development and operations, with a focus on email abuse mitigation and encouragement of authenticity practices with open standard protocols. Jesse’s favorite activity outside of technology is recreational curling.
Alexey Kurbatsky

Alexey Kurbatsky

Alexey is a Senior Software Development Engineer at AWS, specializing in building distributed and scalable services. Outside of work, he enjoys exploring nature thru hiking as well as playing guitar.

Zip

Zip

Zip is a Sr. Specialist Solutions Architect at AWS, working with Amazon Pinpoint and Simple Email Service and WorkMail. Outside of work he enjoys time with his family, cooking, mountain biking, boating, learning and beach plogging.

Best practices for managing Terraform State files in AWS CI/CD Pipeline

Post Syndicated from Arun Kumar Selvaraj original https://aws.amazon.com/blogs/devops/best-practices-for-managing-terraform-state-files-in-aws-ci-cd-pipeline/

Introduction

Today customers want to reduce manual operations for deploying and maintaining their infrastructure. The recommended method to deploy and manage infrastructure on AWS is to follow Infrastructure-As-Code (IaC) model using tools like AWS CloudFormation, AWS Cloud Development Kit (AWS CDK) or Terraform.

One of the critical components in terraform is managing the state file which keeps track of your configuration and resources. When you run terraform in an AWS CI/CD pipeline the state file has to be stored in a secured, common path to which the pipeline has access to. You need a mechanism to lock it when multiple developers in the team want to access it at the same time.

In this blog post, we will explain how to manage terraform state files in AWS, best practices on configuring them in AWS and an example of how you can manage it efficiently in your Continuous Integration pipeline in AWS when used with AWS Developer Tools such as AWS CodeCommit and AWS CodeBuild. This blog post assumes you have a basic knowledge of terraform, AWS Developer Tools and AWS CI/CD pipeline. Let’s dive in!

Challenges with handling state files

By default, the state file is stored locally where terraform runs, which is not a problem if you are a single developer working on the deployment. However if not, it is not ideal to store state files locally as you may run into following problems:

  • When working in teams or collaborative environments, multiple people need access to the state file
  • Data in the state file is stored in plain text which may contain secrets or sensitive information
  • Local files can get lost, corrupted, or deleted

Best practices for handling state files

The recommended practice for managing state files is to use terraform’s built-in support for remote backends. These are:

Remote backend on Amazon Simple Storage Service (Amazon S3): You can configure terraform to store state files in an Amazon S3 bucket which provides a durable and scalable storage solution. Storing on Amazon S3 also enables collaboration that allows you to share state file with others.

Remote backend on Amazon S3 with Amazon DynamoDB: In addition to using an Amazon S3 bucket for managing the files, you can use an Amazon DynamoDB table to lock the state file. This will allow only one person to modify a particular state file at any given time. It will help to avoid conflicts and enable safe concurrent access to the state file.

There are other options available as well such as remote backend on terraform cloud and third party backends. Ultimately, the best method for managing terraform state files on AWS will depend on your specific requirements.

When deploying terraform on AWS, the preferred choice of managing state is using Amazon S3 with Amazon DynamoDB.

AWS configurations for managing state files

  1. Create an Amazon S3 bucket using terraform. Implement security measures for Amazon S3 bucket by creating an AWS Identity and Access Management (AWS IAM) policy or Amazon S3 Bucket Policy. Thus you can restrict access, configure object versioning for data protection and recovery, and enable AES256 encryption with SSE-KMS for encryption control.
  1. Next create an Amazon DynamoDB table using terraform with Primary key set to LockID. You can also set any additional configuration options such as read/write capacity units. Once the table is created, you will configure the terraform backend to use it for state locking by specifying the table name in the terraform block of your configuration.
  1. For a single AWS account with multiple environments and projects, you can use a single Amazon S3 bucket. If you have multiple applications in multiple environments across multiple AWS accounts, you can create one Amazon S3 bucket for each account. In that Amazon S3 bucket, you can create appropriate folders for each environment, storing project state files with specific prefixes.

Now that you know how to handle terraform state files on AWS, let’s look at an example of how you can configure them in a Continuous Integration pipeline in AWS.

Architecture

Architecture on how to use terraform in an AWS CI pipeline

Figure 1: Example architecture on how to use terraform in an AWS CI pipeline

This diagram outlines the workflow implemented in this blog:

  1. The AWS CodeCommit repository contains the application code
  2. The AWS CodeBuild job contains the buildspec files and references the source code in AWS CodeCommit
  3. The AWS Lambda function contains the application code created after running terraform apply
  4. Amazon S3 contains the state file created after running terraform apply. Amazon DynamoDB locks the state file present in Amazon S3

Implementation

Pre-requisites

Before you begin, you must complete the following prerequisites:

Setting up the environment

  1. You need an AWS access key ID and secret access key to configure AWS CLI. To learn more about configuring the AWS CLI, follow these instructions.
  2. Clone the repo for complete example: git clone https://github.com/aws-samples/manage-terraform-statefiles-in-aws-pipeline
  3. After cloning, you could see the following folder structure:
AWS CodeCommit repository structure

Figure 2: AWS CodeCommit repository structure

Let’s break down the terraform code into 2 parts – one for preparing the infrastructure and another for preparing the application.

Preparing the Infrastructure

  1. The main.tf file is the core component that does below:
      • It creates an Amazon S3 bucket to store the state file. We configure bucket ACL, bucket versioning and encryption so that the state file is secure.
      • It creates an Amazon DynamoDB table which will be used to lock the state file.
      • It creates two AWS CodeBuild projects, one for ‘terraform plan’ and another for ‘terraform apply’.

    Note – It also has the code block (commented out by default) to create AWS Lambda which you will use at a later stage.

  1. AWS CodeBuild projects should be able to access Amazon S3, Amazon DynamoDB, AWS CodeCommit and AWS Lambda. So, the AWS IAM role with appropriate permissions required to access these resources are created via iam.tf file.
  1. Next you will find two buildspec files named buildspec-plan.yaml and buildspec-apply.yaml that will execute terraform commands – terraform plan and terraform apply respectively.
  1. Modify AWS region in the provider.tf file.
  1. Update Amazon S3 bucket name, Amazon DynamoDB table name, AWS CodeBuild compute types, AWS Lambda role and policy names to required values using variable.tf file. You can also use this file to easily customize parameters for different environments.

With this, the infrastructure setup is complete.

You can use your local terminal and execute below commands in the same order to deploy the above-mentioned resources in your AWS account.

terraform init
terraform validate
terraform plan
terraform apply

Once the apply is successful and all the above resources have been successfully deployed in your AWS account, proceed with deploying your application. 

Preparing the Application

  1. In the cloned repository, use the backend.tf file to create your own Amazon S3 backend to store the state file. By default, it will have below values. You can override them with your required values.
bucket = "tfbackend-bucket" 
key    = "terraform.tfstate" 
region = "eu-central-1"
  1. The repository has sample python code stored in main.py that returns a simple message when invoked.
  1. In the main.tf file, you can find the below block of code to create and deploy the Lambda function that uses the main.py code (uncomment these code blocks).
data "archive_file" "lambda_archive_file" {
    ……
}

resource "aws_lambda_function" "lambda" {
    ……
}
  1. Now you can deploy the application using AWS CodeBuild instead of running terraform commands locally which is the whole point and advantage of using AWS CodeBuild.
  1. Run the two AWS CodeBuild projects to execute terraform plan and terraform apply again.
  1. Once successful, you can verify your deployment by testing the code in AWS Lambda. To test a lambda function (console):
    • Open AWS Lambda console and select your function “tf-codebuild”
    • In the navigation pane, in Code section, click Test to create a test event
    • Provide your required name, for example “test-lambda”
    • Accept default values and click Save
    • Click Test again to trigger your test event “test-lambda”

It should return the sample message you provided in your main.py file. In the default case, it will display “Hello from AWS Lambda !” message as shown below.

Sample Amazon Lambda function response

Figure 3: Sample Amazon Lambda function response

  1. To verify your state file, go to Amazon S3 console and select the backend bucket created (tfbackend-bucket). It will contain your state file.
Amazon S3 bucket with terraform state file

Figure 4: Amazon S3 bucket with terraform state file

  1. Open Amazon DynamoDB console and check your table tfstate-lock and it will have an entry with LockID.
Amazon DynamoDB table with LockID

Figure 5: Amazon DynamoDB table with LockID

Thus, you have securely stored and locked your terraform state file using terraform backend in a Continuous Integration pipeline.

Cleanup

To delete all the resources created as part of the repository, run the below command from your terminal.

terraform destroy

Conclusion

In this blog post, we explored the fundamentals of terraform state files, discussed best practices for their secure storage within AWS environments and also mechanisms for locking these files to prevent unauthorized team access. And finally, we showed you an example of how efficiently you can manage them in a Continuous Integration pipeline in AWS.

You can apply the same methodology to manage state files in a Continuous Delivery pipeline in AWS. For more information, see CI/CD pipeline on AWS, Terraform backends types, Purpose of terraform state.

Arun Kumar Selvaraj

Arun Kumar Selvaraj is a Cloud Infrastructure Architect with AWS Professional Services. He loves building world class capability that provides thought leadership, operating standards and platform to deliver accelerated migration and development paths for his customers. His interests include Migration, CCoE, IaC, Python, DevOps, Containers and Networking.

Manasi Bhutada

Manasi Bhutada is an ISV Solutions Architect based in the Netherlands. She helps customers design and implement well architected solutions in AWS that address their business problems. She is passionate about data analytics and networking. Beyond work she enjoys experimenting with food, playing pickleball, and diving into fun board games.

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

Post Syndicated from Raymond Lai original https://aws.amazon.com/blogs/big-data/enforce-fine-grained-access-control-on-open-table-formats-via-amazon-emr-integrated-with-aws-lake-formation/

With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on Open Table Formats (OTFs), including Apache Hudi, Apache Iceberg, and Delta lake. This allows you to simplify security and governance over transactional data lakes by providing access controls at table-, column-, and row-level permissions with your Apache Spark jobs. Many large enterprise companies seek to use their transactional data lake to gain insights and improve decision-making. You can build a lake house architecture using Amazon EMR integrated with Lake Formation for FGAC. This combination of services allows you to conduct data analysis on your transactional data lake while ensuring secure and controlled access.

The Amazon EMR record server component supports table-, column-, row-, cell-, and nested attribute-level data filtering functionality. It extends support to Hive, Apache Hudi, Apache Iceberg, and Delta lake formats for both reading (including time travel and incremental query) and write operations (on DML statements such as INSERT). Additionally, with version 6.15, Amazon EMR introduces access control protection for its application web interface such as on-cluster Spark History Server, Yarn Timeline Server, and Yarn Resource Manager UI.

In this post, we demonstrate how to implement FGAC on Apache Hudi tables using Amazon EMR integrated with Lake Formation.

Transaction data lake use case

Amazon EMR customers often use Open Table Formats to support their ACID transaction and time travel needs in a data lake. By preserving historical versions, data lake time travel provides benefits such as auditing and compliance, data recovery and rollback, reproducible analysis, and data exploration at different points in time.

Another popular transaction data lake use case is incremental query. Incremental query refers to a query strategy that focuses on processing and analyzing only the new or updated data within a data lake since the last query. The key idea behind incremental queries is to use metadata or change tracking mechanisms to identify the new or modified data since the last query. By identifying these changes, the query engine can optimize the query to process only the relevant data, significantly reducing the processing time and resource requirements.

Solution overview

In this post, we demonstrate how to implement FGAC on Apache Hudi tables using Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2) integrated with Lake Formation. Apache Hudi is an open source transactional data lake framework that greatly simplifies incremental data processing and the development of data pipelines. This new FGAC feature supports all OTF. Besides demonstrating with Hudi here, we will follow up with other OTF tables with other blogs. We use notebooks in Amazon SageMaker Studio to read and write Hudi data via different user access permissions through an EMR cluster. This reflects real-world data access scenarios—for example, if an engineering user needs full data access to troubleshoot on a data platform, whereas data analysts may only need to access a subset of that data that doesn’t contain personally identifiable information (PII). Integrating with Lake Formation via the Amazon EMR runtime role further enables you to improve your data security posture and simplifies data control management for Amazon EMR workloads. This solution ensures a secure and controlled environment for data access, meeting the diverse needs and security requirements of different users and roles in an organization.

The following diagram illustrates the solution architecture.

Solution architecture

We conduct a data ingestion process to upsert (update and insert) a Hudi dataset to an Amazon Simple Storage Service (Amazon S3) bucket, and persist or update the table schema in the AWS Glue Data Catalog. With zero data movement, we can query the Hudi table governed by Lake Formation via various AWS services, such as Amazon Athena, Amazon EMR, and Amazon SageMaker.

When users submit a Spark job through any EMR cluster endpoints (EMR Steps, Livy, EMR Studio, and SageMaker), Lake Formation validates their privileges and instructs the EMR cluster to filter out sensitive data such as PII data.

This solution has three different types of users with different levels of permissions to access the Hudi data:

  • hudi-db-creator-role – This is used by the data lake administrator who has privileges to carry out DDL operations such as creating, modifying, and deleting database objects. They can define data filtering rules on Lake Formation for row-level and column-level data access control. These FGAC rules ensure that data lake is secured and fulfills the data privacy regulations required.
  • hudi-table-pii-role – This is used by engineering users. The engineering users are capable of carrying out time travel and incremental queries on both Copy-on-Write (CoW) and Merge-on-Read (MoR). They also have privilege to access PII data based on any timestamps.
  • hudi-table-non-pii-role – This is used by data analysts. Data analysts’ data access rights are governed by FGAC authorized rules controlled by data lake administrators. They do not have visibility on columns containing PII data like names and addresses. Additionally, they can’t access rows of data that don’t fulfill certain conditions. For example, the users only can access data rows that belong to their country.

Prerequisites

You can download the three notebooks used in this post from the GitHub repo.

Before you deploy the solution, make sure you have the following:

Complete the following steps to set up your permissions:

  1. Log in to your AWS account with your admin IAM user.

Make sure you are in theus-east-1Region.

  1. Create a S3 bucket in the us-east-1 Region (for example,emr-fgac-hudi-us-east-1-<ACCOUNT ID>).

Next, we enable Lake Formation by changing the default permission model.

  1. Sign in to the Lake Formation console as the administrator user.
  2. Choose Data Catalog settings under Administration in the navigation pane.
  3. Under Default permissions for newly created databases and tables, deselect Use only IAM access control for new databases and Use only IAM access control for new tables in new databases.
  4. Choose Save.

Data Catalog settings

Alternatively, you need to revoke IAMAllowedPrincipals on resources (databases and tables) created if you started Lake Formation with the default option.

Finally, we create a key pair for Amazon EMR.

  1. On the Amazon EC2 console, choose Key pairs in the navigation pane.
  2. Choose Create key pair.
  3. For Name, enter a name (for exampleemr-fgac-hudi-keypair).
  4. Choose Create key pair.

Create key pair

The generated key pair (for this post, emr-fgac-hudi-keypair.pem) will save to your local computer.

Next, we create an AWS Cloud9 interactive development environment (IDE).

  1. On the AWS Cloud9 console, choose Environments in the navigation pane.
  2. Choose Create environment.
  3. For Name¸ enter a name (for example,emr-fgac-hudi-env).
  4. Keep the other settings as default.

Cloud9 environment

  1. Choose Create.
  2. When the IDE is ready, choose Open to open it.

cloud9 environment

  1. In the AWS Cloud9 IDE, on the File menu, choose Upload Local Files.

Upload local file

  1. Upload the key pair file (emr-fgac-hudi-keypair.pem).
  2. Choose the plus sign and choose New Terminal.

new terminal

  1. In the terminal, input the following command lines:
#Create encryption certificates for EMR in transit encryption
openssl req -x509 \
-newkey rsa:1024 \
-keyout privateKey.pem \
-out certificateChain.pem \
-days 365 \
-nodes \
-subj '/C=US/ST=Washington/L=Seattle/O=MyOrg/OU=MyDept/CN=*.compute.internal'
cp certificateChain.pem trustedCertificates.pem

# Zip certificates
zip -r -X my-certs.zip certificateChain.pem privateKey.pem trustedCertificates.pem

# Upload the certificates zip file to S3 bucket
# Replace <ACCOUNT ID> with your AWS account ID
aws s3 cp ./my-certs.zip s3://emr-fgac-hudi-us-east-1-<ACCOUNT ID>/my-certs.zip

Note that the example code is a proof of concept for demonstration purposes only. For production systems, use a trusted certification authority (CA) to issue certificates. Refer to Providing certificates for encrypting data in transit with Amazon EMR encryption for details.

Deploy the solution via AWS CloudFormation

We provide an AWS CloudFormation template that automatically sets up the following services and components:

  • An S3 bucket for the data lake. It contains the sample TPC-DS dataset.
  • An EMR cluster with security configuration and public DNS enabled.
  • EMR runtime IAM roles with Lake Formation fine-grained permissions:
    • <STACK-NAME>-hudi-db-creator-role – This role is used to create Apache Hudi database and tables.
    • <STACK-NAME>-hudi-table-pii-role – This role provides permission to query all columns of Hudi tables, including columns with PII.
    • <STACK-NAME>-hudi-table-non-pii-role – This role provides permission to query Hudi tables that have filtered out PII columns by Lake Formation.
  • SageMaker Studio execution roles that allow the users to assume their corresponding EMR runtime roles.
  • Networking resources such as VPC, subnets, and security groups.

Complete the following steps to deploy the resources:

  1. Choose Quick create stack to launch the CloudFormation stack.
  2. For Stack name, enter a stack name (for example,rsv2-emr-hudi-blog).
  3. For Ec2KeyPair, enter the name of your key pair.
  4. For IdleTimeout, enter an idle timeout for the EMR cluster to avoid paying for the cluster when it’s not being used.
  5. For InitS3Bucket, enter the S3 bucket name you created to save the Amazon EMR encryption certificate .zip file.
  6. For S3CertsZip, enter the S3 URI of the Amazon EMR encryption certificate .zip file.

CloudFormation template

  1. Select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
  2. Choose Create stack.

The CloudFormation stack deployment takes around 10 minutes.

Set up Lake Formation for Amazon EMR integration

Complete the following steps to set up Lake Formation:

  1. On the Lake Formation console, choose Application integration settings under Administration in the navigation pane.
  2. Select Allow external engines to filter data in Amazon S3 locations registered with Lake Formation.
  3. Choose Amazon EMR for Session tag values.
  4. Enter your AWS account ID for AWS account IDs.
  5. Choose Save.

LF - Application integration settings

  1. Choose Databases under Data Catalog in the navigation pane.
  2. Choose Create database.
  3. For Name, enter default.
  4. Choose Create database.

LF - create database

  1. Choose Data lake permissions under Permissions in the navigation pane.
  2. Choose Grant.
  3. Select IAM users and roles.
  4. Choose your IAM roles.
  5. For Databases, choose default.
  6. For Database permissions, select Describe.
  7. Choose Grant.

LF - Grant data permissions

Copy Hudi JAR file to Amazon EMR HDFS

To use Hudi with Jupyter notebooks, you need to complete the following steps for the EMR cluster, which includes copying a Hudi JAR file from the Amazon EMR local directory to its HDFS storage, so that you can configure a Spark session to use Hudi:

  1. Authorize inbound SSH traffic (port 22).
  2. Copy the value for Primary node public DNS (for example, ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com) from the EMR cluster Summary section.

EMR cluster summary

  1. Go back to previous AWS Cloud9 terminal you used to create the EC2 key pair.
  2. Run the following command to SSH into the EMR primary node. Replace the placeholder with your EMR DNS hostname:
chmod 400 emr-fgac-hudi-keypair.pem
ssh -i emr-fgac-hudi-keypair.pem [email protected]
  1. Run the following command to copy the Hudi JAR file to HDFS:
hdfs dfs -mkdir -p /apps/hudi/lib
hdfs dfs -copyFromLocal /usr/lib/hudi/hudi-spark-bundle.jar /apps/hudi/lib/hudi-spark-bundle.jar

Create the Hudi database and tables in Lake Formation

Now we’re ready to create the Hudi database and tables with FGAC enabled by the EMR runtime role. The EMR runtime role is an IAM role that you can specify when you submit a job or query to an EMR cluster.

Grant database creator permission

First, let’s grant the Lake Formation database creator permission to<STACK-NAME>-hudi-db-creator-role:

  1. Log in to your AWS account as an administrator.
  2. On the Lake Formation console, choose Administrative roles and tasks under Administration in the navigation pane.
  3. Confirm that your AWS login user has been added as a data lake administrator.
  4. In the Database creator section, choose Grant.
  5. For IAM users and roles, choose<STACK-NAME>-hudi-db-creator-role.
  6. For Catalog permissions, select Create database.
  7. Choose Grant.

Register the data lake location

Next, let’s register the S3 data lake location in Lake Formation:

  1. On the Lake Formation console, choose Data lake locations under Administration in the navigation pane.
  2. Choose Register location.
  3. For Amazon S3 path, Choose Browse and choose the data lake S3 bucket. (<STACK_NAME>s3bucket-XXXXXXX) created from the CloudFormation stack.
  4. For IAM role, choose<STACK-NAME>-hudi-db-creator-role.
  5. For Permission mode, select Lake Formation.
  6. Choose Register location.

LF - Register location

Grant data location permission

Next, we need to grant<STACK-NAME>-hudi-db-creator-rolethe data location permission:

  1. On the Lake Formation console, choose Data locations under Permissions in the navigation pane.
  2. Choose Grant.
  3. For IAM users and roles, choose<STACK-NAME>-hudi-db-creator-role.
  4. For Storage locations, enter the S3 bucket (<STACK_NAME>-s3bucket-XXXXXXX).
  5. Choose Grant.

LF - Grant permissions

Connect to the EMR cluster

Now, let’s use a Jupyter notebook in SageMaker Studio to connect to the EMR cluster with the database creator EMR runtime role:

  1. On the SageMaker console, choose Domains in the navigation pane.
  2. Choose the domain<STACK-NAME>-Studio-EMR-LF-Hudi.
  3. On the Launch menu next to the user profile<STACK-NAME>-hudi-db-creator, choose Studio.

SM - Domain details

  1. Download the notebook rsv2-hudi-db-creator-notebook.
  2. Choose the upload icon.

SM Studio - Upload

  1. Choose the downloaded Jupyter notebook and choose Open.
  2. Open the uploaded notebook.
  3. For Image, choose SparkMagic.
  4. For Kernel, choose PySpark.
  5. Leave the other configurations as default and choose Select.

SM Studio - Change environment

  1. Choose Cluster to connect to the EMR cluster.

SM Studio - connect EMR cluster

  1. Choose the EMR on EC2 cluster (<STACK-NAME>-EMR-Cluster) created with the CloudFormation stack.
  2. Choose Connect.
  3. For EMR execution role, choose<STACK-NAME>-hudi-db-creator-role.
  4. Choose Connect.

Create database and tables

Now you can follow the steps in the notebook to create the Hudi database and tables. The major steps are as follows:

  1. When you start the notebook, configure“spark.sql.catalog.spark_catalog.lf.managed":"true"to inform Spark that spark_catalog is protected by Lake Formation.
  2. Create Hudi tables using the following Spark SQL.
%%sql 
CREATE TABLE IF NOT EXISTS ${hudi_catalog}.${hudi_db}.${cow_table_name_sql}(
    c_customer_id string,
    c_birth_country string,
    c_customer_sk integer,
    c_email_address string,
    c_first_name string,
    c_last_name string,
    ts bigint
) USING hudi
LOCATION '${cow_table_location_sql}'
OPTIONS (
  type = 'cow',
  primaryKey = '${hudi_primary_key}',
  preCombineField = '${hudi_pre_combined_field}'
 ) 
PARTITIONED BY (${hudi_partitioin_field});

  1. Insert data from the source table to the Hudi tables.
%%sql
INSERT OVERWRITE ${hudi_catalog}.${hudi_db}.${cow_table_name_sql}
SELECT 
    c_customer_id ,  
    c_customer_sk,
    c_email_address,
    c_first_name,
    c_last_name,
    unix_timestamp(current_timestamp()) AS ts,
    c_birth_country
FROM ${src_df_view}
WHERE c_birth_country = 'HONG KONG' OR c_birth_country = 'CHINA' 
LIMIT 1000
  1. Insert data again into the Hudi tables.
%%sql
INSERT INTO ${hudi_catalog}.${hudi_db}.${cow_table_name_sql}
SELECT 
    c_customer_id ,  
    c_customer_sk,
    c_email_address,
    c_first_name,
    c_last_name,
    unix_timestamp(current_timestamp()) AS ts,
    c_birth_country
FROM ${insert_into_view}

Query the Hudi tables via Lake Formation with FGAC

After you create the Hudi database and tables, you’re ready to query the tables using fine-grained access control with Lake Formation. We have created two types of Hudi tables: Copy-On-Write (COW) and Merge-On-Read (MOR). The COW table stores data in a columnar format (Parquet), and each update creates a new version of files during a write. This means that for every update, Hudi rewrites the entire file, which can be more resource-intensive but provides faster read performance. MOR, on the other hand, is introduced for cases where COW may not be optimal, particularly for write- or change-heavy workloads. In a MOR table, each time there is an update, Hudi writes only the row for the changed record, which reduces cost and enables low-latency writes. However, the read performance might be slower compared to COW tables.

Grant table access permission

We use the IAM role<STACK-NAME>-hudi-table-pii-roleto query Hudi COW and MOR containing PII columns. We first grant the table access permission via Lake Formation:

  1. On the Lake Formation console, choose Data lake permissions under Permissions in the navigation pane.
  2. Choose Grant.
  3. Choose<STACK-NAME>-hudi-table-pii-rolefor IAM users and roles.
  4. Choose thersv2_blog_hudi_db_1database for Databases.
  5. For Tables, choose the four Hudi tables you created in the Jupyter notebook.

LF - Grant data permissions

  1. For Table permissions, select Select.
  2. Choose Grant.

LF - table permissions

Query PII columns

Now you’re ready to run the notebook to query the Hudi tables. Let’s follow similar steps to the previous section to run the notebook in SageMaker Studio:

  1. On the SageMaker console, navigate to the<STACK-NAME>-Studio-EMR-LF-Hudidomain.
  2. On the Launch menu next to the<STACK-NAME>-hudi-table-readeruser profile, choose Studio.
  3. Upload the downloaded notebook rsv2-hudi-table-pii-reader-notebook.
  4. Open the uploaded notebook.
  5. Repeat the notebook setup steps and connect to the same EMR cluster, but use the role<STACK-NAME>-hudi-table-pii-role.

In the current stage, FGAC-enabled EMR cluster needs to query Hudi’s commit time column for performing incremental queries and time travel. It does not support Spark’s “timestamp as of” syntax and Spark.read(). We are actively working on incorporating support for both actions in future Amazon EMR releases with FGAC enabled.

You can now follow the steps in the notebook. The following are some highlighted steps:

  1. Run a snapshot query.
%%sql 
SELECT c_birth_country, count(*) FROM ${hudi_catalog}.${hudi_db}.${cow_table_name_sql} GROUP BY c_birth_country;
  1. Run an incremental query.
incremental_df = spark.sql(f"""
SELECT * FROM {HUDI_CATALOG}.{HUDI_DATABASE}.{COW_TABLE_NAME_SQL} WHERE _hoodie_commit_time >= {commit_ts[-1]}
""")

incremental_df.createOrReplaceTempView("incremental_view")
%%sql
SELECT 
    c_birth_country, 
    count(*) 
FROM incremental_view
GROUP BY c_birth_country;
  1. Run a time travel query.
%%sql
SELECT
    c_birth_country, COUNT(*) as count
FROM ${hudi_catalog}.${hudi_db}.${cow_table_name_sql}
WHERE _hoodie_commit_time IN
(
    SELECT DISTINCT _hoodie_commit_time FROM ${hudi_catalog}.${hudi_db}.${cow_table_name_sql} ORDER BY _hoodie_commit_time LIMIT 1 
)
GROUP BY c_birth_country
  1. Run MOR read-optimized and real-time table queries.
%%sql
SELECT
    a.email_label,
    count(*)
FROM (
    SELECT
        CASE
            WHEN c_email_address = 'UNKNOWN' THEN 'UNKNOWN'
            ELSE 'NOT_UNKNOWN'
        END AS email_label
    FROM ${hudi_catalog}.${hudi_db}.${mor_table_name_sql}_ro
    WHERE c_birth_country = 'HONG KONG'
) a
GROUP BY a.email_label;
%%sql
SELECT *  
FROM ${hudi_catalog}.${hudi_db}.${mor_table_name_sql}_ro
WHERE 
    c_birth_country = 'INDIA' OR c_first_name = 'MASKED'

Query the Hudi tables with column-level and row-level data filters

We use the IAM role<STACK-NAME>-hudi-table-non-pii-roleto query Hudi tables. This role is not allowed to query any columns containing PII. We use the Lake Formation column-level and row-level data filters to implement fine-grained access control:

  1. On the Lake Formation console, choose Data filters under Data Catalog in the navigation pane.
  2. Choose Create new filter.
  3. For Data filter name, entercustomer-pii-filter.
  4. Choosersv2_blog_hudi_db_1for Target database.
  5. Choosersv2_blog_hudi_mor_sql_dl_customer_1for Target table.
  6. Select Exclude columns and choose thec_customer_id,c_email_address, andc_last_namecolumns.
  7. Enterc_birth_country != 'HONG KONG'for Row filter expression.
  8. Choose Create filter.

LF - create data filter

  1. Choose Data lake permissions under Permissions in the navigation pane.
  2. Choose Grant.
  3. Choose<STACK-NAME>-hudi-table-non-pii-rolefor IAM users and roles.
  4. Choosersv2_blog_hudi_db_1for Databases.
  5. Choosersv2_blog_hudi_mor_sql_dl_tpc_customer_1for Tables.
  6. Choosecustomer-pii-filterfor Data filters.
  7. For Data filter permissions, select Select.
  8. Choose Grant.

LF - Grant data permissions

Let’s follow similar steps to run the notebook in SageMaker Studio:

  1. On the SageMaker console, navigate to the domainStudio-EMR-LF-Hudi.
  2. On the Launch menu for thehudi-table-readeruser profile, choose Studio.
  3. Upload the downloaded notebook rsv2-hudi-table-non-pii-reader-notebook and choose Open.
  4. Repeat the notebook setup steps and connect to the same EMR cluster, but select the role<STACK-NAME>-hudi-table-non-pii-role.

You can now follow the steps in the notebook. From the query results, you can see that FGAC via the Lake Formation data filter has been applied. The role can’t see the PII columnsc_customer_id,c_last_name, andc_email_address. Also, the rows fromHONG KONGhave been filtered.

filtered query result

Clean up

After you’re done experimenting with the solution, we recommend cleaning up resources with the following steps to avoid unexpected costs:

  1. Shut down the SageMaker Studio apps for the user profiles.

The EMR cluster will be automatically deleted after the idle timeout value.

  1. Delete the Amazon Elastic File System (Amazon EFS) volume created for the domain.
  2. Empty the S3 buckets created by the CloudFormation stack.
  3. On the AWS CloudFormation console, delete the stack.

Conclusion

In this post, we used Apachi Hudi, one type of OTF tables, to demonstrate this new feature to enforce fine-grained access control on Amazon EMR. You can define granular permissions in Lake Formation for OTF tables and apply them via Spark SQL queries on EMR clusters. You also can use transactional data lake features such as running snapshot queries, incremental queries, time travel, and DML query. Please note that this new feature covers all OTF tables.

This feature is launched starting from Amazon EMR release 6.15 in all Regions where Amazon EMR is available. With the Amazon EMR integration with Lake Formation, you can confidently manage and process big data, unlocking insights and facilitating informed decision-making while upholding data security and governance.

To learn more, refer to Enable Lake Formation with Amazon EMR and feel free to contact your AWS Solutions Architects, who can be of assistance alongside your data journey.


About the Author

Raymond LaiRaymond Lai is a Senior Solutions Architect who specializes in catering to the needs of large enterprise customers. His expertise lies in assisting customers with migrating intricate enterprise systems and databases to AWS, constructing enterprise data warehousing and data lake platforms. Raymond excels in identifying and designing solutions for AI/ML use cases, and he has a particular focus on AWS Serverless solutions and Event Driven Architecture design.

Bin Wang, PhD, is a Senior Analytic Specialist Solutions Architect at AWS, boasting over 12 years of experience in the ML industry, with a particular focus on advertising. He possesses expertise in natural language processing (NLP), recommender systems, diverse ML algorithms, and ML operations. He is deeply passionate about applying ML/DL and big data techniques to solve real-world problems.

Aditya Shah is a Software Development Engineer at AWS. He is interested in Databases and Data warehouse engines and has worked on performance optimisations, security compliance and ACID compliance for engines like Apache Hive and Apache Spark.

Melody Yang is a Senior Big Data Solution Architect for Amazon EMR at AWS. She is an experienced analytics leader working with AWS customers to provide best practice guidance and technical advice in order to assist their success in data transformation. Her areas of interests are open-source frameworks and automation, data engineering and DataOps.

Meet the final cohort of AWS Heroes this year – November 2023

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/meet-the-final-cohort-of-aws-heroes-this-year-november-2023/

As 2023 comes to an end, we’re celebrating our final Heroes cohort launch of the year! These technical experts are passionate about helping their local communities build faster on AWS—they’re focused on sharing best practices, solving problems, and even more. We’re thrilled to have them join the AWS Heroes program, and recognizing them for their contributions to the greater AWS community.

Please meet our newest Heroes!

Emin Alemdar – Izmir, Turkey

Container Hero Emin Alemdar is a Solutions Architect at Spacelift where he produces solutions related to Kubernetes, Cloud technologies, and Cloud Native Transformation. In general, his work focuses on these services and technologies, and he shares best practices with the AWS community. Additionally, Emin is a CNCF Ambassador and is part of the HashiCorp Ambassador Program within the open source community.

Richard Fan – Hong Kong

Security Hero Richard Fan is a Security Engineer at ExpressVPN. He is dedicated to helping builders easily adopt AWS, and shares best practices around streamlines for cloud governance. Richard has also developed different tools to simplify the experience with AWS security services, such as his nitro-enclave-python-demo project, which helps builders get started on AWS Nitro Enclaves and has been adopted by some AWS workshops. Furthermore, Richard promotes the concept and use cases of enclave technology by partnering with multiple companies to review their AWS Nitro Enclaves offering.

Takuya Tachibana – Misawa, Japan

Community Hero Takuya Tachibana is the CEO of Heptagon inc. and Director of DigitalCube Co. Ltd. Since 2012, he has been contributing to the Japan AWS User Group (JAWS-UG), and he was the leader from 2016-2017 for all of the JAWS-UGs, representing and overseeing every chapter in Japan. Tachibana has been a speaker at over 100 community and cloud events throughout all of Japan, including AWS Summit Seoul, AWS Summit Beijing, and AWS Community Day APAC.

Learn More

To learn more about the AWS Heroes program or to connect with a Hero near you, please visit the AWS Heroes website.

Taylor

Announcing the latest AWS Heroes – June 2023

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/announcing-the-latest-aws-heroes-june-2023/

AWS Heroes dedicate their time to help others build better and faster on AWS. Heroes support and give back to the community in a variety of ways: contributing to open source projects, organizing AWS Community Days, speaking at conferences, leading workshops, mentoring builders, hosting meetups, and much more.

Please welcome and say hello to our newest AWS Heroes!

AJ Stuyvenberg – Boston, USA

Serverless Hero AJ Stuyvenberg is a Staff Engineer at Datadog, and has been a member of the serverless community since early 2017. His work focuses on serverless and distributed system observability. AJ is an open source author and maintains several projects, which improve the serverless developer experience. He has also spoken at multiple conferences, including AWS re:Invent and AWS Summits, and frequently writes about serverless topics on his blog.

Danielle Heberling – Hillsboro, USA

Serverless Hero Danielle Heberling is a software engineer with a background that includes being a musician, teaching at a K-8 public school, and working in technical support. She’s passionate about building things that make the world a better place, whether that be through social change or a good laugh. When she’s not coding or talking about serverless, you can often find her reaching back to her teaching roots by mentoring folks from underrepresented groups that would like to make a career switch into tech.

Dominik Grzywaczewski – Lublin, Poland

Community Hero Dominik Grzywaczewski is a Senior Cloud Site Reliability Engineer at Chaos Gears with more than 15 years of experience in IT. His primary objective is to assist companies in gaining a deeper understanding of Cloud Computing technologies, and effectively leveraging them to drive faster and more secure innovation. Dominik shares his passion by organizing technical meetups and workshops, and consistently collaborates with AWS community members. He also founded the AWS User Group in Lublin (Poland) and co-organizes the AWS Community Day conference in Warsaw (Poland).

Johannes Koch – Hessen, Germany

DevTools Hero Johannes Koch is a Sr. DevOps Engineer, Developer Experience, GTS at FICO where he contributes to the FICO®️ Platform. He shares his best practices related to Continuous Integration and Continuous Deployment (CI/CD) on his YouTube channel: cicdonaws. Johannes also founded the AWS User Group Bergstrasse, helped to start the AWS Community DACH Förderverein, and is part of the team that organizes the AWS Community Day in the DACH region.

Michael Walmsley – Melbourne, Australia

Serverless Hero Michael Walmsley is a Lead Technology Architect in the myWizard®️ Automation Group at Accenture, where he is focused on building event-driven products in the cloud. He is excited by the AWS Lambda Powertools open-source projects, and has been using and actively contributing to them since 2020. Michael is also a passionate AWS community member in Australia, supporting local meetups and conferences. He helps organize and run the AWS Programming and Tools Meetup in Melbourne, which focuses on running monthly hands-on training workshops that are open to everyone.

Mikey Fan – Beijing, China

Community Hero Mikey Fan is a Cloud-native Application Architect and SDN Developer. Since 2020, he has been actively exploring how to build innovative applications based on AWS EKS, Private 5G, and SD-WAN technology, and then applying them to 5G Edge Computing scenarios. Mikey is also a cloud-computing technology evangelist and an open-source enthusiast. He enjoys contributing code to open-source projects, such as Kubernetes and Tungsten Fabric, and he likes to demo how these open-source technologies can be combined with AWS cloud computing to create greater value.

Ran Isenberg – Kfar Saba, Israel

Serverless Hero Ran Isenberg is a principal software architect at CyberArk, where he designs and builds serverless services. He is passionate about CI/CD and AWS CDK, and has contributed several utilities to the AWS Lambda Powertools open-source project. Ran also maintains numerous serverless related open-source projects on his GitHub account, such as the AWS Lambda cookbook – a serverless service template that gets you started in the serverless world with all of the best practices in seconds.

Sabiha Ali – Dubai, United Arab Emirates

Community Hero Sabiha Ali is a Solutions Architect at ScaleCapacity. She specializes in Amazon Connect, architecting resilient and secure systems in the cloud. As an Amazon Connect Ambassador, she helps businesses enhance their customer experiences. Her unwavering passion for learning has earned her numerous AWS certifications (9X), solidifying her expertise in the field. She became an AWS User Group Leader in Dubai after starting out as an active AWS Community Builder. Sabiha is also committed to empowering women in the tech industry, making her a valued professional and an advocate for change.

Tomasz Dudek – Wroclaw, Poland

Machine Learning Hero Tomasz Dudek works as a Data & AI Team Lead and a Solutions Architect at Chaos Gears. He guides customers on how leveraging machine learning powered solutions can help their businesses thrive. He also designs AWS architectures and manages a data-focused team. Additionally, Tomasz co-organizes the AWS Community Day Poland, and as well as hosts the AWS User Group in his hometown Wroclaw. He often conducts workshops, such as SageMaker Immersion Days, speaks at conferences, and shares his knowledge in the form of short posts on LinkedIn, and longer ones on his blog, ‘MLOps and how you tame it.’

Wojciech Dąbrowski – Katowice, Poland

Community Hero Wojciech Dąbrowski is Head of Cloud Architecture at DTiQ, where he leads the team responsible for the architecture of cloud solutions and the cloud adaptation strategy in the organization. He has been an AWS User Group Silesia leader since 2019, and has managed to organize multiple online and offline meetups. In addition, Wojciech leads workshops and presents cloud computing and software engineering topics at various events.

Learn More

If you’d like to learn more about the new Heroes or connect with a Hero near you, please visit the AWS Heroes website or browse the AWS Heroes Content Library.

Taylor

A Guide to Maintaining a Healthy Email Database

Post Syndicated from nnatri original https://aws.amazon.com/blogs/messaging-and-targeting/guide-to-maintaining-healthy-email-database/

Introduction

In the digital age, email remains a powerful tool for businesses to communicate with their customers. Whether it’s for marketing campaigns, customer service updates, or important announcements, a well-maintained email database is crucial for ensuring that your messages reach their intended recipients. However, managing an email database is not just about storing email addresses. It involves keeping the database healthy, which means it’s up-to-date, accurate, and filled with engaged subscribers.

Amazon Simple Email Service (SES) offers robust features that help businesses manage their email environments effectively. Trusted by customers such as Amazon.com, Netflix, Duolingo and Reddit, SES helps customers deliver high-volume email campaigns of hundreds of billions of emails per year. Introduced in 2020, the list and subscription management feature of Amazon SES has added a new dimension to email database management, thereby reducing effort and time-to-value of managing a subscription list by allowing you to manage your list of contacts via its REST API, SDK or AWS CLI.

In this blog post, we will delve into the world of email database management in Amazon SES. You will explore two ways to manage your email database: building out your own email database functionality and using the built-in list and subscription management service. You will also learn the pros and cons of each approach and provide examples of customer use cases that would benefit from each approach. Regardless of the approach you ultimately decide to take, the blog will also share updated strategies for email database management to help with improving deliverability and customer engagement.

This guide is designed to help you navigate the complexities of email database management and make informed decisions that best suit your business needs. So, whether you’re new to Amazon SES or looking to optimize your existing email database management practices, this guide is for you. Let’s get started!

Email Database Management in Amazon SES

Amazon Simple Email Service (SES) offers two primary ways to manage your email database: building out your own email database functionality and using the built-in list and subscription management service. Each approach has its own set of advantages and potential drawbacks, and the best choice depends on your specific use case and business needs.

Building Out Your Email Database Functionality

When you choose to build out your own email database functionality, you have the flexibility to customize the database to suit your specific needs and leverage SES’ scalability as an email channel to send email at high volumes to your customer. Depending on the business requirement, the customizations could involve creating custom fields for subscriber data, implementing complex logic for categorizing and segmenting users, or integrating with other systems in your tech stack.

Using the Built-in List and Subscription Management Service

Alternatively, you can look at Amazon SES’s built-in list and subscription management service, which offers a ready-made solution for managing your email database. It handles tasks such as managing subscriptions to different topics and maintaining your customer email database through contact lists. Additionally, you can insert up to two links per email to the subscription preference page, which allow users to manage their topic preferences within Amazon SES.

SubscriptionPage

The non-configurable subscription page will automatically populate the customer’s current subscribed topic and allow setting of granular topic’s preferences. More information on how to configure that can be found here.

The following table should serve as a guideline to help you with deciding your approach for Email Database Management.
Building Your Own Email Database Functionality Using Built-in List and Subscription Management Service
Pros

Customization: Full control over the database structure and functionality, allowing for tailoring to specific needs. This includes creating custom fields for subscriber data, implementing own algorithms for handling bounces and complaints, and integrating with other systems in the tech stack.

Integration: Flexible flow of data across the business due to the ability to integrate the email database with other systems in the tech stack. You’ve already built your own email database or have one in mind which supports querying, building that database external to Amazon SES would make for a more customizable implementation.

Data Ownership: When you manage your own database, you have full ownership and control over your data. This can be important for businesses with strict data governance or regulatory requirements.

Ease of Use: The built-in service provides readily-available API to create, update and delete contacts. These operations are also available via REST API, AWS CLI and SDK. Once you’ve set up the subscription topics and contact lists, you can leverage the preference center to allow your customers to easily sub/unsubscribe from different topics.

Cost-Effective: More cost-effective than building own functionality as it requires less time and resources. The built-in service is also available free of charge unlike building out own infrastructure which would require ongoing infrastructure service costs.

Cons

Time and Resources: Building your own email database functionality requires a significant investment of time and resources. This includes the initial setup of the database, designing the schema, setting up the servers, and configuring the database software. Additionally, you’ll need to develop the functionality for managing subscriptions, and database cleanup in upon receiving bounces and complaints. Databases require ongoing maintenance to ensure they remain operational and efficient. This includes tasks like updating the database software, managing backups, optimizing queries, and scaling the database as your subscriber base grows.

Complexity: As your subscriber base grows, managing your own email database can become increasingly complex. You’ll need to handle more data, which can slow down queries and make the database more difficult to manage. You’ll also need to deal with more complex issues like data integrity, redundancy, and normalization. Additionally, as you add more features to your email database functionality, the codebase can become more complex, making it harder to maintain and debug.

Security: When you manage your own email database, you’re responsible for its security. This includes protecting the data from unauthorized access, ensuring the confidentiality of your subscribers’ information, and complying with data protection regulations. You’ll need to implement security measures like encryption, access controls, and regular security audits. If your database is compromised, it could lead to data loss or a breach of your subscribers’ privacy, which could damage your reputation and potentially lead to legal consequences.

Limited Customization: The built-in service may not offer the same level of customization as building own functionality. It may not meet all needs if there are specific requirements. For example, the preference center management page cannot be customized.

Dependence: Using the built-in service means you’re reliant on Amazon SES for your email database management. If the service experiences downtime or issues, it could impact your ability to manage your email database. This could potentially disrupt your email campaigns and affect your relationship with your subscribers. Furthermore, if you decide to switch to a different email service provider in the future, migrating your email database from the built-in service could be a complex and time-consuming process. Additionally, if your email database needs to be accessed or manipulated by other systems in your tech stack, this dependency on Amazon SES could complicate the integration process and limit your flexibility.

Customer Use Cases Best suited for businesses with specific needs that aren’t met by standard list management services, or those who wish to integrate their email database with other systems. For example, a large e-commerce company might choose to build out their own email database functionality to integrate with their customer relationship management (CRM) and inventory systems. Ideal for small to medium-sized businesses that need a straightforward, cost-effective solution for managing their email database. It’s also a good fit for businesses without the resources or technical expertise to build their own email database functionality.

Strategies for Email Database Management with Amazon Simple Email Service

Once you’ve made the decision on whether to manage your email database within Amazon SES or build your own, that’s only half of the equation. It’s important to recognize that your email databases will only work best to serve the business needs when you have processes in place to maintain them. In this section, let’s go through some of the best practices on how to do so.

  • Maintaining email list hygiene:
    • Both Amazon SES and a custom-built email database require maintaining a healthy email list. This involves regularly cleaning your list to remove invalid email addresses, hard bounces, and unengaged subscribers. With Amazon SES, the process to handle hard bounces and complaints is automated.
    • With a custom-built email database, you have more control over how and when this cleaning occurs. Rather than focusing on only email addresses that either hard bounces or complained, you can remove unengaged users. Every business will have their own definition of an un-engaged users based on business needs. Regardless, you will need to store the engagement attribute (e.g. days since last interaction). This will be simpler to architect in an external database which supports querying and bulk modification.
  • Managing Subscriptions:
    • With Amazon SES, you can easily manage subscriptions using the built-in functionality. This includes adding new subscribers, removing unsubscribed users, and updating user topic preferences. However, you will not be able to customize the look-and-feel of your subscription preference pages.
    • If you build your own email database, you’ll need to create your own system for managing subscriptions, which could require significant time and resources. The trade-off is that you can fully customize your subscription management system to showcase your branding on the subscription preference page and also handle custom logic for subscription/unsubscription.
  • Encouraging Engagement: Low engagement rates can indicate that your recipients are not interested in your content. To stimulate action, you can include a survey in the email, ask for feedback, or run a giveaway. You can then filter out inactive subscribers who still aren’t interacting with your emails. For engaged subscribers, you can segment these audiences into sub-groups by preference and send tailored email marketing campaigns. Before removing less active subscribers, consider what other kinds of content you could provide that might be more appealing. Unengaged subscribers can sometimes be re-engaged with the right offer, such as a free gift, a special perk, or exclusive content.
  • Renewing Opt-In: For your disengaged subscribers, send a re-optin campaign and remove them if they don’t re-subscribe. Be transparent! Notify inactive subscribers that you’ve noticed their lack of engagement and let them know that you don’t want to clutter their inbox if they’re not interested. Ask them if they want to continue to receive emails with a clear call-to-action button that will re-sign them up for future emails.
  • Making It Easy to Unsubscribe: Including an easy-to-find unsubscribe button and a one-step opt-out process won’t encourage subscribers to leave if you’re giving them a reason to stay. If recipients feel like they can’t leave, they’ll just mark your emails as spam, which counts as a big strike against your sender reputation.

Remember, effective email database management is a continuous process that requires regular attention and maintenance. By following these best practices, you can maximize the effectiveness of your email marketing efforts and build strong relationships with your subscribers.

Conclusion

In conclusion, maintaining a healthy email database is a critical aspect of successful email marketing. Whether you choose to build out your own email database functionality or use Amazon SES’s built-in list and subscription management service, it’s important to understand the pros and cons of each approach and align your decision with your business needs.

Building your own email database functionality offers the advantage of customization and integration with other systems in your tech stack. However, it requires significant time, resources, and technical expertise. On the other hand, Amazon SES’s built-in service is easy to use, cost-effective, and handles many complexities of email database management, but it may not offer the same level of customization.

Regardless of the approach you choose, following best practices for email database management is essential. This includes handling bounces and complaints, managing subscriptions, encouraging engagement, sending re-engagement email campaigns, renewing opt-ins, and making it easy to unsubscribe.

These practices will help you maintain a healthy email list, improve engagement rates, and ultimately, enhance the effectiveness of your email marketing efforts.It’s important to stay updated with the latest trends and strategies in email database management. So, keep exploring, learning, and implementing the best practices that suit your business needs.

For more information on Amazon SES and its features, visit the Amazon SES Documentation. Here, you’ll find comprehensive guides, tutorials, and API references to help you make the most of Amazon SES.

Introducing the Enhanced Document API for DynamoDB in the AWS SDK for Java 2.x

Post Syndicated from John Viegas original https://aws.amazon.com/blogs/devops/introducing-the-enhanced-document-api-for-dynamodb-in-the-aws-sdk-for-java-2-x/

We are excited to announce that the AWS SDK for Java 2.x now offers the Enhanced Document API for DynamoDB, providing an enhanced way of working with Amazon DynamoDb items.
This post covers using the Enhanced Document API for DynamoDB with the DynamoDB Enhanced Client. By using the Enhanced Document API, you can create an EnhancedDocument instance to represent an item with no fixed schema, and then use the DynamoDB Enhanced Client to read and write to DynamoDB.
Furthermore, unlike the Document APIs of aws-sdk-java 1.x, which provided arguments and return types that were not type-safe, the EnhancedDocument provides strongly-typed APIs for working with documents. This interface simplifies the development process and ensures that the data is correctly typed.

Prerequisites:

Before getting started, ensure you are using an up-to-date version of the AWS Java SDK dependency with all the latest released bug-fixes and features. For Enhanced Document API support, you must use version 2.20.33 or later. See our “Set up an Apache Maven project” guide for details on how to manage the AWS Java SDK dependency in your project.

Add dependency for dynamodb-enhanced in pom.xml.

<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>dynamodb-enhanced</artifactId>
<version>2.20.33</version>
</dependency>

Quick walk-through for using Enhanced Document API to interact with DDB

Step 1 : Create a DynamoDB Enhanced Client

Create an instance of the DynamoDbEnhancedClient class, which provides a high-level interface for Amazon DynamoDB that simplifies working with DynamoDB tables.

DynamoDbEnhancedClient enhancedClient = DynamoDbEnhancedClient.builder()
                                               .dynamoDbClient(DynamoDbClient.create())
                                               .build();

Step 2 : Create a DynamoDbTable resource object with Document table schema

To execute commands against a DynamoDB table using the Enhanced Document API, you must associate the table with your Document table schema to create a DynamoDbTable resource object. The Document table schema builder requires the primary index key and attribute converter providers. Use AttributeConverterProvider.defaultProvider() to convert document attributes of default types. An optional secondary index key can be added to the builder.


DynamoDbTable<EnhancedDocument> documentTable = enhancedClient.table("my_table",
                                              TableSchema.documentSchemaBuilder()
                                                         .addIndexPartitionKey(TableMetadata.primaryIndexName(),"hashKey", AttributeValueType.S)
                                                         .addIndexSortKey(TableMetadata.primaryIndexName(), "sortKey", AttributeValueType.N)
                                                         .attributeConverterProviders(AttributeConverterProvider.defaultProvider())
                                                         .build());
                                                         
// call documentTable.createTable() if "my_table" does not exist in DynamoDB

Step 3 : Write a DynamoDB item using an EnhancedDocument

The EnhancedDocument class has static factory methods along with a builder method to add attributes to a document. The following snippet demonstrates the type safety provided by EnhancedDocument when you construct a document item.

EnhancedDocument simpleDoc = EnhancedDocument.builder()
 .attributeConverterProviders(defaultProvider())
 .putString("hashKey", "sampleHash")
 .putNull("nullKey")
 .putNumber("sortKey", 1.0)
 .putBytes("byte", SdkBytes.fromUtf8String("a"))
 .putBoolean("booleanKey", true)
 .build();
 
documentTable.putItem(simpleDoc);

Step 4 : Read a Dynamo DB item as an EnhancedDocument

Attributes of the Documents retrieved from a DynamoDB table can be accessed with getter methods

EnhancedDocument docGetItem = documentTable.getItem(r -> r.key(k -> k.partitionValue("samppleHash").sortValue(1)));

docGetItem.getString("hashKey");
docGetItem.isNull("nullKey")
docGetItem.getNumber("sortKey").floatValue();
docGetItem.getBytes("byte");
docGetItem.getBoolean("booleanKey"); 

AttributeConverterProviders for accessing document attributes as custom objects

You can provide a custom AttributeConverterProvider instance to an EnhancedDocument to convert document attributes to a specific object type.
These providers can be set on either DocumentTableSchema or EnhancedDocument to read or write attributes as custom objects.

TableSchema.documentSchemaBuilder()
           .attributeConverterProviders(CustomClassConverterProvider.create(), defaultProvider())
           .build();
    
// Insert a custom class instance into an EnhancedDocument as attribute 'customMapOfAttribute'.
EnhancedDocument customAttributeDocument =
EnhancedDocument.builder().put("customMapOfAttribute", customClassInstance, CustomClass.class).build();

// Retrieve attribute 'customMapOfAttribute' as CustomClass object.
CustomClass customClassObject = customAttributeDocument.get("customMapOfAttribute", CustomClass.class);

Convert Documents to JSON and vice-versa

The Enhanced Document API allows you to convert a JSON string to an EnhancedDocument and vice-versa.

// Enhanced document created from JSON string using defaultConverterProviders.
EnhancedDocument documentFromJson = EnhancedDocument.fromJson("{\"key\": \"Value\"}")
                                              
// Converting an EnhancedDocument to JSON string "{\"key\": \"Value\"}"                                                 
String jsonFromDocument = documentFromJson.toJson();

Define a Custom Attribute Converter Provider

Custom attribute converter providers are implementations of AttributeConverterProvider that provide converters for custom classes.
Below is an example for a CustomClassForDocumentAPI which has as a single field stringAttribute of type String and its corresponding AttributeConverterProvider implementation.

public class CustomClassForDocumentAPI {
    private final String stringAttribute;

    public CustomClassForDocumentAPI(Builder builder) {
        this.stringAttribute = builder.stringAttribute;
    }
    public static Builder builder() {
        return new Builder();
    }
    public String stringAttribute() {
        return stringAttribute;
    }
    public static final class Builder {
        private String stringAttribute;
        private Builder() {
        }
        public Builder stringAttribute(String stringAttribute) {
            this.stringAttribute = string;
            return this;
        }
        public CustomClassForDocumentAPI build() {
            return new CustomClassForDocumentAPI(this);
        }
    }
}
import java.util.Map;
import software.amazon.awssdk.enhanced.dynamodb.AttributeConverter;
import software.amazon.awssdk.enhanced.dynamodb.AttributeConverterProvider;
import software.amazon.awssdk.enhanced.dynamodb.EnhancedType;
import software.amazon.awssdk.utils.ImmutableMap;

public class CustomAttributeForDocumentConverterProvider implements AttributeConverterProvider {
    private final Map<EnhancedType<?>, AttributeConverter<?>> converterCache = ImmutableMap.of(
        EnhancedType.of(CustomClassForDocumentAPI.class), new CustomClassForDocumentAttributeConverter());
        // Different types of converters can be added to this map.

    public static CustomAttributeForDocumentConverterProvider create() {
        return new CustomAttributeForDocumentConverterProvider();
    }

    @Override
    public <T> AttributeConverter<T> converterFor(EnhancedType<T> enhancedType) {
        return (AttributeConverter<T>) converterCache.get(enhancedType);
    }
}

A custom attribute converter is an implementation of AttributeConverter that converts a custom classes to and from a map of attribute values, as shown below.

import java.util.LinkedHashMap;
import java.util.Map;
import software.amazon.awssdk.enhanced.dynamodb.AttributeConverter;
import software.amazon.awssdk.enhanced.dynamodb.AttributeValueType;
import software.amazon.awssdk.enhanced.dynamodb.EnhancedType;
import software.amazon.awssdk.enhanced.dynamodb.internal.converter.attribute.EnhancedAttributeValue;
import software.amazon.awssdk.enhanced.dynamodb.internal.converter.attribute.StringAttributeConverter;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;

public class CustomClassForDocumentAttributeConverter implements AttributeConverter<CustomClassForDocumentAPI> {
    public static CustomClassForDocumentAttributeConverter create() {
        return new CustomClassForDocumentAttributeConverter();
    }
    @Override
    public AttributeValue transformFrom(CustomClassForDocumentAPI input) {
        Map<String, AttributeValue> attributeValueMap = new LinkedHashMap<>();
        if(input.string() != null){
            attributeValueMap.put("stringAttribute", AttributeValue.fromS(input.string()));
        }
        return EnhancedAttributeValue.fromMap(attributeValueMap).toAttributeValue();
    }

    @Override
    public CustomClassForDocumentAPI transformTo(AttributeValue input) {
        Map<String, AttributeValue> customAttr = input.m();
        CustomClassForDocumentAPI.Builder builder = CustomClassForDocumentAPI.builder();
        if (customAttr.get("stringAttribute") != null) {
            builder.stringAttribute(StringAttributeConverter.create().transformTo(customAttr.get("stringAttribute")));
        }
        return builder.build();
    }
    @Override
    public EnhancedType<CustomClassForDocumentAPI> type() {
        return EnhancedType.of(CustomClassForDocumentAPI.class);
    }
    @Override
    public AttributeValueType attributeValueType() {
        return AttributeValueType.M;
    }
}

Attribute Converter Provider for EnhancedDocument Builder

When working outside of a DynamoDB table context, make sure to set the attribute converter providers explicitly on the EnhancedDocument builder. When used within a DynamoDB table context, the table schema’s converter provider will be used automatically for the EnhancedDocument.
The code snippet below shows how to set an AttributeConverterProvider using the EnhancedDocument builder method.

// Enhanced document created from JSON string using custom AttributeConverterProvider.
EnhancedDocument documentFromJson = EnhancedDocument.builder()
                                                    .attributeConverterProviders(CustomClassConverterProvider.create())
                                                    .json("{\"key\": \"Values\"}")
                                                    .build();
                                                    
CustomClassForDocumentAPI customClass = documentFromJson.get("key", CustomClassForDocumentAPI.class)

Conclusion

In this blog post we showed you how to set up and begin using the Enhanced Document API with the DynamoDB Enhanced Client and standalone with the EnhancedDocument class. The enhanced client is open-source and resides in the same repository as the AWS SDK for Java 2.0.
We hope you’ll find this new feature useful. You can always share your feedback on our GitHub issues page.

Meet the Newest AWS Heroes – March 2023

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/meet-the-newest-aws-heroes-march-2023/

The AWS Heroes are passionate AWS experts who are dedicated to sharing their in-depth knowledge within the community. They inspire, uplift, and motivate the global AWS community, and today, we’re excited to announce and recognize the newest Heroes in 2023!

Aidan Steele – Melbourne, Australia

Serverless Hero Aidan Steele is a Senior Engineer at Nightvision. He is an avid AWS user, and has been using the first platform and EC2 since 2008. Fifteen years later, EC2 still has a special place in his heart, but his interests are in containers and serverless functions, and blurring the distinction between them wherever possible. He enjoys finding novel uses for AWS services, especially when they have a security or network focus. This is best demonstrated through his open source contributions on GitHub, where he shares interesting use cases via hands-on projects.

Ananda Dwi Rahmawati – Yogyakarta, Indonesia

Container Hero Ananda Dwi Rahmawati is a Sr. Cloud Infrastructure Engineer, specializing in system integration between cloud infrastructure, CI/CD workflows, and application modernization. She implements solutions using powerful services provided by AWS, such as Amazon Elastic Kubernetes Service (EKS), combined with open source tools to achieve the goal of creating reliable, highly available, and scalable systems. She is a regular technical speaker who delivers presentations using real-world case studies at several local community meetups and conferences, such as Kubernetes and OpenInfra Days Indonesia, AWS Community Day Indonesia, AWS Summit ASEAN, and many more.

Wendy Wong – Sydney, Australia

Data Hero Wendy Wong is a Business Performance Analyst at Service NSW, building data pipelines with AWS Analytics and agile projects in AI. As a teacher at heart, she enjoys sharing her passion as a Data Analytics Lead Instructor for General Assembly Sydney, writing technical blogs on dev.to. She is both an active speaker for AWS analytics and an advocate of diversity and inclusion, presenting at a number of events: AWS User Group Malaysia, Women Who Code, AWS Summit Australia 2022, AWS BuildHers, AWS Innovate Modern Applications, and many more.

Learn More

If you’d like to learn more about the new Heroes or connect with a Hero near you, please visit the AWS Heroes website or browse the AWS Heroes Content Library.

Taylor

Enable federation to Amazon QuickSight accounts with Ping One

Post Syndicated from Srikanth Baheti original https://aws.amazon.com/blogs/big-data/enable-federation-to-amazon-quicksight-accounts-with-ping-one/

Amazon QuickSight is a scalable, serverless, embeddable, machine learning (ML)-powered business intelligence (BI) service built for the cloud that supports identity federation in both Standard and Enterprise editions. Organizations are working towards centralizing their identity and access strategy across all of their applications, including on-premises, third-party, and applications on AWS. Many organizations use Ping One to control and manage user authentication and authorization centrally. If your organization uses Ping One for cloud applications, you can enable federation to all of your QuickSight accounts without needing to create and manage users in QuickSight. This authorizes users to access QuickSight assets—analyses, dashboards, folders, and datasets—through centrally managed Ping One.

In this post, we go through the steps to configure federated single sign-on (SSO) between a Ping One instance and a QuickSight account. We demonstrate registering an SSO application in Ping One, creating groups, and mapping to an AWS Identity and Access Management (IAM) role that translates to QuickSight user license types (admin, author, and reader). These QuickSight roles represent three different personas supported in QuickSight. Administrators can publish the QuickSight app in Ping One to enable users to perform SSO to QuickSight using their Ping credentials.

Prerequisites

To complete this walkthrough, you must have the following prerequisites:

  • A Ping One subscription
  • One or more QuickSight account subscriptions

Solution overview

The walkthrough includes the following steps:

  1. Create groups in Ping One for each of the QuickSight user license types.
  2. Register an AWS application in Ping One.
  3. Add Ping One as your SAML identity provider (IdP) in AWS.
  4. Configure an IAM policy.
  5. Configure an IAM role.
  6. Configure your AWS application in Ping One.
  7. Test the application from Ping One.

Create groups in Ping One for each of the QuickSight roles

To create groups in Ping One, complete the following steps:

  1. Sign in to the Ping One portal using an administrator account.
  2. Under Identities, choose Groups.
  3. Choose the plus sign to add a group.
    BDB-2210-Ping-Groups
  4. For Group Name, enter QuickSightReaders.
  5. Choose Save.
    BDB-2210-Ping-Groups-Save
  6. Repeat these steps to create the groups QuickSightAdmins and QuickSightAuthors.

Register an AWS application in Ping One

To configure the integration of an AWS application in Ping One, you need to add AWS to your list of managed software as a service (SaaS) apps.

  1. Sign in to the Ping One portal using an administrator account.
  2. Under Connections, choose Application Catalog.
  3. In the search box, enter amazon web services.
  4. Choose Amazon Web Services – AWS from the results to add the application.  BDB-2210-Ping-AWS-APP
  5. For Name, enter Amazon QuickSight.
  6. Choose Next.
    BDB-2210-Ping-AWS-SAVEUnder Map Attributes, there should be four attributes.
  7. Delete the attribute related to SessionDuration.
  8. Choose Username as the value for all the remaining attributes for now.
    We update these values in later steps.
  9. Choose Next.
    BDB-2210-Ping-AWS-Attributes
  10. In the Select Groups section, add the QuickSightAdmins, QuickSightAuthors, and QuickSightReaders groups you created.
  11. Choose Save.
    BDB-2210-Ping-AWS-Attributes-Save
  12. After the application is created, choose the application again and download the federation metadata XML.

You use this in the next step.
BDB-2210-Ping-AWS-Metadata

Add Ping One as your SAML IdP in AWS

To configure Ping One as your SAML IdP, complete the following steps:

  1. Open a new tab in your browser.
  2. Sign in to the IAM console in your AWS account with admin permissions.
  3. On the IAM console, under Access Management in the navigation pane, choose Identity providers.
  4. Choose Add provider.
    BDB-2210-Ping-AWS-IAM
  5. For Provider name, enter PingOne.
  6. Choose file to upload the metadata document you downloaded earlier.
  7. Choose Add provider.
  8. In the banner message that appears, choose View provider.
  9. Copy the IdP ARN to use in a later step.
    BDB-2210-Ping-AWS-IAM_ARN

Configure an IAM policy

In this step, you create an IAM policy to map three different roles with permissions in QuickSight.

Use the following steps to set up QuickSightUserCreationPolicy. This policy grants privileges in QuickSight to the federated user based on the assigned groups in Ping One.

  1. On the IAM console, choose Policies.
  2. Choose Create policy.
  3. On the JSON tab, replace the existing text with the following code:
    {
       "Version": "2012-10-17",
        "Statement": [ 
             {  
                "Sid": "VisualEditor0", 
                 "Effect": "Allow", 
                 "Action": "quicksight:CreateAdmin", 
                 "Resource": "*", 
                 "Condition": { 
                     "StringEquals": { 
                         "aws:PrincipalTag/user-role": "QuickSightAdmins" 
     
                    } 
                 } 
             }, 
             { 
                 "Sid": "VisualEditor1", 
                 "Effect": "Allow", 
                 "Action": "quicksight:CreateUser", 
                 "Resource": "*", 
                 "Condition": { 
                     "StringEquals": { 
                         "aws:PrincipalTag/user-role": "QuickSightAuthors" 
                     } 
                 } 
             }, 
             { 
                 "Sid": "VisualEditor2", 
                 "Effect": "Allow", 
                 "Action": "quicksight:CreateReader", 
                 "Resource": "*", 
                 "Condition": { 
                     "StringEquals": { 
                         "aws:PrincipalTag/user-role": "QuickSightReaders" 
                     } 
                 } 
             } 
         ] 
     } 
  4. Choose Review policy.
    BDB-2210-AWS-IAM-Policy
  5. For Name, enter QuickSightUserCreationPolicy.
    BDB-2210-AWS-IAM-Policy-Save
  6. Choose Create policy.

Configure an IAM role

Next, create the role that Ping One users assume when federating into QuickSight. Use the following steps to set up the federated role:

  1. On the IAM console, choose Roles.
  2. Choose Create role.
  3. For Trusted entity type, select SAML 2.0 federation.
  4. For SAML 2.0-based provider, choose the provider you created earlier (PingOne).
  5. Select Allow programmatic and AWS Management Console access.
  6. For Attribute, choose SAML:aud.
  7. For Value, enter https://signin.aws.amazon.com/saml.
  8. Choose Next.
    BDB-2210-Ping-IAM-Role
  9. Under Permissions policies, select the QuickSightUserCreationPolicy IAM policy you created in the previous step.
  10. Choose Next.
    BDB-2210-Ping-IAM-Role_Permissions
  11. For Role name, enter QSPingOneFederationRole.
    DBD-2210-PingOne-IAM-Role-Name
  12. Choose Create role.
  13. On the IAM console, in the navigation pane, choose Roles.
  14. Choose the QSPingOneFederationRole role you created to open the role’s properties.
  15. Copy the role ARN to use in later steps.
  16. On the Trust relationships tab, under Trusted entities, verify that the IdP you created is listed.
  17. Under Condition in the policy code, verify that SAML:aud with a value of https://signin.aws.amazon.com/saml is present.
  18. Choose Edit trust policy to add an additional condition.
    DBD-2210-PingOne-IAM-TrustPolicy
  19. Under Condition, add the following code:
    "StringLike": {
    "aws:RequestTag/user-role": "*"
    }

  20. Under Action, add the following code:
      "sts:TagSession"

    BDB-2210-PingOne-Role-Save

  21. Choose Update policy to save changes.

Configure an AWS application in Ping One

To configure your AWS application, complete the following steps:

  1. Sign in to the Ping One portal using a Ping One administrator account.
  2. Under Connections, choose Application.
  3. Choose the Amazon QuickSight application you created earlier.
  4. On the Profile tab, choose Enable Advanced ConfigurationBDB-2210-Ping-AdvancedConfig
  5. Choose Enable in the pop-up window.
    BDB-2210-Ping-AdvancedConfig1
  6. On the Configuration tab, choose the pencil icon to edit the configuration.
    BDB-2210-Ping-AdvancedConfig2
  7. Under SIGNING KEY, select Sign Assertion & Response.
    BDB-2210-Ping-AdvancedConfig4
  8. Under SLO BINDING, for Assertion Validity Duration In Seconds, enter a duration, such as 900.
  9. For Target Application URL, enter https://quicksight.aws.amazon.com/.
  10. Choose Save.
    BDB-2210-Ping-AdvancedConfig5On the Attribute Mappings tab, you now add or update the attributes as in the following table.
Attribute Name Value
saml_subject Username
https://aws.amazon.com/SAML/Attributes/RoleSessionName Username
https://aws.amazon.com/SAML/Attributes/Role ‘arn:aws:iam::xxxxxxxxxx:role/QSPingOneFederationRole,
arn:aws:iam::xxxxxxxxxx:saml-provider/PingOne’
https://aws.amazon.com/SAML/Attributes/PrincipalTag:user-role user.memberOfGroupNames[0]
  1. Enter https://aws.amazon.com/SAML/Attributes/PrincipalTag:user-role for the attribute name and use the corresponding value from the table for the expression.
  2. Choose Save.
  3. If you have more than one QuickSight user role (for this post, QuickSightAdmins, QuicksightAuthors, and QuickSightReaders), you can add all the appropriate role names as follows:
    #data.containsAny(user.memberOfGroupNames,{'QuickSightAdmins'})? 'QuickSightAdmins' : 
    
    #data.containsAny(user.memberOfGroupNames,{'QuickSightAuthorss'}) ? 'QuickSightAuthors' : 
    
    #data.containsAny(user.memberOfGroupNames,{'QuickSightReaders'}) ?'QuickSightReaders' : null

  4. To edit the role attribute, choose the gear icon next to the role.
  5. Populate the corresponding expression from the table and choose Save.

The format of the expression is the role ARN (copied in the role creation step) followed by the IdP ARN (copied in the IdP creation step) separated by a comma.

Test the application

In this section, you test your Ping One SSO configuration by using a Microsoft application.

  1. In the Ping One portal, under Identities, choose Groups.
  2. Choose a group and choose Add Users Individually.
  3. From the list of users, add the appropriate users to the group by choosing the plus sign.
  4. Choose Save.
  5. To test the connectivity, under Environment, choose Properties, then copy the URL under APPLICATION PORTAL URL.
  6. Browse to the URL in a private browsing window.
  7. Enter your user credentials and choose Sign On.
    Upon a successful sign-in, you’re redirected to the All Applications page with a new application called Amazon QuickSight.
  8. Choose the Amazon QuickSight application to be redirected to the QuickSight console.

Note in the following screenshot that the user name at the top of the page shows as the Ping One federated user.

Summary

This post provided step-by-step instructions to configure federated SSO between Ping One and the QuickSight console. We also discussed how to create policies and roles in IAM and map groups in Ping One to IAM roles for secure access to the QuickSight console.

For additional discussions and help getting answers to your questions, check out the QuickSight Community.


About the authors

Srikanth Baheti is a Specialized World Wide Sr. Solution Architect for Amazon QuickSight. He started his career as a consultant and worked for multiple private and government organizations. Later he worked for PerkinElmer Health and Sciences & eResearch Technology Inc, where he was responsible for designing and developing high traffic web applications, highly scalable and maintainable data pipelines for reporting platforms using AWS services and Serverless computing.

Raji Sivasubramaniam is a Sr. Solutions Architect at AWS, focusing on Analytics. Raji is specialized in architecting end-to-end Enterprise Data Management, Business Intelligence and Analytics solutions for Fortune 500 and Fortune 100 companies across the globe. She has in-depth experience in integrated healthcare data and analytics with wide variety of healthcare datasets including managed market, physician targeting and patient analytics.

Raj Jayaraman is a Senior Specialist Solutions Architect for Amazon QuickSight. Raj focuses on helping customers develop sample dashboards, embed analytics and adopt BI design patterns and best practices.

Introducing Amazon Simple Queue Service dead-letter queue redrive to source queues

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-amazon-simple-queue-service-dead-letter-queue-redrive-to-source-queues/

This blog post is written by Mark Richman, a Senior Solutions Architect for SMB.

Today AWS is launching a new capability to enhance the dead-letter queue (DLQ) management experience for Amazon Simple Queue Service (SQS). DLQ redrive to source queues allows SQS to manage the lifecycle of unconsumed messages stored in DLQs.

SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. Using Amazon SQS, you can send, store, and receive messages between software components at any volume without losing messages or requiring other services to be available.

To use SQS, a producer sends messages to an SQS queue, and a consumer pulls the messages from the queue. Sometimes, messages can’t be processed due to a number of possible issues. These can include logic errors in consumers that cause message processing to fail, network connectivity issues, or downstream service failures. This can result in unconsumed messages remaining in the queue.

Understanding SQS dead-letter queues (DLQs)

SQS allows you to manage the life cycle of the unconsumed messages using dead-letter queues (DLQs).

A DLQ is a separate SQS queue that one or many source queues can send messages that can’t be processed or consumed. DLQs allow you to debug your application by letting you isolate messages that can’t be processed correctly to determine why their processing didn’t succeed. Use a DLQ to handle message consumption failures gracefully.

When you create a source queue, you can specify a DLQ and the condition under which SQS moves messages from the source queue to the DLQ. This is called the redrive policy. The redrive policy condition specifies the maxReceiveCount. When a producer places messages on an SQS queue, the ReceiveCount tracks the number of times a consumer tries to process the message. When the ReceiveCount for a message exceeds the maxReceiveCount for a queue, SQS moves the message to the DLQ. The original message ID is retained.

For example, a source queue has a redrive policy with maxReceiveCount set to 5. If the consumer of the source queue receives a message 6, without successfully consuming it, SQS moves the message to the dead-letter queue.

You can configure an alarm to alert you when any messages are delivered to a DLQ. You can then examine logs for exceptions that might have caused them to be delivered to the DLQ. You can analyze the message contents to diagnose consumer application issues. Once the issue has been resolved and the consumer application recovers, these messages can be redriven from the DLQ back to the source queue to process them successfully.

Previously, this required dedicated operational cycles to review and redrive these messages back to their source queue.

DLQ redrive to source queues

DLQ redrive to source queues enables SQS to manage the second part of the lifecycle of unconsumed messages that are stored in DLQs. Once the consumer application is available to consume the failed messages, you can now redrive the messages from the DLQ back to the source queue. You can optionally review a sample of the available messages in the DLQ. You redrive the messages using the Amazon SQS console. This allows you to more easily recover from application failures.

Using redrive to source queues

To show how to use the new functionality there is an existing standard source SQS queue called MySourceQueue.

SQS does not create DLQs automatically. You must first create an SQS queue and then use it as a DLQ. The DLQ must be in the same region as the source queue.

Create DLQ

  1. Navigate to the SQS Management Console and create a standard SQS queue for the DLQ called MyDLQ. Use the default configuration. Refer to the SQS documentation for instructions on creating a queue.
  2. Navigate to MySourceQueue and choose Edit.
  3. Navigate to the Dead-letter queue section and choose Enabled.
  4. Select the Amazon Resource Name (ARN) of the MyDLQ queue you created previously.
  5. You can configure the number of times that a message can be received before being sent to a DLQ by setting Set Maximum receives to a value between 1 and 1,000. For this demo enter a value of 1 to immediately drive messages to the DLQ.
  6. Choose Save.
Configure source queue with DLQ

Configure source queue with DLQ

The console displays the Details page for the queue. Within the Dead-letter queue tab, you can see the Maximum receives value and DLQ ARN.

DLQ configuration

DLQ configuration

Send and receive test messages

You can send messages to test the functionality in the SQS console.

  1. Navigate to MySourceQueue and choose Send and receive messages
  2. Send a number of test messages by entering the message content in Message body and choosing Send message.
  3. Send and receive messages

    Send and receive messages

  4. Navigate to the Receive messages section where you can see the number of messages available.
  5. Choose Poll for messages. The Maximum message count is set to 10 by default If you sent more than 10 test messages, poll multiple times to receive all the messages.
Poll for messages

Poll for messages

All the received messages are sent to the DLQ because the maxReceiveCount is set to 1. At this stage you would normally review the messages. You would determine why their processing didn’t succeed and resolve the issue.

Redrive messages to source queue

Navigate to the list of all queues and filter if required to view the DLQ. The queue displays the approximate number of messages available in the DLQ. For standard queues, the result is approximate because of the distributed architecture of SQS. In most cases, the count should be close to the actual number of messages in the queue.

Messages available in DLQ

Messages available in DLQ

  1. Select the DLQ and choose Start DLQ redrive.
  2. DLQ redrive

    DLQ redrive

    SQS allows you to redrive messages either to their source queue(s) or to a custom destination queue.

  3. Choose to Redrive to source queue(s), which is the default.
  4. Redrive has two velocity control settings.

  • System optimized sends messages back to the source queue as fast as possible
  • Custom max velocity allows SQS to redrive messages with a custom maximum rate of messages per second. This feature is useful for minimizing the impact to normal processing of messages in the source queue.

You can optionally inspect messages prior to redrive.

  • To redrive the messages back to the source queue, choose DLQ redrive.
  • DLQ redrive

    DLQ redrive

    The Dead-letter queue redrive status panel shows the status of the redrive and percentage processed. You can refresh the display or cancel the redrive.

    Dead-letter queue redrive status

    Dead-letter queue redrive status

    Once the redrive is complete, which takes a few seconds in this example, the status reads Successfully completed.

    Redrive status completed

    Redrive status completed

  • Navigate back to the source queue and you can see all the messages are redriven back from the DLQ to the source queue.
  • Messages redriven from DLQ to source queue

    Messages redriven from DLQ to source queue

    Conclusion

    Dead-letter queue redrive to source queues allows you to effectively manage the life cycle of unconsumed messages stored in dead-letter queues. You can build applications with the confidence that you can easily examine unconsumed messages, recover from errors, and reprocess failed messages.

    You can redrive messages from their DLQs to their source queues using the Amazon SQS console.

    Dead-letter queue redrive to source queues is available in all commercial regions, and coming soon to GovCloud.

    To get started, visit https://aws.amazon.com/sqs/

    For more serverless learning resources, visit Serverless Land.

    Forwarding emails automatically based on content with Amazon Simple Email Service

    Post Syndicated from Murat Balkan original https://aws.amazon.com/blogs/messaging-and-targeting/forwarding-emails-automatically-based-on-content-with-amazon-simple-email-service/

    Introduction

    Email is one of the most popular channels consumers use to interact with support organizations. In its most basic form, consumers will send their email to a catch-all email address where it is further dispatched to the correct support group. Often, this requires a person to inspect content manually. Some IT organizations even have a dedicated support group that handles triaging the incoming emails before assigning them to specialized support teams. Triaging each email can be challenging, and delays in email routing and support processes can reduce customer satisfaction. By utilizing Amazon Simple Email Service’s deep integration with Amazon S3, AWS Lambda, and other AWS services, the task of categorizing and routing emails is automated. This automation results in increased operational efficiencies and reduced costs.

    This blog post shows you how a serverless application will receive emails with Amazon SES and deliver them to an Amazon S3 bucket. The application uses Amazon Comprehend to identify the dominant language from the message body.  It then looks it up in an Amazon DynamoDB table to find the support group’s email address specializing in the email subject. As the last step, it forwards the email via Amazon SES to its destination. Archiving incoming emails to Amazon S3 also enables further processing or auditing.

    Architecture

    By completing the steps in this post, you will create a system that uses the architecture illustrated in the following image:

    Architecture showing how to forward emails by content using Amazon SES

    The flow of events starts when a customer sends an email to the generic support email address like [email protected]. This email is listened to by Amazon SES via a recipient rule. As per the rule, incoming messages are written to a specified Amazon S3 bucket with a given prefix.

    This bucket and prefix are configured with S3 Events to trigger a Lambda function on object creation events. The Lambda function reads the email object, parses the contents, and sends them to Amazon Comprehend for language detection.

    Amazon DynamoDB looks up the detected language code from an Amazon DynamoDB table, which includes the mappings between language codes and support group email addresses for these languages. One support group could answer English emails, while another support group answers French emails. The Lambda function determines the destination address and re-sends the same email address by performing an email forward operation. Suppose the lookup does not return any destination address, or the language was not be detected. In that case, the email is forwarded to a catch-all email address specified during the application deployment.

    In this example, Amazon SES hosts the destination email addresses used for forwarding, but this is not a requirement. External email servers will also receive the forwarded emails.

    Prerequisites

    To use Amazon SES for receiving email messages, you need to verify a domain that you own. Refer to the documentation to verify your domain with Amazon SES console. If you do not have a domain name, you will register one from Amazon Route 53.

    Deploying the Sample Application

    Clone this GitHub repository to your local machine and install and configure AWS SAM with a test AWS Identity and Access Management (IAM) user.

    You will use AWS SAM to deploy the remaining parts of this serverless architecture.

    The AWS SAM template creates the following resources:

    • An Amazon DynamoDB mapping table (language-lookup) contains information about language codes and associates them with destination email addresses.
    • An AWS Lambda function (BlogEmailForwarder) that reads the email content parses it, detects the language, looks up the forwarding destination email address, and sends it.
    • An Amazon S3 bucket, which will store the incoming emails.
    • IAM roles and policies.

    To start the AWS SAM deployment, navigate to the root directory of the repository you downloaded and where the template.yaml AWS SAM template resides. AWS SAM also requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. You will refer to the documentation to learn how to create an Amazon S3 bucket. The bucket should have read and write access by an AWS Identity and Access Management (IAM) user.

    At the command line, enter the following command to package the application:

    sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE

    In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.

    AWS SAM packages the application and copies it into this Amazon S3 bucket.

    When the AWS SAM package command finishes running, enter the following command to deploy the package:

    sam deploy --template-file output_template.yaml --stack-name blogstack --capabilities CAPABILITY_IAM --parameter-overrides FromEmailAddress=info@ YOUR_DOMAIN_NAME_HERE CatchAllEmailAddress=catchall@ YOUR_DOMAIN_NAME_HERE

    In the preceding command, change the YOUR_DOMAIN_NAME_HERE with the domain name you validated with Amazon SES. This domain also applies to other commands and configurations that will be introduced later.

    This example uses “blogstack” as the stack name, you will change this to any other name you want. When you run this command, AWS SAM shows the progress of the deployment.

    Configure the Sample Application

    Now that you have deployed the application, you will configure it.

    Configuring Receipt Rules

    To deliver incoming messages to Amazon S3 bucket, you need to create a Rule Set and a Receipt rule under it.

    Note: This blog uses Amazon SES console to create the rule sets. To create the rule sets with AWS CloudFormation, refer to the documentation.

    1. Navigate to the Amazon SES console. From the left navigation choose Rule Sets.
    2. Choose Create a Receipt Rule button at the right pane.
    3. Add info@YOUR_DOMAIN_NAME_HERE as the first recipient addresses by entering it into the text box and choosing Add Recipient.

     

     

    Choose the Next Step button to move on to the next step.

    1. On the Actions page, select S3 from the Add action drop-down to reveal S3 action’s details. Select the S3 bucket that was created by the AWS SAM template. It is in the format of your_stack_name-inboxbucket-randomstring. You will find the exact name in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console. Set the Object key prefix to info/. This tells Amazon SES to add this prefix to all messages destined to this recipient address. This way, you will re-use the same bucket for different recipients.

    Choose the Next Step button to move on to the next step.

    In the Rule Details page, give this rule a name at the Rule name field. This example uses the name info-recipient-rule. Leave the rest of the fields with their default values.

    Choose the Next Step button to move on to the next step.

    1. Review your settings on the Review page and finalize rule creation by choosing Create Rule

    1. In this example, you will be hosting the destination email addresses in Amazon SES rather than forwarding the messages to an external email server. This way, you will be able to see the forwarded messages in your Amazon S3 bucket under different prefixes. To host the destination email addresses, you need to create different rules under the default rule set. Create three additional rules for catchall@YOUR_DOMAIN_NAME_HERE , english@ YOUR_DOMAIN_NAME_HERE and french@YOUR_DOMAIN_NAME_HERE email addresses by repeating the steps 2 to 5. For Amazon S3 prefixes, use catchall/, english/, and french/ respectively.

     

    Configuring Amazon DynamoDB Table

    To configure the Amazon DynamoDB table that is used by the sample application

    1. Navigate to Amazon DynamoDB console and reach the tables view. Inspect the table created by the AWS SAM application.

    language-lookup table is the table where languages and their support group mappings are kept. You need to create an item for each language, and an item that will hold the default destination email address that will be used in case no language match is found. Amazon Comprehend supports more than 60 different languages. You will visit the documentation for the supported languages and add their language codes to this lookup table to enhance this application.

    1. To start inserting items, choose the language-lookup table to open table overview page.
    2. Select the Items tab and choose the Create item From the dropdown, select Text. Add the following JSON content and choose Save to create your first mapping object. While adding the following object, replace Destination attribute’s value with an email address you own. The email messages will be forwarded to that address.

    {

      “language”: “en”,

      “destination”: “english@YOUR_DOMAIN_NAME_HERE”

    }

    Lastly, create an item for French language support.

    {

      “language”: “fr”,

      “destination”: “french@YOUR_DOMAIN_NAME_HERE”

    }

    Testing

    Now that the application is deployed and configured, you will test it.

    1. Use your favorite email client to send the following email to the domain name info@ email address.

    Subject: I need help

    Body:

    Hello, I’d like to return the shoes I bought from your online store. How can I do this?

    After the email is sent, navigate to the Amazon S3 console to inspect the contents of the Amazon S3 bucket that is backing the Amazon SES Rule Sets. You will also see the AWS Lambda logs from the Amazon CloudWatch console to confirm that the Lambda function is triggered and run successfully. You should receive an email with the same content at the address you defined for the English language.

    1. Next, send another email with the same content, this time in French language.

    Subject: j’ai besoin d’aide

    Body:

    Bonjour, je souhaite retourner les chaussures que j’ai achetées dans votre boutique en ligne. Comment puis-je faire ceci?

     

    Suppose a message is not matched to a language in the lookup table. In that case, the Lambda function will forward it to the catchall email address that you provided during the AWS SAM deployment.

    You will inspect the new email objects under english/, french/ and catchall/ prefixes to observe the forwarding behavior.

    Continue experimenting with the sample application by sending different email contents to info@ YOUR_DOMAIN_NAME_HERE address or adding other language codes and email address combinations into the mapping table. You will find the available languages and their codes in the documentation. When adding a new language support, don’t forget to associate a new email address and Amazon S3 bucket prefix by defining a new rule.

    Cleanup

    To clean up the resources you used in your account,

    1. Navigate to the Amazon S3 console and delete the inbox bucket’s contents. You will find the name of this bucket in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console.
    2. Navigate to AWS CloudFormation console and delete the stack named “blogstack”.
    3. After the stack is deleted, remove the domain from Amazon SES. To do this, navigate to the Amazon SES Console and choose Domains from the left navigation. Select the domain you want to remove and choose Remove button to remove it from Amazon SES.
    4. From the Amazon SES Console, navigate to the Rule Sets from the left navigation. On the Active Rule Set section, choose View Active Rule Set button and delete all the rules you have created, by selecting the rule and choosing Action, Delete.
    5. On the Rule Sets page choose Disable Active Rule Set button to disable listening for incoming email messages.
    6. On the Rule Sets page, Inactive Rule Sets section, delete the only rule set, by selecting the rule set and choosing Action, Delete.
    7. Navigate to CloudWatch console and from the left navigation choose Logs, Log groups. Find the log group that belongs to the BlogEmailForwarderFunction resource and delete it by selecting it and choosing Actions, Delete log group(s).
    8. You will also delete the Amazon S3 bucket you used for packaging and deploying the AWS SAM application.

     

    Conclusion

    This solution shows how to use Amazon SES to classify email messages by the dominant content language and forward them to respective support groups. You will use the same techniques to implement similar scenarios. You will forward emails based on custom key entities, like product codes, or you will remove PII information from emails before forwarding with Amazon Comprehend.

    With its native integrations with AWS services, Amazon SES allows you to enhance your email applications with different AWS Cloud capabilities easily.

    To learn more about email forwarding with Amazon SES, you will visit documentation and AWS blogs.

    Better performance for less: AWS continues to beat Azure on SQL Server price/performance

    Post Syndicated from Fred Wurden original https://aws.amazon.com/blogs/compute/sql-server-runs-better-on-aws/

    By Fred Wurden, General Manager, AWS Enterprise Engineering (Windows, VMware, RedHat, SAP, Benchmarking)

    AWS R5b.8xlarge delivers better performance at lower cost than Azure E64_32s_v4 for a SQL Server workload

    In this blog, we will review a recent benchmark that Principled Technologies published on 2/25. The benchmark found that an Amazon Elastic Compute Cloud (Amazon EC2) R5b.8xlarge instance delivered better performance for a SQL Server workload at a lower cost when directly tested against an Azure E64_32s_v4 VM.

    Behind the study: Understanding how SQL Server performed better, for a lower cost with an AWS EC2 R5b instance

    Principled Technologies tested an online transaction processing (OLTP) workload for SQL Server 2019 on both an R5b instance on Amazon EC2 with Amazon Elastic Block Store (EBS) as storage and Azure E64_32s_v4. This particular Azure VM was chosen as an equivalent to the R5b instance, as both instances have comparable specifications for input/output operations per second (IOPS) performance, use Intel Xeon processors from the same generation (Cascade Lake), and offer the same number of cores (32). For storage, Principled Technologies mirrored storage configurations across the Azure VM and the EC2 instance (which used Amazon Elastic Block Store (EBS)), maxing out the IOPS specs on each while offering a direct comparison between instances.

    Test Configurations

    Source: Principled Technologies

    When benchmarking, Principled Technologies ran a TPC-C-like OLTP workload from HammerDB v3.3 on both instances, testing against new orders per minute (NOPM) performance. NOPM shows the number of new-order transactions completed in one minute as part of a serialized business workload. HammerDB claims that because NOPM is “independent of any particular database implementation [it] is the recommended primary metric to use.”

    The results: SQL Server on AWS EC2 R5b delivered 2x performance than the Azure VM and 62% less expensive 

    Graphs that show AWS instance outperformed the Azure instance

    Source: Principled Technologies

    These test results from the Principled Technologies report show the price/performance and performance comparisons. The performance metric is New Orders Per Minute (NOPM); faster is better. The price/performance calculations are based on the cost of on-demand, License Included SQL Server instances and storage to achieve 1,000 NOPM performance, smaller is better.

    An EC2 r5b.8xlarge instance powered by an Intel Xeon Scalable processor delivered better SQL Server NOPM performance on the HammerDB benchmark and a lower price per 1,000 NOPM than an Azure E64_32s_v4 VM powered by similar Intel Xeon Scalable processors.

    On top of that, AWS’s storage price-performance exceeded Azure’s. The Azure managed disks offered 53 percent more storage than the EBS storage, but the EC2 instance with EBS storage cost 24 percent less than the Azure VM with managed disks. Even by reducing Azure storage by the difference in storage, something customers cannot do, EBS would have cost 13 percent less per storage GB than the Azure managed disks.

    Why AWS is the best cloud to run your Windows and SQL Server workloads

    To us, these results aren’t surprising. In fact, they’re in line with the success that customers find running Windows on AWS for over 12 years. Customers like Pearson and Expedia have all found better performance and enhanced cost savings by moving their Windows, SQL Server, and .NET workloads to AWS. In fact, RepricerExpress migrated its Windows and SQL Server environments from Azure to AWS to slash outbound bandwidth costs while gaining agility and performance.

    Not only do we offer better price-performance for your Windows workloads, but we also offer better ways to run Windows in the cloud. Whether you want to rehost your databases to EC2, move to managed with Amazon Relational Database for SQL Server (RDS), or even modernize to cloud-native databases, AWS stands ready to help you get the most out of the cloud.

     


    To learn more on migrating Windows Server or SQL Server, visit Windows on AWS. For more stories about customers who have successfully migrated and modernized SQL Server workloads with AWS, visit our Customer Success page. Contact us to start your migration journey today.

    Creating a cross-region Active Directory domain with AWS Launch Wizard for Microsoft Active Directory

    Post Syndicated from AWS Admin original https://aws.amazon.com/blogs/compute/creating-a-cross-region-active-directory-domain-with-aws-launch-wizard-for-microsoft-active-directory/

    AWS Launch Wizard is a console-based service to quickly and easily size, configure, and deploy third party applications, such as Microsoft SQL Server Always On and HANA based SAP systems, on AWS without the need to identify and provision individual AWS resources. AWS Launch Wizard offers an easy way to deploy enterprise applications and optimize costs. Instead of selecting and configuring separate infrastructure services, you go through a few steps in the AWS Launch Wizard and it deploys a ready-to-use application on your behalf. It reduces the time you need to spend on investigating how to provision, cost and configure your application on AWS.

    You can now use AWS Launch Wizard to deploy and configure self-managed Microsoft Windows Server Active Directory Domain Services running on Amazon Elastic Compute Cloud (EC2) instances. With Launch Wizard, you can have fully-functioning, production-ready domain controllers within a few hours—all without having to manually deploy and configure your resources.

    You can use AWS Directory Service to run Microsoft Active Directory (AD) as a managed service, without the hassle of managing your own infrastructure. If you need to run your own AD infrastructure, you can use AWS Launch Wizard to simplify the deployment and configuration process.

    In this post, I walk through creation of a cross-region Active Directory domain using Launch Wizard. First, I deploy a single Active Directory domain spanning two regions. Then, I configure Active Directory Sites and Services to match the network topology. Finally, I create a user account to verify replication of the Active Directory domain.

    Diagram of Resources deployed in this post

    Figure 1: Diagram of resources deployed in this post

    Prerequisites

    1. You must have a VPC in your home. Additionally, you must have remote regions that have CIDRs that do not overlap with each other. If you need to create VPCs and subnets that do not overlap, please refer here.
    2. Each subnet used must have outbound internet connectivity. Feel free to either use a NAT Gateway or Internet Gateway.
    3. The VPCs must be peered in order to complete the steps in this post. For information on creating a VPC Peering connection between regions, please refer here.
    4. If you choose to deploy your Domain Controllers to a private subnet, you must have an RDP jump / bastion instance setup to allow you to RDP to your instance.

    Deploy Your Domain Controllers in the Home Region using Launch Wizard

    In this section, I deploy the first set of domain controllers into the us-east-1 the home region using Launch Wizard. I refer to US-East-1 as the home region, and US-West-2 as the remote region.

    1. In the AWS Launch Wizard Console, select Active Directory in the navigation pane on the left.
    2. Select Create deployment.
    3. In the Review Permissions page, select Next.
    4. In the Configure application settings page set the following:
      • General:
        • Deployment name: UsEast1AD
      • Active Directory (AD) installation
        • Installation type: Active Directory on EC2
      • Domain Settings:
        • Number of domain controllers: 2
        • AMI installation type: License-included AMI
      • License-included AMI: ami-################# | Windows_Server-2019-English-Full-Base-202#-##-##
      • Connection type: Create new Active Directory
      • Domain DNS name: corp.example.com
      • Domain NetBIOS Name: CORP
      • Connectivity:
        • Key Pair Name: Choose and exiting Key pair or select and existing one.
        • Virtual Private Cloud (VPC): Select Virtual Private Cloud (VPC)
      • VPC: Select your home region VPC
      • Availability Zone (AZ) and private subnets:
        • Select 2 Availability Zones
        • Choose the proper subnet in each subnet
        • Assign a Controller IP address for each domain controller
      • Remote Desktop Gateway preferences: Disregard for now, this is set up later.
      • Check the I confirm that a public subnet has been set up. Each of the selected private subnets have outbound connectivity enabled check box.
    1. Select Next.
    2. In the Define infrastructure requirements page, set the following inputs.
      • Storage and compute: Based on infrastructure requirements
      • Number of AD users: Up to 5000 users
    3. Select Next.
    4. In the Review and deploy page, review your selections. Then, select Deploy.

    Note that it may take up to 2 hours for your domain to be deployed. Once the status has changed to Completed, you can proceed to the next section. In the next section, I prepare Active Directory Sites and Services for the second set of domain controller in my other region.

    Configure Active Directory Sites and Services

    In this section, I configure the Active Directory Sites and Services topology to match my network topology. This step ensures proper Active Directory replication routing so that domain clients can find the closest domain controller. For more information on Active Directory Sites and Services, please refer here.

    Retrieve your Administrator Credentials from Secrets Manager

    1. From the AWS Secrets Manager Console in us-east-1, select the Secret that begins with LaunchWizard-UsEast1AD.
    2. In the middle of the Secret page, select Retrieve secret value.
      1. This will display the username and password key with their values.
      2. You need these credentials when you RDP into one of the domain controllers in the next steps.

    Rename the Default First Site

    1. Log in to the one of the domain controllers in us-east-1.
    2. Select Start, type dssite and hit Enter on your keyboard.
    3. The Active Directory Sites and Services MMC should appear.
      1. Expand Sites. There is a site named Default-First-Site-Name.
      2. Right click on Default-First-Site-Name select Rename.
      3. Enter us-east-1 as the name.
    4. Leave the Active Directory Sites and Services MMC open for the next set of steps.

    Create a New Site and Subnet Definition for US-West-2

    1. Using the Active Directory Sites and Services MMC from the previous steps, right click on Sites.
    2. Select New Site… and enter the following inputs:
      • Name: us-west-2
      • Select DEFAULTIPSITELINK.
    3.  Select OK.
    4. A pop up will appear telling you there will need to be some additional configuration. Select OK.
    5. Expand Sites and right click on Subnets and select New Subnet.
    6. Enter the following information:
      • Prefix: the CIDR of your us-west-2 VPC. An example would be 1.0.0/24
      • Site: select us-west-2
    7. Select OK.
    8. Leave the Active Directory Sites and Services MMC open for the following set of steps.

    Configure Site Replication Settings

    Using the Active Directory Sites and Services MMC from the previous steps, expand Sites, Inter-Site Transports, and select IP. You should see an object named DEFAULTIPSITELINK,

    1. Right click on DEFAULTIPSITELINK.
    2. Select Properties. Set or verify the following inputs on the General tab:
    3. Select Apply.
    4. In the DEFAULTIPSITELINK Properties, select the Attribute Editor tab and modify the following:
      • Scroll down and double click on Enter 1 for the Value, then select OK twice.
        • For more information on these settings, please refer here.
    5. Close the Active Directory Sites and Services MMC, as it is no longer needed.

    Prepare Your Home Region Domain Controllers Security Group

    In this section, I modify the Domain Controllers Security Group in us-east-1. This allows the domain controllers deployed in us-west-2 to communicate with each other.

    1. From the Amazon Elastic Compute Cloud (Amazon EC2) console, select Security Groups under the Network & Security navigation section.
    2. Select the Domain Controllers Security Group that was created with Launch Wizard Active Directory.
    3. Select Edit inbound rules. The Security Group should start with LaunchWizard-UsEast1AD-.
    4. Choose Add rule and enter the following:
      • Type: Select All traffic
      • Protocol: All
      • Port range: All
      • Source: Select Custom
      • Enter the CIDR of your remote VPC. An example would be 1.0.0/24
    5. Select Save rules.

    Create a Copy of Your Administrator Secret in Your Remote Region

    In this section, I create a Secret in Secrets Manager that contains the Administrator credentials when I created a home region.

    1. Find the Secret that being with LaunchWizard-UsEast1AD from the AWS Secrets Manager Console in us-east-1.
    2. In the middle of the Secret page, select Retrieve secret value.
      • This displays the username and password key with their values. Make note of these keys and values, as we need them for the next steps.
    3. From the AWS Secrets Manager Console, change the region to us-west-2.
    4. Select Store a new secret. Then, enter the following inputs:
      • Select secret type: Other type of secrets
      • Add your first keypair
      • Select Add row to add the second keypair
    5. Select Next, then enter the following inputs.
      • Secret name: UsWest2AD
      • Select Next twice
      • Select Store

    Deploy Your Domain Controllers in the Remote Region using Launch Wizard

    In this section, I deploy the second set of domain controllers into the us-west-1 region using Launch Wizard.

    1. In the AWS Launch Wizard Console, select Active Directory in the navigation pane on the left.
    2. Select Create deployment.
    3. In the Review Permissions page, select Next.
    4. In the Configure application settings page, set the following inputs.
      • General
        • Deployment name: UsWest2AD
      • Active Directory (AD) installation
        • Installation type: Active Directory on EC2
      • Domain Settings:
        • Number of domain controllers: 2
        • AMI installation type: License-included AMI
        • License-included AMI: ami-################# | Windows_Server-2019-English-Full-Base-202#-##-##
      • Connection type: Add domain controllers to existing Active Directory
      • Domain DNS name: corp.example.com
      • Domain NetBIOS Name: CORP
      • Domain Administrator secret name: Select you secret you created above.
      • Add permission to secret
        • After you verified the Secret you created above has the policy listed. Check the checkbox confirming the secret has the required policy.
      • Domain DNS IP address for resolution: The private IP of either domain controller in your home region
      • Connectivity:
        • Key Pair Name: Choose an existing Key pair
        • Virtual Private Cloud (VPC): Select Virtual Private Cloud (VPC)
      • VPC: Select your home region VPC
      • Availability Zone (AZ) and private subnets:
        • Select 2 Availability Zones
        • Choose the proper subnet in each subnet
        • Assign a Controller IP address for each domain controller
      • Remote Desktop Gateway preferences: disregard for now, as I set this later.
      • Check the I confirm that a public subnet has been set up. Each of the selected private subnets have outbound connectivity enabled check box
    1. In the Define infrastructure requirements page set the following:
      • Storage and compute: Based on infrastructure requirements
      • Number of AD users: Up to 5000 users
    2. In the Review and deploy page, review your selections. Then, select Deploy.

    Note that it may take up to 2 hours to deploy domain controllers. Once the status has changed to Completed, proceed to the next section. In this next section, I prepare Active Directory Sites and Services for the second set of domain controller in another region.

    Prepare Your Remote Region Domain Controllers Security Group

    In this section, I modify the Domain Controllers Security Group in us-west-2. This allows the domain controllers deployed in us-west-2 to communicate with each other.

    1. From the Amazon Elastic Compute Cloud (Amazon EC2) console, select Security Groups under the Network & Security navigation section.
    2. Select the Domain Controllers Security Group that was created by your Launch Wizard Active Directory.
    3. Select Edit inbound rules. The Security Group should start with LaunchWizard-UsWest2AD-EC2ADStackExistingVPC-
    4. Choose Add rule and enter the following:
      • Type: Select All traffic
      • Protocol: All
      • Port range: All
      • Source: Select Custom
      • Enter the CIDR of your remote VPC. An example would be 0.0.0/24
    5. Choose Save rules.

    Create an AD User and Verify Replication

    In this section, I create a user in one region and verify that it replicated to the other region. I also use AD replication diagnostics tools to verify that replication is working properly.

    Create a Test User Account

    1. Log in to one of the domain controllers in us-east-1.
    2. Select Start, type dsa and press Enter on your keyboard. The Active Directory Users and Computers MMC should appear.
    3. Right click on the Users container and select New > User.
    4. Enter the following inputs:
      • First name: John
      • Last name: Doe
      • User logon name: jdoe and select Next
      • Password and Confirm password: Your choice of complex password
      • Uncheck User must change password at next logon
    5. Select Next.
    6. Select Finish.

    Verify Test User Account Has Replicated

    1. Log in to the one of the domain controllers in us-west-2.
    2. Select Start and type dsa.
    3. Then, press Enter on your keyboard. The Active Directory Users and Computers MMC should appear.
    4. Select Users. You should see a user object named John Doe.

    Note that if the user is not present, it may not have been replicated yet. Replication should not take longer than 60 seconds from when the item was created.

    Summary

    Congratulations, you have created a cross-region Active Directory! In this post you:

    1. Launched a new Active Directory forest in us-east-1 using AWS Launch Wizard.
    2. Configured Active Directory Sites and Service for a multi-region configuration.
    3. Launched a set of new domain controllers in the us-west-2 region using AWS Launch Wizard.
    4. Created a test user and verified replication.

    This post only touches on a couple of features that are available in the AWS Launch Wizard Active Directory deployment. AWS Launch Wizard also automates the creation of a Single Tier PKI infrastructure or trust creation. One of the prime benefits of this solution is the simplicity in deploying a fully functional Active Directory environment in just a few clicks. You no longer need to do the undifferentiated heavy lifting required to deploy Active Directory.  For more information, please refer to AWS Launch Wizard documentation.