All posts by Sonu Kumar Singh

Apache Spark encryption performance improvement with Amazon EMR 7.9

2025-11-27 Sonu Kumar Singh

Post Syndicated from Sonu Kumar Singh original https://aws.amazon.com/blogs/big-data/apache-spark-encryption-performance-improvement-with-amazon-emr-7-9/

The Amazon EMR runtime for Apache Spark is a performance-optimized runtime for Apache Spark that is 100% API compatible with open source Apache Spark. With Amazon EMR release 7.9.0, the EMR runtime for Apache Spark introduces significant performance improvements for encrypted workloads, supporting Spark version 3.5.5.

For compliance and security requirements, many customers need to enable Apache Spark’s local storage encryption (spark.io.encryption.enabled = true) in addition to Amazon Simple Storage Service (Amazon S3) encryption (such as server-side encryption (SSE) or AWS Key Management Service (AWS KMS)). This feature encrypts shuffle files, cached data, and other intermediate data written to local disk during Spark operations, protecting sensitive data at rest on Amazon EMR cluster instances.

Industries subject to regulations such as the Health Insurance Portability and Accountability Act (HIPAA) for healthcare, Payment Card Industry Data Security Standard (PCI-DSS) for financial services, General Data Protection Regulation (GDPR) for personal data, and Federal Risk and Authorization Management Program (FedRAMP) for government often require encryption of all data at rest, including temporary files on local storage. While Amazon S3 encryption protects data in object storage, Spark’s I/O encryption secures the intermediate shuffle and spill data that Spark writes to local disk during distributed processing—data that never reaches Amazon S3 but might contain sensitive information extracted from source datasets. Generally, encrypted operations require additional computational overhead that can impact overall job performance.

With the built-in encryption optimizations of Amazon EMR 7.9.0, customers might see significant performance improvements in their Apache Spark applications without requiring any application changes. In our performance benchmark tests, derived from TPC-DS performance tests at 3 TB scale, we observed up to 20% faster performance with the EMR 7.9 optimized Spark runtime compared to Spark without these optimizations. Individual results may vary depending on specific workloads and configurations.

In this post, we analyze the results from our benchmark tests comparing the Amazon EMR 7.9 optimized Spark runtime against Spark 3.5.5 without encryption optimizations. We walk through a detailed cost analysis and provide step-by-step instructions to reproduce the benchmark.

Results observed

To evaluate the performance improvements, we used an open source Spark performance test utility derived from the TPC-DS performance test toolkit. We ran the tests on two nine-node (eight core nodes and one primary node) r5d.4xlarge Amazon EMR 7.9.0 clusters, comparing two configurations:

Baseline: EMR 7.9.0 cluster with a bootstrap action installing Spark 3.5.5 without encryption optimizations
Optimized: EMR 7.9.0 cluster using the EMR Spark 3.5.5 runtime with encryption optimizations

Both tests used data stored in Amazon Simple Storage Service (Amazon S3). All data processing was configured identically except for the Spark runtime version.

To keep the comparison consistent and equivalent, we disabled Dynamic Resource Allocation (DRA) in both test configurations. This eliminates variability from dynamic scaling, so we can measure pure computational performance improvements.

The following table shows the total job runtime for all queries (in seconds) in the 3 TB query dataset between the baseline and Amazon EMR 7.9 optimized configurations:

Configuration	Total runtime (seconds)	Geometric mean (seconds)	Performance improvement
Baseline (Spark 3.5.5 without optimization)	1,485	10.24
EMR 7.9 (with encryption optimization)	1,176	8.15	20% faster

We observed that our TPC-DS tests with the Amazon EMR 7.9 optimized Spark runtime completed about 20% faster based on total runtime and 20% faster based on geometric mean compared to the baseline configuration.
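The percentages can be re-derived from the table with a few lines of shell. This is a minimal sketch: `pct_faster` is a hypothetical helper name, not part of the benchmark utility, and it assumes that "faster" means the reduction in time relative to the baseline.

```shell
#!/usr/bin/env bash
# pct_faster BASELINE OPTIMIZED
# Prints the reduction relative to the baseline as a percentage (one decimal).
# Hypothetical helper for illustration; not part of the benchmark utility.
pct_faster() {
  awk -v b="$1" -v o="$2" 'BEGIN { printf "%.1f\n", (b - o) / b * 100 }'
}

# Values taken from the results table in this post.
echo "Total runtime:  $(pct_faster 1485 1176)% faster"
echo "Geometric mean: $(pct_faster 10.24 8.15)% faster"
```

With the published numbers this prints 20.8 for total runtime and 20.4 for the geometric mean, which the post rounds to about 20%.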

The encryption optimizations in Amazon EMR 7.9 deliver performance benefits through:

Improved shuffle and decryption operations reducing overhead during data exchange without compromising security
Better memory management for intermediate results

Cost analysis

The performance improvements of the Amazon EMR 7.9 optimized Spark runtime directly translate to lower costs. We realized an approximately 20% cost savings running the benchmark application with encryption optimizations compared to the baseline configuration, because of reduced hours of EMR, Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Block Store (Amazon EBS) using General Purpose SSD (gp2).

The following table summarizes the cost comparison in the us-east-1 AWS Region:

Configuration	Runtime (hours)	Estimated cost	Total EC2 instances	Total vCPU	Total memory (GiB)	Root device (EBS)
Baseline: Spark 3.5.5 without optimization, 1 primary and 8 core nodes	0.41	$5.28	9	144	1152	64 GiB gp2
Amazon EMR 7.9 with optimization, 1 primary and 8 core nodes	0.33	$4.25	9	144	1152	64 GiB gp2

Cost breakdown

Formulas used:

Amazon EMR cost – Number of instances × EMR hourly rate × Runtime hours
Amazon EC2 cost – Number of instances × EC2 hourly rate × Runtime hours
Amazon EBS cost – (EBS cost per GB per month ÷ hours in a month) × EBS volume size × number of instances × runtime hours

Note: EBS is priced monthly ($0.1 per GB per month), so we divide by 730 hours to convert to an hourly rate. EMR and EC2 are already priced hourly, so no conversion is needed.
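The three formulas can be combined into one small calculation. The sketch below is illustrative only: `run_cost` is a hypothetical helper, and the rates passed to it are the ones quoted in this post (they may differ in your Region or over time).

```shell
#!/usr/bin/env bash
# run_cost INSTANCES EMR_RATE EC2_RATE EBS_PER_GB_MONTH EBS_GB HOURS
# Applies the EMR, EC2, and EBS formulas and prints the total in dollars.
# Hypothetical helper; 730 is the hours-per-month figure used in this post.
run_cost() {
  awk -v n="$1" -v emr="$2" -v ec2="$3" -v ebs="$4" -v gb="$5" -v h="$6" 'BEGIN {
    emr_cost = n * emr * h               # EMR: instances x hourly rate x hours
    ec2_cost = n * ec2 * h               # EC2: instances x hourly rate x hours
    ebs_cost = (ebs / 730) * gb * n * h  # EBS: monthly rate converted to hourly
    printf "%.2f\n", emr_cost + ec2_cost + ebs_cost
  }'
}

# 9 instances, $0.27 EMR, $1.152 EC2, $0.1 per GB-month, 64 GB root volume.
run_cost 9 0.27 1.152 0.1 64 0.41   # baseline runtime (hours)
run_cost 9 0.27 1.152 0.1 64 0.33   # optimized runtime (hours)
```

The two calls print 5.28 and 4.25, matching the totals in the cost comparison table.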

Baseline configuration (0.41 hours):

Amazon EMR cost – 9 × $0.27 × 0.41 = $1.00
Amazon EC2 cost – 9 × $1.152 × 0.41 = $4.25
Amazon EBS cost – ($0.1/730 × 64 × 9 × 0.41) = $0.032
Total cost – $5.28

EMR 7.9 optimized configuration (0.33 hours):

Amazon EMR cost – 9 × $0.27 × 0.33 = $0.80
Amazon EC2 cost – 9 × $1.152 × 0.33 = $3.42
Amazon EBS cost – ($0.1/730 × 64 × 9 × 0.33) = $0.026
Total cost – $4.25

Total cost savings: 20% per benchmark run, which scales linearly with your production workload frequency.

Set up EMR benchmarking

For detailed instructions and scripts, see the companion GitHub repository.

Prerequisites

To set up Amazon EMR benchmarking, start by completing the following prerequisite steps:

1. Configure your AWS Command Line Interface (AWS CLI) by running aws configure to point to your benchmarking account.
2. Create an S3 bucket for test data and results.
3. Copy the TPC-DS 3 TB source data from a publicly available dataset to your S3 bucket using the following command:
```
aws s3 cp s3://blogpost-sparkoneks-us-east-1/blog/BLOG_TPCDS-TEST-3T-partitioned s3://<YOUR-BUCKET-NAME>/blog/BLOG_TPCDS-TEST-3T-partitioned --recursive
```
Replace <YOUR-BUCKET-NAME> with the name of the S3 bucket you created in step 2. The destination path matches the input path used by the benchmark steps later in this post.
4. Build or download the benchmark application JAR file (spark-benchmark-assembly-3.3.0.jar) and upload it to s3://<YOUR-BUCKET-NAME>/jar/, where the benchmark steps expect it.
5. Ensure you have appropriate AWS Identity and Access Management (IAM) roles for EMR cluster creation and Amazon S3 access.

Deploy the baseline EMR cluster (without optimization)

Step 1: Launch EMR 7.9.0 cluster with bootstrap action

The baseline configuration uses a bootstrap action to install Spark 3.5.5 without encryption optimizations. We have made the bootstrap script publicly available in an S3 bucket for your convenience.

Create the default Amazon EMR roles:

```
aws emr create-default-roles
```

Now create the cluster:

```
aws emr create-cluster \
  --name "EMR-7.9-Baseline-Spark-3.5.5" \
  --release-label emr-7.9.0 \
  --applications Name=Spark \
  --ec2-attributes SubnetId=<YOUR-SUBNET-ID>,InstanceProfile=EMR_EC2_DefaultRole \
  --service-role EMR_DefaultRole \
  --instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=r5d.4xlarge \
    InstanceGroupType=CORE,InstanceCount=8,InstanceType=r5d.4xlarge \
  --bootstrap-actions \
    Path=s3://spark-ba/install-spark-3-5-5-no-encryption.sh,Name="install spark 3.5.5 without encryption optimization" \
  --log-uri s3://<YOUR-BUCKET-NAME>/logs/baseline/
```

Note: The bootstrap script is available in a public S3 bucket at s3://spark-ba/install-spark-3-5-5-no-encryption.sh. This script installs Apache Spark 3.5.5 without the encryption optimizations present in the Amazon EMR runtime.

Step 2: Submit the benchmark job to the baseline cluster

Next, submit the Spark job using the following command. Local storage encryption is enabled here, as in the optimized run, so that the two configurations differ only in the Spark runtime:

```
aws emr add-steps \
  --cluster-id <YOUR-BASELINE-CLUSTER-ID> \
  --steps 'Type=Spark,Name="EMR-7.9-Baseline-Spark-3.5.5 Step",ActionOnFailure=CONTINUE,Args=["--deploy-mode","client","--conf","spark.io.encryption.enabled=true","--class","com.amazonaws.eks.tpcds.BenchmarkSQL","s3://<YOUR-BUCKET-NAME>/jar/spark-benchmark-assembly-3.3.0.jar","s3://<YOUR-BUCKET-NAME>/blog/BLOG_TPCDS-TEST-3T-partitioned","s3://<YOUR-BUCKET-NAME>/blog/BASELINE_TPCDS-TEST-3T-RESULT","/opt/tpcds-kit/tools","parquet","3000","3","false","q1-v2.4,q10-v2.4,q11-v2.4,q12-v2.4,q13-v2.4,q14a-v2.4,q14b-v2.4,q15-v2.4,q16-v2.4,q17-v2.4,q18-v2.4,q19-v2.4,q2-v2.4,q20-v2.4,q21-v2.4,q22-v2.4,q23a-v2.4,q23b-v2.4,q24a-v2.4,q24b-v2.4,q25-v2.4,q26-v2.4,q27-v2.4,q28-v2.4,q29-v2.4,q3-v2.4,q30-v2.4,q31-v2.4,q32-v2.4,q33-v2.4,q34-v2.4,q35-v2.4,q36-v2.4,q37-v2.4,q38-v2.4,q39a-v2.4,q39b-v2.4,q4-v2.4,q40-v2.4,q41-v2.4,q42-v2.4,q43-v2.4,q44-v2.4,q45-v2.4,q46-v2.4,q47-v2.4,q48-v2.4,q49-v2.4,q5-v2.4,q50-v2.4,q51-v2.4,q52-v2.4,q53-v2.4,q54-v2.4,q55-v2.4,q56-v2.4,q57-v2.4,q58-v2.4,q59-v2.4,q6-v2.4,q60-v2.4,q61-v2.4,q62-v2.4,q63-v2.4,q64-v2.4,q65-v2.4,q66-v2.4,q67-v2.4,q68-v2.4,q69-v2.4,q7-v2.4,q70-v2.4,q71-v2.4,q72-v2.4,q73-v2.4,q74-v2.4,q75-v2.4,q76-v2.4,q77-v2.4,q78-v2.4,q79-v2.4,q8-v2.4,q80-v2.4,q81-v2.4,q82-v2.4,q83-v2.4,q84-v2.4,q85-v2.4,q86-v2.4,q87-v2.4,q88-v2.4,q89-v2.4,q9-v2.4,q90-v2.4,q91-v2.4,q92-v2.4,q93-v2.4,q94-v2.4,q95-v2.4,q96-v2.4,q97-v2.4,q98-v2.4,q99-v2.4,ss_max-v2.4","true"]'
```
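Because aws emr add-steps returns as soon as the step is queued, the run has to be tracked separately. One way to block until the benchmark finishes is sketched below. `submit_and_wait` is a hypothetical wrapper name; `add-steps`, `wait step-complete`, and `describe-step` are standard AWS CLI subcommands. The sketch assumes the AWS CLI is configured as described in the prerequisites.

```shell
#!/usr/bin/env bash
# submit_and_wait CLUSTER_ID STEPS
# Submits one step, waits for it to finish, then prints its final state.
# Hypothetical wrapper; STEPS is the same string you would pass to --steps.
submit_and_wait() {
  local cluster_id="$1" steps="$2" step_id
  # add-steps returns a StepIds list; take the first (only) entry.
  step_id=$(aws emr add-steps --cluster-id "$cluster_id" --steps "$steps" \
    --query 'StepIds[0]' --output text) || return 1
  # Polls until the step completes; exits non-zero on failure or timeout.
  aws emr wait step-complete --cluster-id "$cluster_id" --step-id "$step_id"
  # Report the final state (for example COMPLETED or FAILED).
  aws emr describe-step --cluster-id "$cluster_id" --step-id "$step_id" \
    --query 'Step.Status.State' --output text
}
```

A call such as `submit_and_wait <YOUR-BASELINE-CLUSTER-ID> "$STEPS"` would then print the final step state once the run ends, where `STEPS` holds the step definition shown above.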

Deploy the optimized EMR cluster (with encryption optimization)

Step 1: Launch EMR 7.9.0 cluster with Spark runtime

The optimized configuration uses the EMR 7.9.0 Spark runtime without any bootstrap actions:

```
aws emr create-cluster \
  --name "EMR-7.9-Optimized-Native-Spark" \
  --release-label emr-7.9.0 \
  --applications Name=Spark \
  --ec2-attributes SubnetId=<YOUR-SUBNET-ID>,InstanceProfile=EMR_EC2_DefaultRole \
  --service-role EMR_DefaultRole \
  --instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=r5d.4xlarge \
    InstanceGroupType=CORE,InstanceCount=8,InstanceType=r5d.4xlarge \
  --log-uri s3://<YOUR-BUCKET-NAME>/logs/optimized/
```

Example (using the default roles and no bootstrap action):

```
aws emr create-cluster \
  --name "EMR-7.9-Optimized-Native-Spark" \
  --release-label emr-7.9.0 \
  --applications Name=Spark \
  --ec2-attributes SubnetId=subnet-08a5f71f92bc8a801 \
  --instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=r5d.4xlarge \
    InstanceGroupType=CORE,InstanceCount=8,InstanceType=r5d.4xlarge \
  --use-default-roles \
  --log-uri s3://aws-logs-123456789012-us-west-2/elasticmapreduce/
```

Step 2: Submit the benchmark job to optimized cluster

Next, submit the Spark job using the following command:

```
aws emr add-steps \
  --cluster-id <YOUR-OPTIMIZED-CLUSTER-ID> \
  --steps 'Type=Spark,Name="EMR-7.9-Optimized-Native-Spark Step",ActionOnFailure=CONTINUE,Args=["--deploy-mode","client","--conf","spark.io.encryption.enabled=true","--class","com.amazonaws.eks.tpcds.BenchmarkSQL","s3://<YOUR-BUCKET-NAME>/jar/spark-benchmark-assembly-3.3.0.jar","s3://<YOUR-BUCKET-NAME>/blog/BLOG_TPCDS-TEST-3T-partitioned","s3://<YOUR-BUCKET-NAME>/blog/OPTIMIZED_TPCDS-TEST-3T-RESULT","/opt/tpcds-kit/tools","parquet","3000","3","false","q1-v2.4,q10-v2.4,q11-v2.4,q12-v2.4,q13-v2.4,q14a-v2.4,q14b-v2.4,q15-v2.4,q16-v2.4,q17-v2.4,q18-v2.4,q19-v2.4,q2-v2.4,q20-v2.4,q21-v2.4,q22-v2.4,q23a-v2.4,q23b-v2.4,q24a-v2.4,q24b-v2.4,q25-v2.4,q26-v2.4,q27-v2.4,q28-v2.4,q29-v2.4,q3-v2.4,q30-v2.4,q31-v2.4,q32-v2.4,q33-v2.4,q34-v2.4,q35-v2.4,q36-v2.4,q37-v2.4,q38-v2.4,q39a-v2.4,q39b-v2.4,q4-v2.4,q40-v2.4,q41-v2.4,q42-v2.4,q43-v2.4,q44-v2.4,q45-v2.4,q46-v2.4,q47-v2.4,q48-v2.4,q49-v2.4,q5-v2.4,q50-v2.4,q51-v2.4,q52-v2.4,q53-v2.4,q54-v2.4,q55-v2.4,q56-v2.4,q57-v2.4,q58-v2.4,q59-v2.4,q6-v2.4,q60-v2.4,q61-v2.4,q62-v2.4,q63-v2.4,q64-v2.4,q65-v2.4,q66-v2.4,q67-v2.4,q68-v2.4,q69-v2.4,q7-v2.4,q70-v2.4,q71-v2.4,q72-v2.4,q73-v2.4,q74-v2.4,q75-v2.4,q76-v2.4,q77-v2.4,q78-v2.4,q79-v2.4,q8-v2.4,q80-v2.4,q81-v2.4,q82-v2.4,q83-v2.4,q84-v2.4,q85-v2.4,q86-v2.4,q87-v2.4,q88-v2.4,q89-v2.4,q9-v2.4,q90-v2.4,q91-v2.4,q92-v2.4,q93-v2.4,q94-v2.4,q95-v2.4,q96-v2.4,q97-v2.4,q98-v2.4,q99-v2.4,ss_max-v2.4","true"]'
```

Benchmark command parameters explained

The Amazon EMR Spark step uses the following parameters:

EMR step configuration:
- Type=Spark: Specifies this is a Spark application step
- Name="EMR-7.9-Baseline-Spark-3.5.5 Step": Human-readable name for the step
- ActionOnFailure=CONTINUE: Continue with other steps if this one fails
Spark submit arguments:
- --deploy-mode client: Run the driver on the primary node (client mode, not cluster mode)
- --conf spark.io.encryption.enabled=true: Enable Spark local storage encryption for shuffle and spill data
- --class com.amazonaws.eks.tpcds.BenchmarkSQL: Main class for the TPC-DS benchmark
Application parameters:
- JAR file: s3://<YOUR-BUCKET-NAME>/jar/spark-benchmark-assembly-3.3.0.jar
- Input data: s3://<YOUR-BUCKET-NAME>/blog/BLOG_TPCDS-TEST-3T-partitioned (3 TB TPC-DS dataset)
- Output location: s3://<YOUR-BUCKET-NAME>/blog/BASELINE_TPCDS-TEST-3T-RESULT (S3 path for results)
- TPC-DS tools path: /opt/tpcds-kit/tools (local path on EMR nodes)
- Format: parquet (output format)
- Scale factor: 3000 (3 TB dataset size)
- Iterations: 3 (run each query 3 times for averaging)
- Collect results: false (don’t collect results to driver)
- Query list: "q1-v2.4,q10-v2.4,...,ss_max-v2.4" (104 queries in total)
- Final parameter: true (enable detailed logging and metrics)
Query coverage:
- 103 standard TPC-DS benchmark queries (q1-v2.4 through q99-v2.4, with a and b variants of q14, q23, q24, and q39)
- Plus the ss_max-v2.4 query for additional testing, for 104 queries in total
- Each query runs 3 times to calculate average performance
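The query list is long enough that typing it by hand invites mistakes. As a sketch, it can be generated instead. `tpcds_query_list` is a hypothetical helper; the `-v2.4` suffix and the a/b variants simply mirror the list used in the steps above, and byte-order sorting is assumed only to reproduce the same ordering.

```shell
#!/usr/bin/env bash
# tpcds_query_list
# Prints the comma-separated query list: q1..q99 with a/b variants for
# q14, q23, q24, and q39, followed by ss_max, each suffixed with -v2.4.
# Hypothetical helper; names mirror the list used in this post.
tpcds_query_list() {
  local i
  for i in $(seq 1 99); do
    case "$i" in
      14|23|24|39) printf 'q%sa-v2.4\nq%sb-v2.4\n' "$i" "$i" ;;
      *)           printf 'q%s-v2.4\n' "$i" ;;
    esac
  done | LC_ALL=C sort | paste -sd, - | sed 's/$/,ss_max-v2.4/'
}

# Count the entries: 103 standard queries plus ss_max.
tpcds_query_list | tr ',' '\n' | wc -l
```

The output of `tpcds_query_list` can be substituted for the quoted query list in the `Args` array of either step.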

Summarize the results

Download the test result files from both output S3 locations:

```
# Baseline results
aws s3 cp s3://<YOUR-BUCKET-NAME>/blog/BASELINE_TPCDS-TEST-3T-RESULT/timestamp=xxxx/summary.csv/xxx.csv ./baseline-results.csv

# Optimized results
aws s3 cp s3://<YOUR-BUCKET-NAME>/blog/OPTIMIZED_TPCDS-TEST-3T-RESULT/timestamp=xxxx/summary.csv/xxx.csv ./optimized-results.csv
```

The CSV files contain four columns (without headers):
- Query name
- Median time (seconds)
- Minimum time (seconds)
- Maximum time (seconds)

Calculate performance metrics for comparison:
- Average time per query: AVERAGE(median, min, max) for each query
- Total runtime: Sum of all median times
- Geometric mean: GEOMEAN(average times) across all queries
- Speedup: Calculate the percentage improvement of optimized over baseline for each query

Create comparison analysis:
Speedup = (Baseline Time - Optimized Time) / Baseline Time * 100%
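The total runtime and geometric mean can also be computed directly from a downloaded file instead of a spreadsheet. This is a minimal sketch under the column layout described above. `summarize_results` is a hypothetical helper, and it assumes every time value is greater than zero.

```shell
#!/usr/bin/env bash
# summarize_results FILE
# Reads a headerless CSV of query,median,min,max (seconds) and prints the
# total runtime (sum of medians) and the geometric mean of per-query averages.
# Hypothetical helper; assumes all times are greater than zero.
summarize_results() {
  awk -F, 'NF >= 4 {
    total += $2                  # total runtime: sum of median times
    avg = ($2 + $3 + $4) / 3     # average time per query
    log_sum += log(avg)          # accumulate logs for the geometric mean
    n++
  }
  END {
    if (n == 0) exit 1           # no usable rows
    printf "total=%.2f geomean=%.2f\n", total, exp(log_sum / n)
  }' "$1"
}
```

Running `summarize_results ./baseline-results.csv` and `summarize_results ./optimized-results.csv` gives the two pairs of values to plug into the speedup formula.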

Testing configuration details

The following table summarizes the test environment used for this post:

Parameter	Value
EMR release	emr-7.9.0 (both configurations)
Baseline Spark version	3.5.5 (installed through bootstrap action)
Baseline bootstrap script	s3://spark-ba/install-spark-3-5-5-no-encryption.sh (public)
Optimized Spark version	Amazon EMR Spark runtime
Cluster size	9 nodes (1 primary and 8 core)
Instance type	r5d.4xlarge
vCPUs per node	16
Memory per node	128 GB
Instance storage	600 GB SSD
EBS volume	64 GB gp2 (2 volumes per instance)
Total vCPUs	144 (9 × 16)
Total memory	1152 GB (9 × 128)
Dataset	TPC-DS 3TB (Parquet format)
Queries	104 queries (TPC-DS v2.4)
Iterations	3 runs per query
DRA	Disabled for consistent benchmarking

Clean up

To avoid incurring future charges, delete the resources you created:

Terminate both EMR clusters:

```
aws emr terminate-clusters --cluster-ids <YOUR-BASELINE-CLUSTER-ID> <YOUR-OPTIMIZED-CLUSTER-ID>
```

Delete S3 test results if no longer needed:

```
aws s3 rm s3://<YOUR-BUCKET-NAME>/blog/BASELINE_TPCDS-TEST-3T-RESULT/ --recursive
aws s3 rm s3://<YOUR-BUCKET-NAME>/blog/OPTIMIZED_TPCDS-TEST-3T-RESULT/ --recursive
aws s3 rm s3://<YOUR-BUCKET-NAME>/logs/ --recursive
```

Remove the IAM roles if you created them specifically for testing.

Key findings

Up to 20% performance improvement using the Amazon EMR 7.9 Spark runtime with no code changes required
20% cost savings because of reduced runtime
Significant gains for shuffle-heavy, join-intensive workloads
100% API compatibility with open source Apache Spark
Simple migration from custom Spark builds to EMR runtime
Easy benchmarking using publicly available bootstrap scripts

Conclusion

You can run your Apache Spark workloads up to 20% faster and at lower cost without making any changes to your applications by using the Amazon EMR 7.9.0 optimized Spark runtime. This improvement is achieved through numerous optimizations in the EMR Spark runtime, including enhanced encryption handling, improved data serialization, and optimized shuffle operations.

To learn more about Amazon EMR 7.9 and best practices, see the EMR documentation. For configuration guidance and tuning advice, subscribe to the AWS Big Data Blog.

If you’re running Spark workloads on Amazon EMR today, we encourage you to test the EMR 7.9 Spark runtime with your production workloads and measure the improvements specific to your use case.

Your guide to AWS Analytics at AWS re:Invent 2025

2025-11-13 Sonu Kumar Singh

Post Syndicated from Sonu Kumar Singh original https://aws.amazon.com/blogs/big-data/your-guide-to-aws-analytics-at-aws-reinvent-2025/

It’s that time of year again — AWS re:Invent is here! At re:Invent, bold ideas come to life. Get a front-row seat to hear inspiring stories from AWS experts, customers, and leaders as they explore today’s most impactful topics, from data analytics to AI.

For all the data enthusiasts and professionals, we’ve curated a comprehensive guide to every analytics session to help you plan your perfect agenda. Make sure to secure your seat early for must-attend sessions via the attendee portal.

Pro tip: Even if a session shows as fully reserved, we encourage you to join the walk-up line at the session location. Based on previous years’ experiences, additional seats often become available due to no-shows or last-minute schedule changes. The walk-up line operates on a first-come, first-served basis, and many attendees have successfully accessed their desired sessions this way. Just be sure to arrive at least 15 minutes before the session starts for the best chance of getting a seat.

Can’t make it in person? No problem — grab a free virtual pass to stream live sessions from anywhere.

And don’t forget to stop by the AWS Kiosk in the AWS Village Expo for AWS Analytics, Amazon SageMaker, Amazon OpenSearch Service, and AWS Messaging and Streaming services! See live demos of analytics services, meet AWS experts, get your toughest data questions answered, explore the latest launches, join our data trivia, and even win exclusive AWS-authored books and more swag.

Data Innovation Talk

INV201 | Harnessing analytics for humans and AI

Emerging trends, ranging from Open Table Formats (OTF) to agentic infrastructure, are rapidly changing how humans and applications interact with analytics to drive mission-critical business decisions. Join Mai-Lan Tomsen Bukovec, VP of AWS Technology, to explore emerging trends, the evolution of analytics engines and applications, and how to future-proof your data foundation for the rapidly changing landscape of analytics at scale. Learn how AWS is transforming data and analytics services to lead in optimized data storage, querying, streaming, processing, and governance – for both human users and agentic infrastructure.

Breakouts

Dive into cutting-edge topics with re:Invent breakout sessions. These immersive, hour-long lectures are led by AWS experts and customers, offering you unparalleled insights and knowledge in a concise format. Whether you’re exploring the latest in cloud technology, AWS Analytics advancements, or industry-specific solutions, these sessions are designed to expand your horizons and inspire your next big idea.

Monday, Dec 1	Tuesday, Dec 2	Wednesday, Dec 3	Thursday, Dec 4
8:30 AM – 9:30 AM PST | Venetian | Level 3 | Lido 3106 | ANT203 | Enabling AI innovation with Amazon SageMaker Unified Studio	11:30 AM – 12:30 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Turquoise Theater | BIZ207 | Democratize access to insights with Amazon Quick Suite	8:30 AM – 9:30 AM PST | MGM | Level 1 | Grand 123 | ANT204 | Architecting the future: Amazon SageMaker as a data and AI platform	11:00 AM – 12:00 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Pink Theater | ANT317 | Modernize your data warehouse by moving to Amazon Redshift
8:30 AM – 9:30 AM PST | MGM | Level 3 | Chairman’s 366 | ANT318 | Scaling Amazon Redshift with a multi-warehouse architecture	11:30 AM – 12:30 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Pink Theater | ANT216 | What’s new with Amazon SageMaker in the era of unified data and AI	10:00 AM – 11:00 AM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Turquoise Theater | ANT335 | Agentic data engineering with AWS Analytics MCP Servers	11:30 AM – 12:30 PM PST | Wynn | Upper Convention Promenade | Cristal 7 | ANT328 | Data Processing architectures for building AI solutions
9:00 AM – 10:00 AM PST | Wynn | Convention Promenade | Lafite 7 | Content Hub | Mint Green Theater | ANT307 | Operating Apache Kafka and Apache Flink at scale	1:30 PM – 2:30 PM PST | MGM | Level 3 | Chairman’s 364 | BIZ203 | Amazon’s journey deploying Quick Suite across thousands of users	10:00 AM – 11:00 AM PST | Wynn | Upper Convention Promenade | Bollinger | ANT304 | Build an AI-ready data foundation	1:00 PM – 2:00 PM PST | MGM | Level 1 | Grand 122 | BIZ227 | Generate new revenue streams with Amazon Quick Sight embedded
10:00 AM – 11:00 AM PST | Wynn | Upper Convention Promenade | Bollinger | BIZ331 | Build robust data foundations to power enterprise AI and BI	1:30 PM – 2:30 PM PST | Wynn | Convention Promenade | Lafite 7 | Content Hub | Mint Green Theater | ANT206 | What’s new in Amazon Redshift and Amazon Athena	11:30 AM – 12:30 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Turquoise Theater | ANT424 | Autonomous agents powered by streaming data and Retrieval Augmented Generation	2:00 PM – 3:00 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Turquoise Theater | ANT343 | Best practices for building Apache Iceberg based lakehouse architectures on AWS
10:00 AM – 11:00 AM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Pink Theater | ANT209 | Universal data connectivity with ETL and SQL queries	4:00 PM – 5:00 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Turquoise Theater | ANT308 | Explore what’s new in data and AI governance with SageMaker Catalog	11:30 AM – 12:30 PM PST | Wynn | Convention Promenade | Lafite 7 | Content Hub | Pink Theater | ANT310 | Powering your Agentic AI experience with AWS Streaming and Messaging	4:00 PM – 5:00 PM PST | Mandalay Bay | Level 3 South | South Seas E | ANT344 | Build, govern, and share Amazon Quick Suite dashboards with Amazon SageMaker
10:30 AM – 11:30 AM PST | MGM | Level 1 | Grand 116 | ANT314 | Build Advanced Search with Vector, Hybrid, and AI Techniques	4:30 PM – 5:30 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Mint Green Theater | ANT305 | Innovations in AWS analytics: Data processing	2:30 PM – 3:30 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Pink Theater | ANT315 | Intelligent Observability and Modernization with Amazon OpenSearch Service	4:00 PM – 5:00 PM PST | Wynn | Convention Promenade | Lafite 7 | Content Hub | Orange Theater | DAT445 | Deep dive into databases zero-ETL integrations
12:00 PM – 1:00 PM PST | MGM | Level 3 | Chairman’s 360 | ANT336 | Enterprise-scale ETL optimization for Apache Spark	.	3:00 PM – 4:00 PM PST | MGM | Level 1 | Grand 122 | ANT309 | Accelerate analytics and AI with an open and secure lakehouse architecture	.
12:00 PM – 1:00 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Orange Theater | ANT339 | Turn unstructured data in Amazon S3 into AI-ready assets with SageMaker Catalog	.	.	.
1:00 PM – 2:00 PM PST | Mandalay Bay | Level 2 South | Oceanside C | Content Hub | Pink Theater | ANT201 | What’s new in search, observability, and vector databases with OpenSearch	.	.	.
1:30 PM – 2:30 PM PST | Wynn | Convention Promenade | Lafite 7 | Content Hub | Orange Theater | BIZ228 | Reimagine business intelligence with Amazon Quick Sight	.	.	.
1:30 PM – 2:30 PM PST | Mandalay Bay | Level 3 South | South Seas E | OPN413 | Transforming Apache Kafka into a Scalable Message Queue	.	.	.
5:30 PM – 6:30 PM PST | Mandalay Bay | Level 3 South | South Seas F | ANT423 | Amazon Kinesis Data Streams under the hood	.	.	.

Chalk talks

These hour-long, highly engaging sessions offer a unique blend of expert insight and collaborative learning. An AWS specialist kicks off with a concise, informative lecture, setting the stage for an in-depth, interactive Q&A. With a limited audience size, you’ll have the opportunity to dive deep into topics, ask pressing questions, and engage in meaningful discussions with both the presenter and fellow attendees.

Monday, Dec 1	Tuesday, Dec 2	Wednesday, Dec 3	Thursday, Dec 4	Friday, Dec 5
8:30 AM – 9:30 AM PST \| MGM \| Level 1 \| Boulevard 167 ANT301-R1 \| Accelerating the shift from batch to real-time streaming	11:30 AM – 12:30 PM PST \| Caesars Forum \| Level 1 \| Academy 411 ANT302-R1 \| Accelerate GenAI-powered data discovery and sharing with SageMaker Catalog	9:00 AM – 10:00 AM PST \| MGM \| Level 3 \| Room 353 ANT301-R \| Accelerating the shift from batch to real-time streaming	11:30 AM – 12:30 PM PST \| MGM \| Level 3 \| Room 353 ANT207 \| Develop with natural language and agentic AI in Amazon SageMaker Unified Studio	10:30 AM – 11:30 AM PST \| Caesars Forum \| Level 1 \| Summit 221 ANT331 \| Optimize Cost and Performance in Amazon OpenSearch Service
8:30 AM – 9:30 AM PST \| Mandalay Bay \| Level 2 South \| Reef C ANT347 \| Build a secure and regulated data foundation for AI	11:30 AM – 12:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas A ANT217 \| Build data pipelines in minutes with the Amazon SageMaker Visual experience	9:00 AM – 10:00 AM PST \| Mandalay Bay \| Level 3 South \| South Seas H ANT319-R1 \| Optimizing Apache Spark workloads with AWS Analytics	12:30 PM – 1:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas A ANT346 \| Architectural blueprints for your lakehouse in Amazon SageMaker	.
10:00 AM – 11:00 AM PST \| Mandalay Bay \| Level 3 South \| South Seas A ANT420-R \| AI-driven scaling in Amazon Redshift Serverless	12:00 PM – 1:00 PM PST \| Caesars Forum \| Level 1 \| Alliance 305 ANT301-R2 \| Accelerating the shift from batch to real-time streaming	10:00 AM – 11:00 AM PST \| MGM \| Level 1 \| Boulevard 158 ANT321 \| Top 10 tips to improve query performance in Amazon Redshift	2:00 PM – 3:00 PM PST \| MGM \| Level 1 \| Room 101 ANT303 \| Implement data pipelines for analytics using Amazon SageMaker Unified Studio	.
10:30 AM – 11:30 AM PST \| Wynn \| Convention Promenade \| Latour 5 ANT302-R \| Accelerate GenAI-powered data discovery and sharing with SageMaker Catalog	1:00 PM – 2:00 PM PST \| Mandalay Bay \| Level 2 South \| Lagoon G ANT330-R \| Design and build Intelligent Observability with Amazon OpenSearch Service	10:00 AM – 11:00 AM PST \| Wynn \| Convention Promenade \| La Tache 2 ANT320 \| Strengthening security for Apache Spark workloads	2:00 PM – 3:00 PM PST \| Mandalay Bay \| Level 3 South \| South Seas J ANT322 \| Architectural patterns for real-time data analytics on AWS	.
11:30 AM – 12:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas A ANT338 \| Bring unified analytics to your data warehouse with the lakehouse architecture	1:30 PM – 2:30 PM PST \| MGM \| Level 3 \| Premier 320 ANT325-R1 \| A deep dive into AI/ML development in SageMaker Unified Studio	11:30 AM – 12:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas C ANT332 \| Building high-quality data products for AI Agents	3:30 PM – 4:30 PM PST \| MGM \| Level 1 \| Room 101 ANT337 \| Breaking data silos with the lakehouse architecture	.
11:30 AM – 12:30 PM PST \| Wynn \| Convention Promenade \| Montrachet 1 BIZ323 \| Design AI-powered BI architectures for modern enterprises with Amazon Quick Suite	2:30 PM – 3:30 PM PST \| Mandalay Bay \| Level 2 South \| Lagoon G ANT420-R1 \| AI-driven scaling in Amazon Redshift Serverless	1:00 PM – 2:00 PM PST \| Mandalay Bay \| Level 3 South \| South Seas C ANT340 \| Deep dive into data processing in SageMaker Unified Studio	.	.
1:30 PM – 2:30 PM PST \| MGM \| Level 3 \| Room 353 ANT325-R \| A deep dive into AI/ML development in SageMaker Unified Studio	2:30 PM – 3:30 PM PST \| Mandalay Bay \| Lower Level North \| South Pacific B ANT341 \| Build trust in AI with end-to-end data lineage in Amazon SageMaker Catalog	2:30 PM – 3:30 PM PST \| MGM \| Level 3 \| Chairman’s 356 ANT345 \| Building secure and scalable lakehouses for the future	.	.
2:30 PM – 3:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas A ANT329 \| Build Advanced AI-powered Search with OpenSearch MCP and Vectors	2:30 PM – 3:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas C BIZ327 \| Bridge data silos to unlock complete insights with Amazon Quick Suite	2:30 PM – 3:30 PM PST \| Mandalay Bay \| Level 3 South \| South Seas J ANT413 \| Upgrade Amazon DataZone to Amazon SageMaker Catalog for analytics and AI	.	.
3:00 PM – 4:00 PM PST \| MGM \| Level 3 \| Premier 320 BIZ319 \| Beyond chatbots: Discover conversational AI in Amazon Quick Suite	3:00 PM – 4:00 PM PST \| Wynn \| Convention Promenade \| Latour 5 ANT421 \| Advanced Stream Processing with Apache Flink	4:00 PM – 5:00 PM PST \| MGM \| Level 3 \| Room 350 ANT324 \| Building Pipelines for Analytics, ML and AI in Amazon SageMaker Unified Studio	.	.
4:00 PM – 5:00 PM PST \| MGM \| Level 3 \| Chairman’s 356 ANT422 \| Building Resilient Multi-Tenant Messaging with Amazon SQS	4:00 PM – 5:00 PM PST \| Mandalay Bay \| Level 2 South \| Reef C ANT319-R \| Optimizing Apache Spark workloads with AWS Analytics	4:00 PM – 5:00 PM PST \| Mandalay Bay \| Level 3 South \| South Seas C ANT323 \| Mastering materialized views: tips for fast, low-latency queries in Redshift	.	.
4:30 PM – 5:30 PM PST \| Caesars Forum \| Level 1 \| Alliance 305 ANT330-R1 \| Design and build Intelligent Observability with Amazon OpenSearch Service	5:30 PM – 6:30 PM PST \| MGM \| Level 3 \| Room 350 ANT326 \| Mastering data transformations with Amazon Athena
5:30 PM – 6:30 PM PST \| MGM \| Level 1 \| Boulevard 167 ANT316 \| Orchestrating with Apache Airflow, MWAA, and SageMaker Unified Studio


Builders’ sessions

Immerse yourself in our builders’ sessions – a hands-on learning experience designed to elevate your AWS skills. These focused, hour-long workshops bring together a small group of up to ten attendees with a dedicated AWS expert at each table.

Monday, Dec 1	Tuesday, Dec 2	Wednesday, Dec 3	Thursday, Dec 4
8:30 AM – 9:30 AM PST \| Wynn \| Convention Promenade \| Latour 7 ANT407-R1 \| Building event-driven applications with AWS Streaming and Messaging	11:30 AM – 12:30 PM PST \| MGM \| Level 1 \| Room 104 ANT415 \| Securely monetize your data with Amazon Redshift	1:00 PM – 2:00 PM PST \| Mandalay Bay \| Lower Level North \| Islander H ANT407-R \| Building event-driven applications with AWS Streaming and Messaging	12:30 PM – 1:30 PM PST \| Mandalay Bay \| Level 2 South \| Oceanside C \| Content Hub \| Builders’ Session 1 ANT409 \| Getting hands on with zero-ETL and data federation
11:30 AM – 12:30 PM PST \| MGM \| Level 1 \| Room 104 ANT410-R \| Integrate and orchestrate data workflows with AWS Glue & MWAA	2:30 PM – 3:30 PM PST \| MGM \| Level 3 \| Room 304 ANT405-R1 \| Build high performance Apache Iceberg data lakes with Amazon S3 Tables	1:00 PM – 2:00 PM PST \| Wynn \| Convention Promenade \| Latour 7 ANT406-R \| Build trust in your data with Amazon SageMaker Catalog	2:00 PM – 3:00 PM PST \| Mandalay Bay \| Lower Level North \| Islander H ANT419-R \| Vector search with Amazon OpenSearch Service
11:30 AM – 12:30 PM PST \| Wynn \| Convention Promenade \| Latour 7 ANT406-R1 \| Build trust in your data with Amazon SageMaker Catalog	4:30 PM – 5:30 PM PST \| Mandalay Bay \| Level 2 South \| Oceanside C \| Content Hub \| Builders’ Session 2 ANT410-R1 \| Integrate and orchestrate data workflows with AWS Glue & MWAA	4:00 PM – 5:00 PM PST \| MGM \| Level 3 \| Room 304 ANT419-R1 \| Vector search with Amazon OpenSearch Service	3:30 PM – 4:30 PM PST \| Caesars Forum \| Level 1 \| Alliance 315 OPN407-R1 \| Performance tuning for streaming Ingestion into Apache Iceberg
2:30 PM – 3:30 PM PST \| MGM \| Level 1 \| Room 104 ANT408 \| Data analytics for financial organizations with Amazon SageMaker	.	.	.
3:00 PM – 4:00 PM PST \| Caesars Forum \| Level 1 \| Alliance 311 OPN407-R \| Performance tuning for streaming Ingestion into Apache Iceberg	.	.	.
4:00 PM – 5:00 PM PST \| Mandalay Bay \| Lower Level North \| Islander H ANT405-R \| Build high performance Apache Iceberg data lakes with Amazon S3 Tables	.	.	.


Workshops

Roll up your sleeves in our dynamic 2-hour workshops, where you’ll tackle real-world challenges using AWS services. These interactive sessions kick off with a brief, informative lecture to set the stage, then quickly transition into hands-on problem-solving. Bring your laptop and prepare to build alongside AWS experts, who will guide you through practical applications of cloud computing concepts. Whether you’re new to AWS or looking to sharpen your skills, these workshops offer a unique opportunity to learn by doing, enabling you to leave with confidence and applicable knowledge in AWS technologies.

Monday, Dec 1	Tuesday, Dec 2	Wednesday, Dec 3	Thursday, Dec 4
8:00 AM – 10:00 AM PST \| Mandalay Bay \| Lower Level North \| Islander C ANT402-R1 \| Build a fraud detection system with Amazon SageMaker Unified Studio	12:00 PM – 2:00 PM PST \| MGM \| Level 3 \| Premier 317 ANT418 \| Unleash Apache Kafka’s elasticity and cost-efficiency with Amazon MSK	8:30 AM – 10:30 AM PST \| Mandalay Bay \| Lower Level North \| Islander C ANT402-R \| Build a fraud detection system with Amazon SageMaker Unified Studio	12:00 PM – 2:00 PM PST \| MGM \| Level 3 \| Premier 317 ANT412 \| Power streaming analytics on AWS with AI-driven insights
8:00 AM – 10:00 AM PST \| Mandalay Bay \| Level 2 South \| Mandalay Bay Ballroom H ANT403 \| Building Production-Ready Data Systems for AI Applications	12:30 PM – 2:30 PM PST \| MGM \| Level 3 \| Chairman’s 368 ANT404-R1 \| Build modern data applications with the lakehouse architecture on AWS	8:30 AM – 10:30 AM PST \| Caesars Forum \| Level 1 \| Alliance 308 BIZ204-R1 \| Experience AI-powered BI with Amazon Quick Suite	3:00 PM – 5:00 PM PST \| Mandalay Bay \| Lower Level North \| Islander C ANT416 \| Solve complex data and AI governance challenges with Amazon SageMaker Catalog
8:30 AM – 10:30 AM PST \| Wynn \| Upper Convention Promenade \| Cristal 3 BIZ306 \| Create agentic AI chat experiences with Amazon Quick Suite	3:00 PM – 5:00 PM PST \| Mandalay Bay \| Level 2 South \| Mandalay Bay Ballroom K ANT411 \| Low-cost logging and observability with Amazon OpenSearch Service	12:30 PM – 2:30 PM PST \| MGM \| Level 1 \| Grand 113 ANT404-R \| Build modern data applications with the lakehouse architecture on AWS	.
12:00 PM – 2:00 PM PST \| MGM \| Level 3 \| Premier 317 ANT417 \| Simplifying data interoperability with the lakehouse architecture on AWS	3:00 PM – 5:00 PM PST \| Wynn \| Upper Convention Promenade \| Cristal 1 BIZ204-R \| Experience AI-powered BI with Amazon Quick Suite	3:30 PM – 5:30 PM PST \| Mandalay Bay \| Lower Level North \| Islander C ANT401 \| Build an AI-powered enterprise search with Amazon OpenSearch Service	.
3:00 PM – 5:00 PM PST \| Mandalay Bay \| Level 2 South \| Mandalay Bay Ballroom K ANT414 \| Scale intelligent analytics with Amazon Redshift multi-cluster architectures	.	.	.


Lightning Talks

Located in the Expo Hall, each of these 20-minute theater presentations is dedicated to a specific customer story, service demo, or AWS Partner offering.

Monday, Dec 1	Tuesday, Dec 2	Wednesday, Dec 3	Thursday, Dec 4
5:00 PM – 5:20 PM PST \| Venetian \| Level 2 \| Hall B \| Expo \| Theater 4 ANT334 \| High-performance NLP & geospatial analysis with Redshift	.	3:00 PM – 3:20 PM PST \| Mandalay Bay \| Level 2 South \| Oceanside C \| Content Hub \| Lightning Theater ANT333 \| Fast-track to insights: AWS-SAP data strategy	12:30 PM – 12:50 PM PST \| Venetian \| Level 2 \| Hall B \| Expo \| Theater 3 ANT342 \| ITTI’s Cross-Company Data Mesh Blueprint with Amazon SageMaker
6:00 PM – 6:20 PM PST \| Venetian \| Level 2 \| Hall B \| Expo \| Theater 3 ANT348 \| Seamless data sharing in Amazon Redshift	.	.	.


Conclusion

We hope this post acts as your go-to resource for navigating the AWS analytics track at re:Invent 2025. To stay up to date on the latest trends and advancements in AWS Analytics, follow our LinkedIn page.
