All posts by Anurag Srivastava

Best practices for migrating from Apache Airflow 2.x to Apache Airflow 3.x on Amazon MWAA

Post Syndicated from Anurag Srivastava original https://aws.amazon.com/blogs/big-data/best-practices-for-migrating-from-apache-airflow-2-x-to-apache-airflow-3-x-on-amazon-mwaa/

Apache Airflow 3.x on Amazon MWAA introduces architectural improvements such as API-based task execution that provides enhanced security and isolation. Other major updates include a redesigned UI for better user experience, scheduler-based backfills for improved performance, and support for Python 3.12. Unlike in-place minor Airflow version upgrades in Amazon MWAA, upgrading to Airflow 3 from Airflow 2 requires careful planning and execution through a migration approach due to fundamental breaking changes.

This migration presents an opportunity to embrace next-generation workflow orchestration capabilities while preserving business continuity. However, it’s more than a simple upgrade. Organizations migrating to Airflow 3.x on Amazon MWAA must understand key breaking changes, including the removal of direct metadata database access from workers, deprecation of SubDAGs, changes to default scheduling behavior, and library dependency updates. This post provides best practices and a streamlined approach for navigating this migration with minimal disruption to your mission-critical data pipelines while making the most of the enhanced capabilities of Airflow 3.

Understanding the migration process

The journey from Airflow 2.x to 3.x on Amazon MWAA introduces several fundamental changes that organizations must understand before beginning their migration. These changes affect core workflow operations and require careful planning to achieve a smooth transition.

You should be aware of the following breaking changes:

  • Removal of direct database access – A critical change in Airflow 3 is the removal of direct metadata database access from worker nodes. Tasks and custom operators must now communicate through the REST API instead of direct database connections. This architectural change affects code that previously accessed the metadata database directly through SQLAlchemy connections, requiring refactoring of existing DAGs and custom operators.
  • SubDAG deprecation – Airflow 3 removes the SubDAG construct in favor of TaskGroups, Assets, and Data Aware Scheduling. Organizations must refactor existing SubDAGs to one of the previously mentioned constructs.
  • Scheduling behavior changes – Two notable changes to default scheduling options require an impact analysis:
    • The default values for catchup_by_default and create_cron_data_intervals changed to False. This change affects DAGs that don’t explicitly set these options.
    • Airflow 3 removes several context variables, such as execution_date, tomorrow_ds, yesterday_ds, prev_ds, and next_ds. You must replace these variables with currently supported context variables.
  • Library and dependency changes – A significant number of libraries change in Airflow 3.x, requiring DAG code refactoring. Many previously included provider packages might need explicit addition to the requirements.txt file.
  • REST API changes – The REST API path changes from /api/v1 to /api/v2, affecting external integrations. For more information about using the Airflow REST API, see Creating a web server session token and calling the Apache Airflow REST API.
  • Authentication system – Although Airflow 3.0.1 and later versions default to SimpleAuthManager instead of Flask-AppBuilder, Amazon MWAA will continue using Flask-AppBuilder for Airflow 3.x. This means customers on Amazon MWAA will not see any authentication changes.

The migration requires creating a new environment rather than performing an in-place upgrade. Although this approach demands more planning and resources, it provides the advantage of maintaining your existing environment as a fallback option during the transition, facilitating business continuity throughout the migration process.

Pre-migration planning and assessment

Successful migration depends on thorough planning and assessment of your current environment. This phase establishes the foundation for a smooth transition by identifying dependencies, configurations, and potential compatibility issues. Evaluate your environment and code against the previously mentioned breaking changes to have a successful migration.

Environment assessment

Begin by conducting a complete inventory of your current Amazon MWAA environment. Document all DAGs, custom operators, plugins, and dependencies, including their specific versions and configurations. Make sure your current environment is on version 2.10.x, because this provides the best compatibility path for upgrading to Amazon MWAA with Airflow 3.x.

Identify the structure of the Amazon Simple Storage Service (Amazon S3) bucket containing your DAG code, requirements file, startup script, and plugins. You will replicate this structure in a new bucket for the new environment. Creating separate buckets for each environment avoids conflicts and allows continued development without affecting current pipelines.

Configuration documentation

Document all custom Amazon MWAA environment variables, Airflow connections, and environment configurations. Review AWS Identity and Access Management (IAM) resources, because your new environment’s execution role will need identical policies. IAM users or roles accessing the Airflow UI require the CreateWebLoginToken permission for the new environment.

Pipeline dependencies

Understanding pipeline dependencies is critical for a successful phased migration. Identify interdependencies through Datasets (now Assets), SubDAGs, TriggerDagRun operators, or external API interactions. Develop your migration plan around these dependencies so related DAGs can migrate at the same time.

Consider DAG scheduling frequency when planning migration waves. DAGs with longer intervals between runs provide larger migration windows and lower risk of duplicate execution compared with frequently running DAGs.
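As a rough illustration of planning around interdependencies, the following sketch groups DAGs so that every pair linked by a dependency lands in the same migration group. The DAG names, the dependency pairs, and the helper `group_dependent_dags` are hypothetical; in practice you would build the dependency list from your own inventory of Assets, SubDAGs, and trigger relationships.

```python
from collections import defaultdict


def group_dependent_dags(dag_ids, dependencies):
    """Group DAGs so that DAGs linked by any dependency migrate together."""
    # Direction does not matter for co-migration, so treat edges as undirected.
    neighbors = defaultdict(set)
    for upstream, downstream in dependencies:
        neighbors[upstream].add(downstream)
        neighbors[downstream].add(upstream)

    groups, seen = [], set()
    for dag_id in sorted(dag_ids):
        if dag_id in seen:
            continue
        # Depth-first walk collects everything reachable from this DAG.
        stack, component = [dag_id], []
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.append(current)
            stack.extend(neighbors[current] - seen)
        groups.append(sorted(component))
    return groups


# Hypothetical inventory, for illustration only.
dag_ids = ["ingest_orders", "build_reports", "cleanup_logs", "train_model", "score_model"]
dependencies = [("ingest_orders", "build_reports"), ("train_model", "score_model")]

migration_groups = group_dependent_dags(dag_ids, dependencies)
print(migration_groups)
```

Each resulting group can then be assigned to a wave as a unit, with scheduling frequency and business criticality deciding the order of the waves.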

Testing strategy

Create your testing strategy by defining a systematic approach to identifying compatibility issues. Use the ruff linter with the AIR30 ruleset to automatically identify code requiring updates:

ruff check --preview --select AIR30 <path_to_your_dag_code>

Then, review and update your environment’s requirements.txt file to make sure package versions comply with the updated constraints file. Additionally, commonly used Operators previously included in the airflow-core package now reside in a separate package and need to be added to your requirements file.

Test your DAGs using the Amazon MWAA Docker images for Airflow 3.x. These images make it possible to create and test your requirements file, and confirm the Scheduler successfully parses your DAGs.

Migration strategy and best practices

A methodical migration approach minimizes risk while providing clear validation checkpoints. The recommended strategy employs a phased blue/green deployment model that provides reliable migrations and immediate rollback capabilities.

Phased migration approach

The following migration phases can assist you in defining your migration plan:

  • Phase 1: Discovery, assessment, and planning – In this phase, complete your environment inventory, dependency mapping, and breaking change analysis. With the gathered information, develop the detailed migration plan. This plan will include steps for updating code, updating your requirements file, creating a test environment, testing, creating the blue/green environment (discussed later in this post), and the migration steps. Planning must also include the training, monitoring strategy, rollback conditions, and the rollback plan.
  • Phase 2: Pilot migration – The pilot migration phase serves to validate your detailed migration plan in a controlled environment with a small range of impact. Focus the pilot on two or three non-critical DAGs with diverse characteristics, such as different schedules and dependencies. Migrate the selected DAGs using the migration plan defined in the previous phase. Use this phase to validate your plan and monitoring tools, and adjust both based on actual results. During the pilot, establish baseline migration metrics to help predict the performance of the full migration.
  • Phase 3: Wave-based production migration – After a successful pilot, you are ready to begin the full wave-based migration for the remaining DAGs. Group remaining DAGs into logical waves based on business criticality (least critical first), technical complexity, interdependencies (migrate dependent DAGs together), and scheduling frequency (less frequent DAGs provide larger migration windows). After you define the waves, work with stakeholders to develop the wave schedule. Include sufficient validation periods between waves to confirm the wave is successful before starting the next wave. This time also reduces the range of impact in the event of a migration issue, and provides sufficient time to perform a rollback.
  • Phase 4: Post-migration review and decommissioning – After all waves are complete, conduct a post-migration review to identify lessons learned, optimization opportunities, and any unresolved items. This is also a good time to obtain formal approval of system stability. The final step, after stability is confirmed based on business requirements and stakeholder input, is decommissioning the original Airflow 2.x (blue) environment.

Blue/green deployment strategy

Implement a blue/green deployment strategy for safe, reversible migration. With this strategy, you will have two Amazon MWAA environments operating during the migration and manage which DAGs operate in which environment.

The blue environment (current Airflow 2.x) maintains production workloads during transition. You can implement a freeze window for DAG changes before migration to avoid last-minute code conflicts. This environment serves as the immediate rollback environment if an issue is identified in the new (green) environment.

The green environment (new Airflow 3.x) receives migrated DAGs in controlled waves. It mirrors the networking, IAM roles, and security configurations from the blue environment. Configure this environment with the same options as the blue environment, and create identical monitoring mechanisms so both environments can be monitored simultaneously. To avoid duplicate DAG runs, make sure a DAG only runs in a single environment. This involves pausing the DAG in the blue environment before activating the DAG in the green environment.

Maintain the blue environment in warm standby mode during the entire migration. Document specific rollback steps for each migration wave, and test your rollback procedure for at least one non-critical DAG. Additionally, define clear criteria for triggering the rollback (such as specific failure rates or SLA violations).
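Rollback criteria are easier to apply consistently when they are written down as explicit thresholds. The following is a minimal sketch of one way to encode them; the class `RollbackCriteria`, the function `should_roll_back`, and the threshold values are illustrative assumptions, and the real numbers should come from your stakeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackCriteria:
    max_failure_rate: float  # fraction of failed task instances, for example 0.05 for 5%
    max_sla_violations: int  # SLA violations tolerated during the validation period


def should_roll_back(failed_tasks, total_tasks, sla_violations, criteria):
    """Return True when a wave breaches any of the agreed rollback thresholds."""
    failure_rate = failed_tasks / total_tasks if total_tasks else 0.0
    return (
        failure_rate > criteria.max_failure_rate
        or sla_violations > criteria.max_sla_violations
    )


# Hypothetical thresholds and observations for two migration waves.
criteria = RollbackCriteria(max_failure_rate=0.05, max_sla_violations=2)
healthy_wave = should_roll_back(failed_tasks=2, total_tasks=100, sla_violations=0, criteria=criteria)
failing_wave = should_roll_back(failed_tasks=8, total_tasks=100, sla_violations=0, criteria=criteria)
print(healthy_wave, failing_wave)
```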

Step-by-step migration process

This section provides detailed steps for conducting the migration.

Pre-migration assessment and preparation

Before initiating the migration process, conduct a thorough assessment of your current environment and develop the migration plan:

  • Make sure your current Amazon MWAA environment is on version 2.10.x
  • Create a detailed inventory of your DAGs, custom operators, and plugins including their dependencies and versions
  • Review your current requirements.txt file to understand package requirements
  • Document all environment variables, connections, and configuration settings
  • Review the Apache Airflow 3.x release notes to understand breaking changes
  • Determine your migration success criteria, rollback conditions, and rollback plan
  • Identify a small number of DAGs suitable for the pilot migration
  • Develop a plan to train, or familiarize, Amazon MWAA users on Airflow 3

Compatibility checks

Identifying compatibility issues is critical to a successful migration. This step helps developers focus on specific code that is incompatible with Airflow 3.

Use the ruff linter with the AIR30 ruleset to automatically identify code requiring updates:

ruff check --preview --select AIR30 <path_to_your_dag_code>

Additionally, review your code for instances of direct metadata database access.
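A simple text scan can help with this review. The sketch below flags lines that match a few patterns commonly associated with direct metadata database access in Airflow 2.x code. The pattern list and the helper `find_direct_db_access` are illustrative assumptions, not an exhaustive rule set, so treat the output as a starting point for manual review.

```python
import re

# Illustrative patterns only; extend this list for your own code base.
DIRECT_DB_PATTERNS = {
    "settings.Session": re.compile(r"from\s+airflow\.settings\s+import\s+.*\bSession\b"),
    "provide_session": re.compile(r"\bprovide_session\b"),
    "create_session": re.compile(r"\bcreate_session\b"),
    "session.query": re.compile(r"\bsession\.query\("),
}


def find_direct_db_access(source):
    """Return (line_number, pattern_label) pairs for suspicious lines."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in DIRECT_DB_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings


# Sample DAG source, mirroring the Airflow 2.x pattern that must be refactored.
sample_dag_source = """\
from airflow.settings import Session
from airflow.models.taskinstance import TaskInstance

session = Session()
result = session.query(TaskInstance)
"""

findings = find_direct_db_access(sample_dag_source)
print(findings)
```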

DAG code updates

Based on your findings during compatibility testing, update the affected DAG code for Airflow 3.x. The ruff DAG check utility can automatically fix common changes. Use the following command to run the utility in update mode:

ruff check dag/ --select AIR301 --fix --preview

Common changes include:

  • Replace direct metadata database access with API calls:
    # Before (Airflow 2.x) - Direct DB access
    from airflow.settings import Session
    from airflow.models.taskinstance import TaskInstance
    session = Session()
    result = session.query(TaskInstance)

    For Apache Airflow v3.x, utilize  in the Amazon MWAA SDK.

  • Update core construct imports with the new Airflow SDK namespace:
    # Before (Airflow 2.x)
    from airflow.decorators import dag, task

    # After (Airflow 3.x)
    from airflow.sdk import dag, task

  • Replace deprecated context variables with their modern equivalents:
    # Before (Airflow 2.x)
    def my_task(execution_date, **context):
        # Using execution_date
        print(f"Processing data for {execution_date}")

    # After (Airflow 3.x)
    def my_task(logical_date, **context):
        # Using logical_date
        print(f"Processing data for {logical_date}")

Next, evaluate how the two scheduling-related default changes affect your DAGs. catchup_by_default is now False, meaning missed DAG runs are no longer backfilled automatically. If a DAG requires backfill, set catchup=True in its definition, and consider how backfilling interacts with this migration. Because you’re migrating the DAG to a clean environment with no run history, enabling catchup creates DAG runs for every schedule interval beginning with the specified start_date. Consider updating the start_date to avoid unnecessary runs.
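To get a feel for how many runs an old start_date can generate in a clean environment, the following sketch estimates the number of daily intervals between a start_date and the cutover time. The helper `count_daily_catchup_runs` and the dates are hypothetical, and the result is a rough estimate that ignores timetable details.

```python
from datetime import datetime, timedelta, timezone


def count_daily_catchup_runs(start_date, now):
    """Estimate how many daily runs catchup would create between two points in time."""
    if now <= start_date:
        return 0
    # Whole days elapsed; each one corresponds to roughly one scheduled run.
    return (now - start_date) // timedelta(days=1)


cutover = datetime(2025, 1, 1, tzinfo=timezone.utc)

# Original start_date carried over from the Airflow 2.x DAG definition.
runs_with_old_start = count_daily_catchup_runs(datetime(2023, 1, 1, tzinfo=timezone.utc), cutover)

# start_date moved forward to cover only the runs that still need backfilling.
runs_with_new_start = count_daily_catchup_runs(datetime(2024, 12, 25, tzinfo=timezone.utc), cutover)

print(runs_with_old_start, runs_with_new_start)
```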

create_cron_data_intervals is also now False. With this change, cron expressions are evaluated as a CronTriggerTimetable construct.
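The practical effect shows up in the dates a run receives. The sketch below models a daily midnight schedule with plain datetime arithmetic, under the assumption that the data-interval style assigns the previous day as the interval and its start as the logical date, whereas the trigger style uses the trigger time itself for all three values. The function `daily_run_dates` is a hypothetical model, not Airflow code, so verify the behavior against your target Airflow version.

```python
from datetime import datetime, timedelta, timezone


def daily_run_dates(trigger_time, data_intervals):
    """Return (logical_date, interval_start, interval_end) for a daily midnight run."""
    if data_intervals:
        # Data-interval style: the run covers the day that just ended,
        # and logical_date marks the start of that interval.
        interval_start = trigger_time - timedelta(days=1)
        return interval_start, interval_start, trigger_time
    # Trigger style: logical_date and both interval bounds equal the trigger time.
    return trigger_time, trigger_time, trigger_time


trigger_time = datetime(2025, 1, 2, tzinfo=timezone.utc)
interval_style = daily_run_dates(trigger_time, data_intervals=True)
trigger_style = daily_run_dates(trigger_time, data_intervals=False)

print("interval style logical_date:", interval_style[0].date())
print("trigger style logical_date: ", trigger_style[0].date())
```

DAGs whose logic derives a processing window from the logical date may therefore shift by one interval after migration unless the code or the timetable setting is adjusted.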

Finally, evaluate the usage of deprecated context variables for manually and Asset-triggered DAGs, then update your code with suitable replacements.
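Where a task relied on yesterday_ds or tomorrow_ds, the same strings can be derived from a date the task still has. The following is a minimal sketch, assuming the task receives a logical_date; the helper `relative_date_strings` is hypothetical. Variables such as prev_ds and next_ds depend on the DAG’s schedule and are not covered here.

```python
from datetime import datetime, timedelta, timezone


def relative_date_strings(logical_date):
    """Rebuild the removed yesterday_ds and tomorrow_ds values as YYYY-MM-DD strings."""
    one_day = timedelta(days=1)
    return {
        "yesterday_ds": (logical_date - one_day).strftime("%Y-%m-%d"),
        "tomorrow_ds": (logical_date + one_day).strftime("%Y-%m-%d"),
    }


# Example with a fixed date; inside a task, pass the logical_date from the context.
dates = relative_date_strings(datetime(2025, 3, 1, tzinfo=timezone.utc))
print(dates)
```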

Updating requirements and testing

In addition to possible package version changes, several core Airflow operators previously included in the airflow-core package moved to the apache-airflow-providers-standard package. These changes must be incorporated into your requirements.txt file. Specifying, or pinning, package versions in your requirements file is a best practice and recommended for this migration.

To update your requirements file, complete the following steps:

  1. Download and configure the Amazon MWAA Docker images. For more details, refer to the GitHub repo.
  2. Copy the current environment’s requirements.txt file to a new file.
  3. If needed, add the apache-airflow-providers-standard package to the new requirements file.
  4. Download the appropriate Airflow constraints file for your target Airflow version to your working directory. A constraints file is available for each Airflow version and Python version combination. The URL takes the following form:
    https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt
  5. Create your versioned requirements file using your un-versioned file and the constraints file. For guidance on creating a requirements file, see Creating a requirements.txt file. Make sure there are no dependency conflicts before moving forward.
  6. Verify your requirements file using the Docker image. Run the following command inside the running container:
    ./run.sh test-requirements

    Address any installation errors by updating package versions.
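The constraints URL and the pinning step can be sketched in a few lines of Python. The helpers `constraints_url` and `pin_requirements`, the version numbers, and the sample constraints text are illustrative assumptions; the real pins come from the constraints file you download. The name matching is a simple lowercase comparison and doesn’t normalize hyphens and underscores.

```python
CONSTRAINTS_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow_version}/constraints-{python_version}.txt"
)


def constraints_url(airflow_version, python_version):
    """Build the constraints file URL for an Airflow and Python version pair."""
    return CONSTRAINTS_URL_TEMPLATE.format(
        airflow_version=airflow_version, python_version=python_version
    )


def pin_requirements(requirement_names, constraints_text):
    """Pin each un-versioned requirement to the version listed in the constraints."""
    pinned_versions = {}
    for line in constraints_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a "name==version" pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pinned_versions[name.strip().lower()] = version.strip()

    pinned = []
    for name in requirement_names:
        version = pinned_versions.get(name.lower())
        # Leave packages that the constraints file does not cover un-versioned.
        pinned.append(f"{name}=={version}" if version else name)
    return pinned


# Example values only; substitute your target versions and the downloaded file.
url = constraints_url("3.0.6", "3.12")
sample_constraints = "# made-up sample\napache-airflow-providers-standard==1.0.0\nrequests==2.32.3\n"
pinned = pin_requirements(
    ["apache-airflow-providers-standard", "requests", "my-internal-lib"],
    sample_constraints,
)
print(url)
print("\n".join(pinned))
```

Packages left un-versioned in the output are the ones to pin manually before running the requirements test in the Docker image.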

As a best practice, we recommend bundling your Python dependencies into a ZIP file for deployment in Amazon MWAA. This ensures that exactly the same packages are installed on all Airflow nodes. Refer to Installing Python dependencies using PyPi.org Requirements File Format for detailed information about packaging dependencies.

Creating a new Amazon MWAA 3.x environment

Because Amazon MWAA requires a migration approach for major version upgrades, you must create a new environment for your blue/green deployment. This post uses the AWS Command Line Interface (AWS CLI) as an example; you can also use infrastructure as code (IaC).

  1. Create a new S3 bucket using the same structure as the current S3 bucket.
  2. Upload the updated requirements file and any plugin packages to the new S3 bucket.
  3. Generate a template for your new environment configuration:
    aws mwaa create-environment --generate-cli-skeleton > new-mwaa3-env.json

  4. Modify the generated JSON file:
    1. Copy configurations from your existing environment.
    2. Update the environment name.
    3. Set the AirflowVersion parameter to the target 3.x version.
    4. Update the S3 bucket properties with the new S3 bucket name.
    5. Review and update other configuration parameters as needed.

    Configure the new environment with the same networking settings, security groups, and IAM roles as your existing environment. Refer to the Amazon MWAA User Guide for these configurations.

  5. Create your new environment:
    aws mwaa create-environment --cli-input-json file://new-mwaa3-env.json
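The JSON edits in step 4 can also be scripted. The following sketch derives the new environment’s input from a dictionary shaped like the existing environment’s description. The helper `build_green_environment_config`, the sample values, and the target version string are assumptions for illustration. The logging configuration and S3 object versions are deliberately not carried over, because they should be reviewed and set for the new environment and bucket.

```python
import json

# Settings copied unchanged from the existing (blue) environment when present.
CARRIED_KEYS = (
    "DagS3Path",
    "RequirementsS3Path",
    "PluginsS3Path",
    "StartupScriptS3Path",
    "ExecutionRoleArn",
    "NetworkConfiguration",
    "EnvironmentClass",
    "MinWorkers",
    "MaxWorkers",
    "AirflowConfigurationOptions",
    "WebserverAccessMode",
)


def build_green_environment_config(blue_environment, name, airflow_version, source_bucket_arn):
    """Copy shared settings, then override the name, version, and S3 bucket."""
    config = {key: blue_environment[key] for key in CARRIED_KEYS if key in blue_environment}
    config["Name"] = name
    config["AirflowVersion"] = airflow_version
    config["SourceBucketArn"] = source_bucket_arn
    return config


# Hypothetical description of the existing environment.
blue_environment = {
    "Name": "old-mwaa2-env",
    "Arn": "arn:aws:airflow:us-east-1:111122223333:environment/old-mwaa2-env",
    "AirflowVersion": "2.10.3",
    "SourceBucketArn": "arn:aws:s3:::example-blue-bucket",
    "DagS3Path": "dags",
    "RequirementsS3Path": "requirements.txt",
    "ExecutionRoleArn": "arn:aws:iam::111122223333:role/example-mwaa-execution-role",
    "NetworkConfiguration": {
        "SubnetIds": ["subnet-aaaa", "subnet-bbbb"],
        "SecurityGroupIds": ["sg-cccc"],
    },
    "EnvironmentClass": "mw1.medium",
}

green_config = build_green_environment_config(
    blue_environment,
    name="new-mwaa3-env",
    airflow_version="3.0.6",  # example value; use your target 3.x version
    source_bucket_arn="arn:aws:s3:::example-green-bucket",
)
print(json.dumps(green_config, indent=2))
```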

Metadata migration

Your new environment requires the same variables, connections, roles, and pool configurations. Use this section as a guide for migrating this information. If you’re using AWS Secrets Manager as your secrets backend, you don’t need to migrate any connections. Depending on your environment’s size, you can migrate this metadata using the Airflow UI or the Apache Airflow REST API.

  1. Update any custom pool information in the new environment using the Airflow UI.
  2. For environments using the metadata database as a secrets backend, migrate all connections to the new environment.
  3. Migrate all variables to the new environment.
  4. Migrate any custom Airflow roles to the new environment.
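For larger environments, the variable migration can be prepared as a batch of REST API requests. The sketch below only builds the request descriptions from a dictionary of exported variables; it doesn’t send anything. The helper `build_variable_requests`, the sample variables, and the assumption that variables are created with a POST to /api/v2/variables carrying a key and a value should be checked against the REST API reference for your target version.

```python
import json


def build_variable_requests(exported_variables):
    """Turn a {key: value} export into one create-variable request per entry."""
    requests = []
    for key in sorted(exported_variables):
        value = exported_variables[key]
        # Variable values are sent as strings, so serialize structured values as JSON.
        if not isinstance(value, str):
            value = json.dumps(value)
        requests.append(
            {
                "method": "POST",
                "path": "/api/v2/variables",  # assumed endpoint; verify before use
                "body": {"key": key, "value": value},
            }
        )
    return requests


# Hypothetical variables exported from the existing environment.
exported_variables = {
    "target_bucket": "example-data-bucket",
    "retry_policy": {"retries": 3, "delay_seconds": 60},
}

variable_requests = build_variable_requests(exported_variables)
for request in variable_requests:
    print(request["method"], request["path"], request["body"]["key"])
```

Each request description can then be passed to whichever authenticated client you use to call the new environment’s REST API.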

Migration execution and validation

Plan and execute the transition from your old environment to the new one:

  1. Schedule the migration during a period of low workflow activity to minimize disruption.
  2. Implement a freeze window for DAG changes before and during the migration.
  3. Execute the migration in phases:
    1. Pause DAGs in the old environment. For a small number of DAGs, you can use the Airflow UI. For larger groups, consider using the REST API.
    2. Verify all running tasks have completed in the Airflow UI.
    3. Redirect DAG triggers and external integrations to the new environment.
    4. Copy the updated DAGs to the new environment’s S3 bucket.
    5. Enable DAGs in the new environment. For a small number of DAGs, you can use the Airflow UI. For larger groups, consider using the REST API.
  4. Monitor the new environment closely during the initial operation period:
    1. Watch for failed tasks or scheduling issues.
    2. Check for missing variables or connections.
    3. Verify external system integrations are functioning correctly.
    4. Monitor Amazon CloudWatch metrics to confirm the environment is performing as expected.
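When pausing and enabling larger groups of DAGs through the REST API, the order of operations matters more than the individual calls. The following sketch builds an ordered request plan in which every DAG in a wave is paused in the old environment before any of them is enabled in the new one. The helper `build_cutover_requests` and the DAG names are hypothetical, and the request shape (a PATCH on the DAG resource with an is_paused field, under /api/v1 for Airflow 2.x and /api/v2 for Airflow 3.x) is an assumption to verify against the API reference for each version.

```python
def build_cutover_requests(dag_ids):
    """Plan the cutover: pause everything in blue first, then unpause in green."""
    steps = []
    for dag_id in dag_ids:
        steps.append(
            {
                "environment": "blue",
                "method": "PATCH",
                "path": f"/api/v1/dags/{dag_id}",
                "body": {"is_paused": True},
            }
        )
    # In between, confirm running tasks have finished and copy the updated DAGs.
    for dag_id in dag_ids:
        steps.append(
            {
                "environment": "green",
                "method": "PATCH",
                "path": f"/api/v2/dags/{dag_id}",
                "body": {"is_paused": False},
            }
        )
    return steps


# Hypothetical wave of two DAGs.
cutover_steps = build_cutover_requests(["ingest_orders", "build_reports"])
for step in cutover_steps:
    print(step["environment"], step["method"], step["path"], step["body"])
```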

Post-migration validation

After the migration, thoroughly validate the new environment:

  • Verify that all DAGs are being scheduled correctly according to their defined schedules
  • Check that task history and logs are accessible and complete
  • Test critical workflows end-to-end to confirm they execute successfully
  • Validate connections to external systems are functioning properly
  • Monitor CloudWatch metrics for performance validation
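Part of this validation can be reduced to comparing inventories from the two environments. The sketch below reports items that exist in the old environment but not in the new one. The helper `find_missing_items` and the inventory contents are hypothetical; you would populate them from the documentation you gathered during the assessment phase.

```python
def find_missing_items(blue_inventory, green_inventory):
    """Return, per category, the items present in blue but absent from green."""
    missing = {}
    for category, blue_items in blue_inventory.items():
        green_items = set(green_inventory.get(category, []))
        gap = sorted(set(blue_items) - green_items)
        if gap:
            missing[category] = gap
    return missing


# Hypothetical inventories collected from each environment.
blue_inventory = {
    "dags": ["ingest_orders", "build_reports", "cleanup_logs"],
    "variables": ["target_bucket", "retry_policy"],
    "pools": ["default_pool", "reporting_pool"],
}
green_inventory = {
    "dags": ["ingest_orders", "build_reports", "cleanup_logs"],
    "variables": ["target_bucket"],
    "pools": ["default_pool"],
}

missing_items = find_missing_items(blue_inventory, green_inventory)
print(missing_items)
```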

Cleanup and documentation

When the migration is complete and the new environment is stable, complete the following steps:

  1. Document the changes made during the migration process.
  2. Update runbooks and operational procedures to reflect the new environment.
  3. After a sufficient stability period, defined by stakeholders, decommission the old environment:
    aws mwaa delete-environment --name old-mwaa2-env

  4. Archive backup data according to your organization’s retention policies.

Conclusion

The journey from Airflow 2.x to 3.x on Amazon MWAA is an opportunity to embrace next-generation workflow orchestration capabilities while maintaining the reliability of your workflow operations. By following these best practices and maintaining a methodical approach, you can successfully navigate this transition while minimizing risks and disruptions to your business operations.

A successful migration requires thorough preparation, systematic testing, and maintaining clear documentation throughout the process. Although the migration approach requires more initial effort, it provides the safety and control needed for such a significant upgrade.


About the authors

Anurag Srivastava

Anurag works as a Senior Technical Account Manager at AWS, specializing in Amazon MWAA. He’s passionate about helping customers build scalable data pipelines and workflow automation solutions on AWS.

Kamen Sharlandjiev

Kamen is a Sr. Big Data and ETL Solutions Architect, Amazon MWAA and AWS Glue ETL expert. He’s on a mission to make life easier for customers who are facing complex data integration and orchestration challenges. His secret weapon? Fully managed AWS services that can get the job done with minimal effort. Follow Kamen on LinkedIn to keep up to date with the latest Amazon MWAA and AWS Glue features and news!

Ankit Sahu

Ankit brings over 18 years of expertise in building innovative digital products and services. His diverse experience spans product strategy, go-to-market execution, and digital transformation initiatives. Currently, Ankit serves as Senior Product Manager at Amazon Web Services (AWS), where he leads the Amazon MWAA service.

Jeetendra Vaidya

Jeetendra is a Senior Solutions Architect at AWS, bringing his expertise to the realms of AI/ML, serverless, and data analytics domains. He is passionate about assisting customers in architecting secure, scalable, reliable, and cost-effective solutions.

Mike Ellis

Mike is a Senior Technical Account Manager at AWS and an Amazon MWAA specialist. In addition to assisting customers with Amazon MWAA, he contributes to the Airflow open source project.

Venu Thangalapally

Venu is a Senior Solutions Architect at AWS, based in Chicago, with deep expertise in cloud architecture, data and analytics, containers, and application modernization. He partners with financial service industry customers to translate business goals into secure, scalable, and compliant cloud solutions that deliver measurable value. Venu is passionate about using technology to drive innovation and operational excellence. Outside of work, he enjoys spending time with his family, reading, and taking long walks.

Introducing Apache Airflow 3 on Amazon MWAA: New features and capabilities

Post Syndicated from Anurag Srivastava original https://aws.amazon.com/blogs/big-data/introducing-apache-airflow-3-on-amazon-mwaa-new-features-and-capabilities/

Today, Amazon Web Services (AWS) announced the general availability of Apache Airflow 3 on Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This release transforms how organizations use Apache Airflow to orchestrate data pipelines and business processes in the cloud, bringing enhanced security, improved performance, and modern workflow orchestration capabilities to Amazon MWAA customers.

Amazon MWAA introduces Airflow 3 features that modernize workflow management for AWS customers. Following the April 2025 release of Airflow 3 by the Apache community, AWS has incorporated these capabilities into Amazon MWAA. Airflow now features a completely redesigned, intuitive UI that simplifies workflow orchestration for users across experience levels. With the Task Execution Interface (Task API), tasks can run both within Airflow and as standalone Python scripts, improving code portability and testing. Scheduler-managed Backfill moves operations from the CLI to the scheduler, providing centralized control and visibility through the Airflow UI. CLI security improvements replace direct database access with API calls, maintaining consistent security across interfaces. Airflow now supports event-driven workflows, enabling triggers from AWS services and external sources. Amazon MWAA also adds support for Python 3.12, bringing the latest language capabilities to workflow development.

This post explores the features of Airflow 3 on Amazon MWAA and outlines enhancements that improve your workflow orchestration capabilities. The service maintains the Amazon MWAA pay-as-you-go pricing model with no upfront commitments. You can begin immediately by visiting the Amazon MWAA console, launching new Apache Airflow environments through the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS CloudFormation, or AWS SDK within minutes.

Architectural advancements in Airflow 3 on Amazon MWAA

Airflow 3 on Amazon MWAA introduces significant architectural improvements that enhance security, performance, and flexibility. These advancements create a more robust foundation for workflow orchestration while maintaining backward compatibility with existing workflows.

Enhanced security

Amazon MWAA with Airflow 3 changes the security model by making component isolation a standard practice rather than optional. In Airflow 2, the DAG processor (the component that parses and processes DAG files) runs within the scheduler process by default, but can optionally be separated into its own process for better scalability and security isolation. Airflow 3 makes this separation standard, maintaining consistent security practices across deployments.

API server and Task API

Building on this security foundation, a new API server component is introduced in Amazon MWAA with Airflow 3, which serves as an intermediary between task instances and the Airflow metadata database. This change improves your workflows’ security posture by minimizing direct access to the Airflow metadata database from tasks. Tasks now operate with least privilege database access, reducing the risk of one task affecting others and improving overall system stability through fewer direct database connections.

The standardized communication through well-defined API endpoints creates a foundation for more secure, scalable, and flexible workflow orchestration. The Task Execution Interface (Task API) helps tasks run both within Airflow and as standalone Python scripts, improving code portability and testing capabilities.

From data-aware to event-driven scheduling

Airflow’s evolution toward event-driven scheduling began with the introduction of data-aware scheduling in Airflow 2.4, so DAGs could be triggered based on data availability rather than time schedules alone. Amazon MWAA with Airflow 3 builds on this foundation through a transition that includes the renaming of datasets to assets and introduces advanced capabilities, including asset partitions, external event integration, and asset-centric workflow design.

The transition from datasets to assets represents more than a simple rename. A data asset is a collection of logically related data that can represent diverse data products, including database tables, persisted ML models, embedded dashboards, or directories containing files.

Amazon MWAA with Airflow 3 introduces a new asset-centric syntax that represents an important shift in how workflows can be designed. The @asset decorator helps developers put data assets at the center of their workflow design, creating more intuitive asset-driven pipelines.

The following code is an example of asset-aware DAG scheduling:

from airflow.sdk import DAG, Asset
from airflow.providers.standard.operators.python import PythonOperator

# Define the asset
customer_data_asset = Asset(name="customer_data", uri="s3://my-bucket/customer-data.csv")

def process_customer_data():
    """Process customer data..."""
    # Implementation here

# Create the DAG and task
with DAG(dag_id="process_customer_data", schedule="@daily"):
    PythonOperator(
        task_id="process_data", 
        outlets=[customer_data_asset], 
        python_callable=process_customer_data
    )

The following code shows an asset-centric approach with the @asset decorator:

from airflow.sdk import asset

@asset(uri="s3://my-bucket/customer-data.csv", schedule="@daily")
def customer_data():
    """Process customer data..."""
    # Implementation here

The @asset decorator automatically creates an asset with the function name, a DAG with the same identifier, and a task that produces the asset. This reduces code complexity and facilitates automatic DAG creation, where each asset becomes a self-contained workflow unit.

External event-driven scheduling with Asset Watchers

A significant advancement in Amazon MWAA with Airflow 3 is the introduction of Asset Watchers, which help Airflow react to events happening outside of the Airflow system itself. Whereas previous versions supported internal cross-DAG dependencies, Asset Watchers extend this capability to external data systems and message queues through the AssetWatcher class.

Amazon MWAA with Airflow 3 includes support for Amazon Simple Queue Service (Amazon SQS) through Asset Watchers. This allows your workflows to be triggered by external messages and facilitates more event-driven scheduling. Asset Watchers monitor external systems asynchronously and trigger workflow execution when specific events occur, enabling workflows to respond to business events, data updates, or system notifications without the overhead of traditional sensor-based polling mechanisms.

Modern React-based UI

Amazon MWAA with Airflow 3 features a completely redesigned, intuitive UI built with React and FastAPI that simplifies workflow orchestration for users across experience levels. The new interface provides more intuitive navigation and workflow visualization, with an enhanced grid view that offers better visibility into task status and history. Users will appreciate the addition of dark mode support, which reduces eye strain during extended use, and the overall faster performance that’s especially noticeable when working with large DAGs.

The new UI maintains familiar workflows while providing a more modern and efficient experience for DAG management and monitoring, making daily operations more productive for both developers and operators. The legacy UI has been completely removed, offering a cleaner, more consistent experience across the system. The foundation for the new UI is built on REST APIs and a set of internal APIs for UI operations, both of which are now based on FastAPI, creating a more cohesive and secure architecture for both programmatic access and UI operations.

Scheduler optimizations

Amazon MWAA with Airflow 3’s enhanced scheduler delivers performance improvements for task execution and workflow management. The redesigned scheduling engine processes tasks more efficiently, reducing the time between task submissions and executions. This optimization benefits data pipeline operations that require rapid task processing and timely workflow completion.

The scheduler now manages computing resources more effectively, enabling stable performance even as workloads scale. When running multiple DAGs simultaneously, the improved resource allocation system helps prevent bottlenecks and maintains consistent execution speeds. This advancement is particularly useful for organizations running complex workflows with varying resource requirements. The new scheduler also handles concurrent operations with increased precision, so teams can run multiple DAG instances simultaneously while maintaining system stability and predictable performance.

Enhanced scheduler backfill operations

Scheduler-managed backfills move backfill operations (running DAGs for historical dates) from the CLI to the scheduler, providing centralized control and visibility through the Airflow UI. Amazon MWAA with Airflow 3 delivers important upgrades to the scheduler’s backfill capabilities, helping data teams process historical data more efficiently. The backfill process has been optimized to reduce database load, so backfills complete more quickly and have less impact on near real-time workflow execution.

Amazon MWAA with Airflow 3 also improves the management of backfill operations, with the scheduler providing better isolation between backfill jobs and supporting more efficient processing of historical datasets. Operators now have better monitoring tools to track the progress and status of their backfill jobs, resulting in more effective management of these critical data processing tasks.

Developer-focused improvements

Airflow 3 on Amazon MWAA delivers several enhancements designed to improve the developer experience, from simplified task definition to better workflow management capabilities.

Task SDK

The Task SDK provides a more intuitive way to define tasks and DAGs:

# Example using the Task SDK
from airflow.sdk import dag, task
from datetime import datetime

@dag(
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False
)
def modern_etl_workflow():
    
    @task
    def extract():
        # Extract data from source
        return {"data": [1, 2, 3, 4, 5]}
    
    @task
    def transform(input_data):
        # Transform the data
        return [x * 10 for x in input_data]
    
    @task
    def load(transformed_data):
        # Load data to destination
        print(f"Loading data: {transformed_data}")
    
    # Define the workflow
    extracted_data = extract()
    transformed_data = transform(extracted_data["data"])
    load(transformed_data)

# Instantiate the DAG
etl_dag = modern_etl_workflow()

This approach offers more intuitive data flow between tasks, better integrated development environment (IDE) support with improved type hinting, and more straightforward unit testing of task logic. The result is cleaner, more maintainable code that better represents the actual data flow of your pipelines. Teams adopting this pattern often find their DAGs become more readable and simpler to maintain over time, especially as workflows grow in complexity.
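One way to get the unit-testing benefit is to keep each task body in a plain Python function and have the decorated task call it. The following is a minimal sketch under that assumption. The helper names extract_records and transform_records are hypothetical and not part of the example above, and Airflow is deliberately not imported so the test runs anywhere.

```python
# Hypothetical helpers: plain functions that hold the task logic.
def extract_records():
    """Return the raw payload, mirroring extract() in the DAG above."""
    return {"data": [1, 2, 3, 4, 5]}


def transform_records(input_data):
    """Multiply every value by 10, mirroring transform() in the DAG above."""
    return [x * 10 for x in input_data]


def test_transform_records():
    # The logic is exercised directly; no scheduler or DAG run is needed.
    raw = extract_records()
    assert transform_records(raw["data"]) == [10, 20, 30, 40, 50]
    assert transform_records([]) == []


test_transform_records()
```

Inside the DAG, the @task-decorated transform function would then simply return transform_records(input_data), so the same logic is covered by the test and by the pipeline.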

DAG versioning

Amazon MWAA with Airflow 3 includes the basic DAG versioning capabilities that Airflow 3 provides by default. Each time a DAG is modified and deployed, Airflow serializes and stores the DAG definition to preserve history. This automatic version tracking minimizes the need for manual record-keeping and ensures every modification is documented.

Through the Airflow UI, teams can access and review the history of their DAGs. This visual representation shows version numbers (v1, v2, v3, etc.) and helps teams understand how their workflows have evolved over time.

The DAG versioning supported in Amazon MWAA lets you see, in the Airflow UI, the different DAG versions that were run, offering improved workflow visibility and enhanced collaboration for data engineering teams managing complex, evolving data pipelines.
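To make the idea of serialize-and-store version tracking concrete, the following toy model records a new version only when the serialized definition changes. It is an illustration of the concept, not Airflow's actual implementation. The VersionHistory class and the dictionary layout are hypothetical.

```python
import hashlib
import json


class VersionHistory:
    """Toy model: store a new version only when the serialized DAG changes."""

    def __init__(self):
        self.versions = []  # (version_number, content_hash, serialized)

    def record(self, dag_definition):
        # Serialize deterministically so identical definitions hash identically.
        serialized = json.dumps(dag_definition, sort_keys=True)
        digest = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
        if self.versions and self.versions[-1][1] == digest:
            return self.versions[-1][0]  # unchanged: keep the current version
        number = len(self.versions) + 1
        self.versions.append((number, digest, serialized))
        return number


history = VersionHistory()
v_first = history.record({"dag_id": "etl", "tasks": ["extract", "load"]})
# Same content with a different key order is not a new version.
v_same = history.record({"tasks": ["extract", "load"], "dag_id": "etl"})
v_next = history.record({"dag_id": "etl", "tasks": ["extract", "transform", "load"]})
print(f"v{v_first}, v{v_same}, v{v_next}")  # prints "v1, v1, v2"
```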

Python 3.12 support

Amazon MWAA adds support for Python 3.12, bringing the latest language capabilities to workflow development. This upgrade provides access to the latest Python language improvements, performance enhancements, and library updates, keeping your data pipelines modern and efficient.

Features not currently supported in Amazon MWAA

Although we are launching most of the Airflow 3 features on Amazon MWAA in this release, some features are not supported at this time:

  • DAG versioning (AIP-63) – Advanced versioning features beyond basic version tracking
  • Replace Flask AppBuilder (AIP-79) – Full replacement capabilities
  • Edge Executor and task isolations (AIP-69) – Remote execution capabilities
  • Multi-language support (AIP-72) – Support for languages other than Python

We plan to support these features in subsequent versions of Airflow on Amazon MWAA.

Conclusion

Airflow 3 on Amazon MWAA delivers enhanced workflow automation capabilities. The architectural improvements, enhanced security model, and developer-friendly features provide a solid foundation for building more reliable and maintainable data pipelines. The introduction of Asset Watchers changes how workflows can respond to external events, enabling truly event-driven scheduling. This capability, combined with the new asset-centric workflow design, makes Airflow 3 a more powerful and flexible orchestration service.

The scheduler optimizations deliver performance improvements for task execution and workflow management, and the enhanced backfill capabilities make historical data processing more efficient. The DAG versioning system improves workflow stability and collaboration, and Python 3.12 support keeps your data pipelines modern and efficient.

Organizations can now take advantage of these new features and improvements in Airflow 3 on Amazon MWAA to enhance their workflow orchestration capabilities. To get started, visit the Amazon MWAA product page.


About the authors

Anurag Srivastava works as a Senior Big Data Cloud Engineer at Amazon Web Services (AWS), specializing in Amazon MWAA. He’s passionate about helping customers build scalable data pipelines and workflow automation solutions on AWS.

Kamen Sharlandjiev is a Sr. Big Data and ETL Solutions Architect, Amazon MWAA and AWS Glue ETL expert. He’s on a mission to make life easier for customers who are facing complex data integration and orchestration challenges. His secret weapon? Fully managed AWS services that can get the job done with minimal effort. Follow Kamen on LinkedIn to keep up to date with the latest Amazon MWAA and AWS Glue features and news!

Ankit Sahu brings over 18 years of expertise in building innovative digital products and services. His diverse experience spans product strategy, go-to-market execution, and digital transformation initiatives. Currently, Ankit serves as Senior Product Manager at Amazon Web Services (AWS), where he leads the Amazon MWAA service.

Mohammad Sabeel works as a Senior Cloud Support Engineer at Amazon Web Services (AWS), specializing in AWS Analytics services including AWS Glue, Amazon MWAA, and Amazon Athena. With over 14 years of IT experience, he’s passionate about helping customers build scalable data processing pipelines and optimize their analytics solutions on AWS.

Satya Chikkala is a Solutions Architect at Amazon Web Services. Based in Melbourne, Australia, he works closely with enterprise customers to accelerate their cloud journey. Beyond work, he is very passionate about nature and photography.

Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backward from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data system transformations across industry verticals. His core areas of expertise include technology strategy, data analytics, and data science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.

Best practices for upgrading Amazon MWAA environments

Post Syndicated from Anurag Srivastava original https://aws.amazon.com/blogs/big-data/best-practices-for-upgrading-amazon-mwaa-environments/

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) has become a cornerstone for organizations embracing data-driven decision-making. As a scalable solution for managing complex data pipelines, Amazon MWAA enables seamless orchestration across AWS services and on-premises systems. Although AWS manages the underlying infrastructure, you must carefully plan and execute your Amazon MWAA environment updates according to the shared responsibility model. Upgrading to the latest Amazon MWAA version can provide significant advantages, including enhanced security through critical security patches and potential improvements in performance with faster DAG parsing and reduced database load. You can use advanced features while maintaining ecosystem compatibility and receiving prioritized AWS support. The key to successful upgrades lies in choosing the right solution and following a methodical implementation approach.

In this post, we explore best practices for upgrading your Amazon MWAA environment and provide a step-by-step guide to seamlessly transition to the latest version.

Solution overview

Amazon MWAA provides two primary upgrade solutions:

  • In-place upgrade – This method works best when you can accommodate planned downtime. You deploy the new version directly on your existing infrastructure. In-place version upgrades on Amazon MWAA are supported for environments running Apache Airflow version 2.x and later. However, if you’re running version 1.10.z or older versions, you must create a new environment and migrate your resources, because these versions don’t support in-place upgrades.
  • Cutover upgrade – This method helps minimize disruption to production environments. You create a new Amazon MWAA environment with the target version and then transition from your old environment to the new one.

Each solution offers a different approach to help you upgrade while working to maintain data integrity and system reliability.

In-place upgrade

In-place upgrades work well for environments where you can schedule a maintenance window for the upgrade process. During this window, Amazon MWAA preserves your workflow history. This approach helps maintain historical data, provides a straightforward upgrade process, and includes rollback capabilities if issues occur during provisioning. You also use fewer resources because you don’t need to create a new environment.

You can perform in-place upgrades through the AWS Management Console with a single operation. This process helps reduce operational overhead by managing many upgrade steps for you.

During the upgrade process, your environment can’t schedule or run new tasks. Amazon MWAA helps manage the upgrade process and implements safety measures—if issues occur during the provisioning phase, the service attempts to revert to the previous stable version.

Before you begin an in-place upgrade, we recommend testing your DAGs for compatibility with the target version, because DAG compatibility issues can affect the upgrade process. You can use the Amazon MWAA local runner to test DAG compatibility before you start the upgrade. You can start the upgrade either on the console, by specifying the new version, or with the AWS Command Line Interface (AWS CLI). The following is an example Amazon MWAA upgrade command using the AWS CLI:

aws mwaa update-environment --name <value> --airflow-version <value>

The following diagram shows the Amazon MWAA in-place upgrade workflow and states.

In-place upgrade workflow and states

Refer to Introducing in-place version upgrades with Amazon MWAA for more details.
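The same upgrade request can be scripted. The following is a minimal sketch, assuming a client object that behaves like boto3.client("mwaa"). The function name start_in_place_upgrade and the pre-checks are illustrative choices, not part of the service workflow, and the environment name and version in the usage comment are placeholders.

```python
def start_in_place_upgrade(client, env_name, target_version):
    """Request an in-place upgrade and return the environment ARN.

    `client` is expected to behave like boto3.client("mwaa").
    """
    current = client.get_environment(Name=env_name)["Environment"]
    # Skip the call when there is nothing to upgrade.
    if current["AirflowVersion"] == target_version:
        raise ValueError(f"{env_name} already runs Airflow {target_version}")
    # Only start from a stable environment.
    if current["Status"] != "AVAILABLE":
        raise RuntimeError(f"{env_name} is {current['Status']}, not AVAILABLE")
    response = client.update_environment(
        Name=env_name, AirflowVersion=target_version
    )
    return response["Arn"]


# Usage (placeholders; requires boto3 and AWS credentials):
#   import boto3
#   arn = start_in_place_upgrade(boto3.client("mwaa"), "my-environment", "2.10.3")
```

Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real environment.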

Cutover upgrade

A cutover upgrade provides an alternative solution when you need to minimize downtime, though it requires more manual steps and operational planning. With this approach, you create a new Amazon MWAA environment, migrate your metadata, and manage the transition between environments. Although this method offers more control over the upgrade process, it requires additional planning and execution effort compared to an in-place upgrade.

This method can work well for environments with complex workflows, particularly when you plan to make significant changes alongside the version upgrade. The approach offers several benefits: you can minimize production downtime, perform comprehensive testing before switching environments, and maintain the ability to return to your original environment if needed. You can also review and update your configurations during the transition.

Consider the following aspects of the cutover approach. When you run two environments simultaneously, you pay for both environments. The pricing for each Amazon MWAA environment depends on:

  • Duration of environment uptime (billed hourly with per-second resolution)
  • Environment size configuration
  • Automatic scaling capacity for workers
  • Scheduler capacity

AWS calculates the cost of additional automatic scaled workers separately. You can estimate costs for your specific configuration using the AWS Pricing Calculator.
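As a rough worked example of the overlap cost, the following sketch adds the base environment charge to the charge for additional workers over the period when both environments run. The rates are placeholder values, not AWS prices, and the function covers only the base environment and additional workers (additional schedulers would be a further term).

```python
def parallel_run_cost(env_hourly_rate, extra_worker_hourly_rate,
                      avg_extra_workers, overlap_hours):
    """Estimate the added cost of keeping a second environment running."""
    base = env_hourly_rate * overlap_hours
    workers = extra_worker_hourly_rate * avg_extra_workers * overlap_hours
    return round(base + workers, 2)


# Placeholder rates for illustration only; look up real prices for your
# Region and environment class in the AWS Pricing Calculator.
estimate = parallel_run_cost(
    env_hourly_rate=1.00,
    extra_worker_hourly_rate=0.10,
    avg_extra_workers=2,
    overlap_hours=72,
)
print(estimate)  # 72 h * 1.00 + 72 h * 2 * 0.10 = 86.4
```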

To help prevent data duplication or corruption during parallel operation, we recommend implementing idempotent DAGs. The Airflow scheduler automatically populates some metadata tables (dag, dag_tag, and dag_code) in your new environment. However, you need to plan the migration of the following additional metadata components:

  • DAG history
  • Variables
  • Slot pool configurations
  • SLA miss records
  • XCom data
  • Job records
  • Log tables

You can choose this approach when your requirements prioritize minimal downtime and you can manage the additional operational complexity.
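The idempotency recommendation can be illustrated with a load step that replaces the data for a given logical date instead of appending to it. This is a minimal sketch: the dictionary stands in for a date-partitioned table, and load_partition is a hypothetical helper.

```python
def load_partition(store, logical_date, rows):
    """Idempotent load: replace the partition for this logical date.

    Re-running the same date overwrites instead of appending, so a run
    executed by both environments leaves a single copy of the data.
    """
    store[logical_date] = list(rows)
    return len(store[logical_date])


warehouse = {}  # stand-in for a table partitioned by date
load_partition(warehouse, "2024-01-01", [{"id": 1}, {"id": 2}])
load_partition(warehouse, "2024-01-01", [{"id": 1}, {"id": 2}])  # duplicate run
print(len(warehouse["2024-01-01"]))  # prints 2, not 4
```

An append-based load would have produced four rows after the duplicate run, which is the kind of duplication that parallel environments can otherwise cause.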

The cutover upgrade process involves three main steps: creating a new environment, restoring it with the existing data, and performing the upgrade. The following diagram illustrates the full workflow.

Cutover upgrade steps

In the following sections, we walk through the key steps to perform a cutover upgrade.

Prerequisites

Before you begin the upgrade process, complete the following steps:

Create a new environment

Complete the following steps to create a new environment:

  • Generate a template for your new environment configuration using the AWS CLI:

aws mwaa create-environment --generate-cli-skeleton > <new-env-name>.json

  • Modify the generated JSON file:
    • Copy configurations from your backup file <env-name>.json to <new-env-name>.json.
    • Update the environment name.
    • Keep the AirflowVersion parameter value from your existing environment.
    • Review and update other configuration parameters as needed.
  • Create your new environment:

aws mwaa create-environment --cli-input-json file://<new-env-name>.json
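Copying the configuration by hand is error-prone, so the step can be sketched in code. The following assumes the backup file holds a saved get-environment response (with a top-level Environment key), which is an assumption about how the backup was taken. The build_create_request helper and the sample values are hypothetical. LoggingConfiguration is left out on purpose because its shape in the response differs from the create input and needs to be set by hand.

```python
import json

# Copy only settings that create-environment accepts; response-only fields
# such as Arn, Status, and CreatedAt are dropped. This allowlist is a subset.
CREATE_KEYS = (
    "ExecutionRoleArn", "SourceBucketArn", "DagS3Path", "RequirementsS3Path",
    "PluginsS3Path", "NetworkConfiguration", "EnvironmentClass", "MinWorkers",
    "MaxWorkers", "Schedulers", "AirflowVersion", "AirflowConfigurationOptions",
    "WebserverAccessMode", "WeeklyMaintenanceWindowStart", "KmsKey", "Tags",
)


def build_create_request(backup, new_name):
    """Build create-environment input from a saved get-environment response."""
    source = backup.get("Environment", backup)
    request = {key: source[key] for key in CREATE_KEYS if key in source}
    request["Name"] = new_name  # new name, same AirflowVersion as the source
    return request


# Sample backup with made-up values for illustration.
backup = {
    "Environment": {
        "Name": "old-env",
        "Arn": "arn:aws:airflow:us-east-1:111122223333:environment/old-env",
        "Status": "AVAILABLE",
        "AirflowVersion": "2.8.1",
        "EnvironmentClass": "mw1.small",
        "DagS3Path": "dags",
    }
}
request = build_create_request(backup, "new-env")
print(json.dumps(request, indent=2, sort_keys=True))
```

Writing the printed JSON to <new-env-name>.json gives a file that can be passed to the create-environment command after review.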

Restore the new environment

Complete the following steps to restore the new environment:

  • Use the mwaa-dr PyPI package to create and run the restore DAG.
  • This process copies metadata from your S3 backup bucket to the new environment.
  • Verify that your new environment contains the expected metadata from your original environment.

Perform the version upgrade

Complete the following steps to perform the version upgrade:

  • Upgrade your environment:

aws mwaa update-environment --name <new-env-name> --airflow-version <target-version>

  • Monitor the upgrade:
    • Track the environment status on the console.
    • Watch for error messages or warnings.
    • Verify the environment reaches the AVAILABLE state.
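The monitoring steps can also be automated with a simple polling loop. The following is a minimal sketch, assuming a client that behaves like boto3.client("mwaa"). The function name, the polling interval, and the choice of failure states are illustrative assumptions.

```python
import time

# Statuses treated as terminal failures in this sketch.
FAILURE_STATES = {"UPDATE_FAILED", "CREATE_FAILED", "UNAVAILABLE"}


def wait_until_available(client, env_name, poll_seconds=60, max_polls=120,
                         sleep=time.sleep):
    """Poll the environment status until it is AVAILABLE.

    `client` is expected to behave like boto3.client("mwaa").
    Returns the distinct statuses observed, in order.
    """
    seen = []
    for _ in range(max_polls):
        status = client.get_environment(Name=env_name)["Environment"]["Status"]
        if not seen or seen[-1] != status:
            seen.append(status)  # record each status change once
        if status == "AVAILABLE":
            return seen
        if status in FAILURE_STATES:
            raise RuntimeError(f"{env_name} entered {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"{env_name} not AVAILABLE after {max_polls} polls")


# Usage (placeholder name; requires boto3 and AWS credentials):
#   import boto3
#   wait_until_available(boto3.client("mwaa"), "new-env")
```

Because the service can revert to the previous version when provisioning fails, it is worth also confirming the reported Airflow version after the environment becomes available.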

Plan your transition timing carefully. If your original environment continues to process workflows during this upgrade, its metadata can diverge from the metadata restored into the new environment.

Clean up

After you verify the stability of your upgraded environment through monitoring, you can begin the cleanup process:

  • Remove your original Amazon MWAA environment using the AWS CLI command:

aws mwaa delete-environment --name <old-env-name>

  • Clean up your associated resources by removing unused backup data from S3 buckets, deleting temporary AWS Identity and Access Management (IAM) roles and policies created for the upgrade, and updating your DNS or routing configurations.

Before removing any resources, make sure you follow your organization’s backup retention policies, maintain necessary backup data for your compliance requirements, and document configuration changes made during the upgrade.

This approach helps you perform a controlled upgrade with opportunities for testing and the ability to return to your original environment if needed.

Monitoring and validation

You can track your upgrade progress using Amazon CloudWatch metrics, with a focus on DAG processing metrics and scheduler heartbeat. Your environment transitions through several states during the upgrade process, including UPDATING and CREATING. When your environment shows the AVAILABLE state, you can begin validation testing. We recommend checking system accessibility, testing critical workflow operations, and verifying external connections. For detailed monitoring guidance, see Monitoring and metrics for Amazon Managed Workflows for Apache Airflow.

Key considerations

Consider using infrastructure as code (IaC) practices to help maintain consistent environment management and support repeatable deployments. Schedule metadata backups using mwaa-dr during periods of low activity to help protect your data. When designing your workflows, implement idempotent pipelines to help manage potential interruptions, and maintain documentation of your configurations and dependencies.

Conclusion

A successful Amazon MWAA upgrade starts with selecting an approach that aligns with your operational requirements. Whether you choose an in-place or cutover upgrade, thorough preparation and testing help support a controlled transition. Using available tools, monitoring capabilities, and recommended practices can help you upgrade to the latest Amazon MWAA features while working to maintain your workflow operations.

For additional details and code examples on Amazon MWAA, refer to the Amazon MWAA User Guide and Amazon MWAA examples GitHub repo.

Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.


About the Authors

Anurag Srivastava works as a Senior Big Data Cloud Engineer at Amazon Web Services (AWS), specializing in Amazon MWAA. He’s passionate about helping customers build scalable data pipelines and workflow automation solutions on AWS.

Sriharsh Adari is a Senior Solutions Architect at Amazon Web Services (AWS), where he helps customers work backwards from business outcomes to develop innovative solutions on AWS. Over the years, he has helped multiple customers on data platform transformations across industry verticals. His core areas of expertise include Technology Strategy, Data Analytics, and Data Science. In his spare time, he enjoys playing sports, binge-watching TV shows, and playing Tabla.

Venu Thangalapally is a Senior Solutions Architect at AWS, based in Chicago, with deep expertise in cloud architecture, data and analytics, containers, and application modernization. He partners with Financial Services industry customers to translate business goals into secure, scalable, and compliant cloud solutions that deliver measurable value. Venu is passionate about leveraging technology to drive innovation and operational excellence. Outside of work, he enjoys spending time with his family, reading, and taking long walks.

Chandan Rupakheti is a Senior Solutions Architect at AWS. His main focus at AWS lies in the intersection of analytics, serverless, and AdTech services. He is a passionate technical leader, researcher, and mentor with a knack for building innovative solutions in the cloud. Outside of his professional life, he loves spending time with his family and friends, and listening to and playing music.