GitHub - aws/amazon-mwaa-docker-images

aws-mwaa-docker-images

Overview

This repository contains the Docker Images that Amazon MWAA uses to run Airflow.

You can also use it locally if you want to run an MWAA-like environment for testing, experimentation, and development purposes.

Currently, Airflow v2.9.2 and above are supported. Future versions in parity with Amazon MWAA will be added as well. Note, however, that we do not plan to support the earlier Airflow versions supported by MWAA.

Using the Airflow Image

To experiment with the image using a vanilla Docker setup, follow these steps:

  1. (Prerequisites) Ensure you have:
  2. Clone this repository.
  3. This repository makes use of Python virtual environments. To create them, from the root of the package, execute the following command:
python3 create_venvs.py --target <development | production>
  4. Build a supported Airflow version Docker image:
    • cd <amazon-mwaa-docker-images path>/images/airflow/2.9.2
    • Update the run.sh file with your account ID, environment name, account credentials, and API server URL (http://host_name:8080). The permissions associated with the provided credentials will be assigned to the Airflow components started in the next step. If you receive an error message indicating a lack of permissions, grant the missing permissions to the identity whose credentials you used.
    • Run ./run.sh. This builds and runs all the necessary containers and automatically creates the following CloudWatch log groups:
      * {ENV_NAME}-DAGProcessing
      * {ENV_NAME}-Scheduler
      * {ENV_NAME}-Worker
      * {ENV_NAME}-Task
      * {ENV_NAME}-WebServer
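
As a quick reference for the naming pattern above, the following minimal sketch derives the five log group names from an environment name. The helper `expected_log_groups` and the sample name `my-env` are hypothetical and only illustrate the `{ENV_NAME}-<Component>` pattern; the code does not call AWS or create anything.

```python
# Hypothetical helper: builds the log group names listed in this README
# from an environment name. It only formats strings; it does not call AWS.
LOG_GROUP_SUFFIXES = ["DAGProcessing", "Scheduler", "Worker", "Task", "WebServer"]


def expected_log_groups(env_name: str) -> list[str]:
    """Return the names that follow the {ENV_NAME}-<Component> pattern."""
    return [f"{env_name}-{suffix}" for suffix in LOG_GROUP_SUFFIXES]


if __name__ == "__main__":
    # "my-env" is a placeholder environment name.
    for name in expected_log_groups("my-env"):
        print(name)
```

A list like this can be handy when checking in the CloudWatch console that every group was created for your environment.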

Airflow should be up and running now. You can access the web server on your localhost on port 8080.

Authentication from version 3.0.1 onward

For environments created using this repository starting with version 3.0.1, we default to using SimpleAuthManager, which is also the default auth manager in Airflow 3.0.0+. By default, SIMPLE_AUTH_MANAGER_ALL_ADMINS is set to true, which means no username/password is required, and all users will have admin access. You can specify users and roles using the SIMPLE_AUTH_MANAGER_USERS environment variable in the format:

username:role[,username2:role2,...]
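
To make the format concrete, here is a minimal sketch that splits such a value into username and role pairs. The function `parse_simple_auth_users` is hypothetical and is not Airflow's own parser; the user and role names in the usage line are placeholder values, so check the Airflow documentation for the roles your version accepts.

```python
# Hypothetical parser for the "username:role[,username2:role2,...]" format.
# This illustrates the shape of the value only; Airflow parses it internally.
def parse_simple_auth_users(value: str) -> dict[str, str]:
    users: dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty segments such as a trailing comma
        username, sep, role = entry.partition(":")
        username, role = username.strip(), role.strip()
        if not sep or not username or not role:
            raise ValueError(f"expected 'username:role', got {entry!r}")
        users[username] = role
    return users


if __name__ == "__main__":
    # Placeholder users and roles, for illustration only.
    print(parse_simple_auth_users("alice:admin,bob:viewer"))
```

Validating the value before exporting it can catch a missing colon or role early, instead of at webserver startup.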

To enforce authentication with explicit user passwords and roles, set:

SIMPLE_AUTH_MANAGER_ALL_ADMINS=false

In this mode, a password is automatically generated for each user and printed in the webserver logs as soon as the webserver starts.

Generated Docker Images

When you build the Docker images of a certain Airflow version, using either build.sh or run.sh (which automatically calls build.sh for you), multiple Docker images are generated. For example, for Airflow 2.9, you will notice the following images:

Repository                          Tag
amazon-mwaa-docker-images/airflow   2.9.2
amazon-mwaa-docker-images/airflow   2.9.2-dev
amazon-mwaa-docker-images/airflow   2.9.2-explorer
amazon-mwaa-docker-images/airflow   2.9.2-explorer-dev
amazon-mwaa-docker-images/airflow   2.9.2-explorer-privileged
amazon-mwaa-docker-images/airflow   2.9.2-explorer-privileged-dev
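
The six tags above look like every combination of an optional variant suffix and an optional -dev suffix. The sketch below reproduces the list under that assumption; the helper `image_tags` is hypothetical and is not part of build.sh, which may compose tags differently.

```python
from itertools import product

# Assumption: tags combine one variant suffix with an optional "-dev" suffix,
# as inferred from the tags listed in this README.
VARIANT_SUFFIXES = ["", "-explorer", "-explorer-privileged"]
MODE_SUFFIXES = ["", "-dev"]


def image_tags(version: str) -> list[str]:
    """Return every variant/mode tag combination for an Airflow version."""
    return [
        f"{version}{variant}{mode}"
        for variant, mode in product(VARIANT_SUFFIXES, MODE_SUFFIXES)
    ]


if __name__ == "__main__":
    for tag in image_tags("2.9.2"):
        print(f"amazon-mwaa-docker-images/airflow:{tag}")
```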

Each of the postfixes added to the image tag represents a certain build type, as explained below:

Extra commands

Requirements

./run.sh test-requirements

Startup script

./run.sh test-startup-script

Reset database

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.