Installation — Airflow Documentation

Getting Airflow

Airflow is published as the apache-airflow package in PyPI. Installing it, however, can sometimes be tricky because Airflow is a bit of both a library and an application. Libraries usually keep their dependencies open and applications usually pin them, but we should do neither and both at the same time. We decided to keep our dependencies as open as possible (in setup.py) so users can install different versions of libraries if needed. This means that from time to time a plain pip install apache-airflow will not work or will produce an unusable Airflow installation.

In order to have a repeatable installation, however, starting from Airflow 1.10.10 and updated in Airflow 1.10.12 we also keep a set of “known-to-be-working” constraint files in the constraints-master and constraints-1-10 orphan branches. Those “known-to-be-working” constraints are per major/minor Python version. You can use them as constraint files when installing Airflow from PyPI. Note that you have to specify the correct Airflow version and Python version in the URL.

Prerequisites

On Debian-based Linux OS:

sudo apt-get update
sudo apt-get install build-essential

  1. Installing just Airflow

pip install \
 apache-airflow==1.10.12 \
 --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"

  2. Installing with extras (for example postgres, gcp)

pip install \
 apache-airflow[postgres,gcp]==1.10.12 \
 --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"

You need certain system-level requirements in order to install Airflow. These are the requirements known to be needed on Linux systems (tested on Ubuntu Buster LTS):

sudo apt-get install -y --no-install-recommends \
  freetds-bin \
  krb5-user \
  ldap-utils \
  libffi6 \
  libsasl2-2 \
  libsasl2-modules \
  libssl1.1 \
  locales \
  lsb-release \
  sasl2-bin \
  sqlite3 \
  unixodbc

You also need database client packages (Postgres or MySQL) if you want to use those databases.
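
For example, on a Debian-based system the client libraries could be installed roughly like this; the package names below are assumptions for Debian/Ubuntu, so check your distribution's repositories:

# Assumed Debian/Ubuntu package names; adjust for your distribution.
sudo apt-get install -y --no-install-recommends postgresql-client libpq-dev
sudo apt-get install -y --no-install-recommends default-mysql-client default-libmysqlclient-dev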

If the airflow command is not getting recognized (can happen on Windows when using WSL), then ensure that ~/.local/bin is in your PATH environment variable, and add it in if necessary:
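# One possible way, for the current shell session (add the line to ~/.bashrc or ~/.profile to make it permanent):
export PATH="$PATH:$HOME/.local/bin"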

Initializing Airflow Database

Airflow requires a database to be initialized before you can run tasks. If you’re just experimenting and learning Airflow, you can stick with the default SQLite option. If you don’t want to use SQLite, then take a look at Initializing a Database Backend to set up a different database.

After configuration, you’ll need to initialize the database before you can run tasks:
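# For Airflow 1.10.x the metadata database is initialized with:
airflow initdb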