Installation - Airflow Documentation
Getting Airflow
Airflow is published as the apache-airflow package on PyPI. Installing it can sometimes be tricky, however, because Airflow is a bit of both a library and an application. Libraries usually keep their dependencies open and applications usually pin them, but Airflow should do neither and both at the same time. We decided to keep our dependencies as open as possible (in setup.py) so that users can install different versions of libraries if needed. This means that from time to time a plain pip install apache-airflow will not work or will produce an unusable Airflow installation.
To make installations repeatable, however, starting from Airflow 1.10.10 (and updated in Airflow 1.10.12) we also keep a set of “known-to-be-working” constraint files in the constraints-master and constraints-1-10 orphan branches. These constraint files are maintained per major/minor Python version. You can use them as constraint files when installing Airflow from PyPI. Note that you have to specify the correct Airflow version and Python version in the URL.
Prerequisites
On a Debian-based Linux OS:
sudo apt-get update
sudo apt-get install build-essential
- Installing just Airflow
pip install \
  apache-airflow==1.10.12 \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"
- Installing with extras (for example postgres, gcp)
pip install \
  apache-airflow[postgres,gcp]==1.10.12 \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"
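Both commands hard-code the Airflow version and the Python version in the constraint URL, and the two must match what you actually install. The following is a minimal sketch that builds the URL from two values, assuming the URL pattern used in the commands here also holds for the version you choose. The names constraint_url, AIRFLOW_VERSION, PYTHON_VERSION and CONSTRAINT_URL are illustrative and not part of Airflow.

```shell
# Hypothetical helper: build the constraint URL for a given Airflow version
# and Python major.minor version, following the URL pattern used on this page.
constraint_url() {
    airflow_version="$1"
    python_version="$2"
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-${airflow_version}/constraints-${python_version}.txt"
}

AIRFLOW_VERSION="1.10.12"
PYTHON_VERSION="3.7"   # must match the interpreter you install into

CONSTRAINT_URL="$(constraint_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"

# Print the resulting command for review instead of running it directly.
echo "pip install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"${CONSTRAINT_URL}\""
```

Printing the command first makes it easy to confirm that the version numbers in the package specifier and in the URL agree before anything is installed.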
You need certain system-level packages in order to install Airflow. The following are known to be needed on Linux (tested on Debian Buster):
sudo apt-get install -y --no-install-recommends \
  freetds-bin \
  krb5-user \
  ldap-utils \
  libffi6 \
  libsasl2-2 \
  libsasl2-modules \
  libssl1.1 \
  locales \
  lsb-release \
  sasl2-bin \
  sqlite3 \
  unixodbc
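To see which of these packages are still absent before installing, a small check along the following lines may help. It is a sketch that assumes a Debian-based system with dpkg-query available. The function name missing_packages is hypothetical.

```shell
# Hypothetical helper: print each named package that dpkg does not report
# as fully installed. If dpkg-query is unavailable, every package is
# reported as missing.
missing_packages() {
    for pkg in "$@"; do
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
            echo "$pkg"
        fi
    done
}

missing_packages freetds-bin krb5-user ldap-utils libffi6 libsasl2-2 \
    libsasl2-modules libssl1.1 locales lsb-release sasl2-bin sqlite3 unixodbc
```

An empty result means all listed packages are already installed.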
You also need database client packages (Postgres or MySQL) if you want to use those databases.
If the airflow command is not recognized (this can happen on Windows when using WSL), ensure that ~/.local/bin is in your PATH environment variable, and add it if necessary.
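One way to add the directory for the current shell session is sketched below. It assumes a POSIX-compatible shell such as bash. To make the change permanent you would typically place the same lines in your shell startup file (for example ~/.bashrc).

```shell
# Append ~/.local/bin to PATH only if it is not already listed.
case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;                   # already present, nothing to do
    *) export PATH="$PATH:$HOME/.local/bin" ;;   # add it at the end
esac
```

Wrapping PATH in colons before matching avoids false positives on directories that merely contain the string, such as ~/.local/bin-old.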
Initializing Airflow Database
Airflow requires a database to be initialized before you can run tasks. If you’re just experimenting and learning Airflow, you can stick with the default SQLite option. If you don’t want to use SQLite, then take a look at Initializing a Database Backend to set up a different database.
After configuration, you’ll need to initialize the database before you can run tasks.
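In the Airflow 1.10 series, which this page covers, the database is initialized with the airflow initdb subcommand. The sketch below guards the call so that a missing airflow executable is reported clearly. The guard and its message are illustrative, and AIRFLOW_HOME falls back to ~/airflow, which is Airflow's default home directory.

```shell
# Airflow keeps its configuration and the default SQLite database under AIRFLOW_HOME.
export AIRFLOW_HOME="${AIRFLOW_HOME:-$HOME/airflow}"

if command -v airflow >/dev/null 2>&1; then
    airflow initdb    # Airflow 1.10.x subcommand that creates the metadata tables
else
    echo "airflow not found on PATH; install it before initializing the database" >&2
fi
```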