Airflow Docker is an extension to the open source project Airflow. Specifically, it provides a base operator, forked from the existing Docker operator, plus a number of operators and sensors built on top of it. All of them are fundamentally a wrapped docker run command.
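To make "a wrapped docker run command" concrete, here is a minimal sketch of how operator-style arguments could map onto a docker run invocation. It only builds the command line and does not execute it. The function name `build_docker_run_command`, its parameters, and the image name are hypothetical, chosen for illustration; they are not the airflow-docker API.

```python
import shlex
from typing import Dict, List, Optional


def build_docker_run_command(
    image: str,
    command: List[str],
    environment: Optional[Dict[str, str]] = None,
    remove: bool = True,
) -> List[str]:
    """Build the argv for a `docker run` invocation (hypothetical helper)."""
    argv = ["docker", "run"]
    if remove:
        argv.append("--rm")  # remove the container once it exits
    # Sort the variables so the generated command is deterministic.
    for key, value in sorted((environment or {}).items()):
        argv.extend(["--env", f"{key}={value}"])
    argv.append(image)    # the image is the versioned, immutable artifact
    argv.extend(command)  # everything after the image runs inside the container
    return argv


# Hypothetical task: image name and arguments are made up for this example.
argv = build_docker_run_command(
    image="my-org/my-task:1.2.3",
    command=["python", "-m", "my_task", "--date", "2020-01-01"],
    environment={"STAGE": "prod"},
)
print(shlex.join(argv))
```

Under this framing, a sensor, a SQL task, and a plain Python task differ only in the image and command they pass, which is what allows shared behavior to be layered on top of one execution path.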
Standardizing around a single execution pattern, namely that everything is a Docker operator, brings a number of benefits:
- All of the normal benefits of Docker: shared layers, immutable artifacts, artifact versioning, etc.
- Because everything runs the same way, from a sensor, to a short circuit operation, to a SQL query type operation, to a standard Python type operation, we were able to begin building useful building blocks that augment this standard runtime behavior.
- Isolation with respect to the Airflow deployment itself: we can feel a lot more confident upgrading Airflow or one of its dependencies if doing so has almost no chance of breaking someone's tasks.

The project is made up of the following packages:
- airflow-docker: This is the core library. It is intended to be installed in the same environment as Airflow. We publish Python packages to the Python Package Index (PyPI) and a couple of purpose-built Docker images to Docker Hub.
- airflow-docker-helper: This is a lightweight, pure Python library, intended to be installed in your Docker images, that provides useful primitives for interacting with Airflow from within the context of a running Docker container.
... And more
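
As an illustration of the kind of primitive a container-side helper could offer, the sketch below shows one possible way for code running inside a container to report a sensor result back to the Airflow side: writing a small file into a directory shared between the two. This mechanism, the file name `result.json`, and both function names are assumptions made for this example; they are not taken from airflow-docker-helper.

```python
import json
import tempfile
from pathlib import Path

RESULT_FILE = "result.json"  # hypothetical file name, not from the library


def report_sensor_result(shared_dir: Path, done: bool) -> None:
    """Container side: record whether the sensor condition has been met."""
    (shared_dir / RESULT_FILE).write_text(json.dumps({"sensor": done}))


def read_sensor_result(shared_dir: Path) -> bool:
    """Airflow side: read the reported result after the container exits."""
    path = shared_dir / RESULT_FILE
    if not path.exists():
        return False  # nothing was reported, so treat the sensor as not done
    return bool(json.loads(path.read_text()).get("sensor", False))


with tempfile.TemporaryDirectory() as tmp:
    # Stands in for a host directory that would be bind-mounted into the container.
    shared = Path(tmp)
    before = read_sensor_result(shared)
    report_sensor_result(shared, done=True)
    after = read_sensor_result(shared)

print(before, after)
```

A file in a shared directory keeps the container image free of any Airflow dependency, which fits the isolation goal described above; other channels (exit codes, stdout) would be equally plausible.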