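I am currently trying to run a minimal Airflow environment to test some DAGs on my local machine. The setup is very basic: after initializing a local sqlite database, I use docker-compose to spin up the Airflow webserver and scheduler.

My docker-compose.yml file looks like this: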

    version: '3'
    services:
      webserver:
        image: apache/airflow
        container_name: airflow_webserver
        restart: on-failure
        volumes:
          - ./dags:/opt/airflow/dags
          - ./plugins:/opt/airflow/plugins
          - ./scripts:/scripts
        ports:
          - 8080:8080
        environment:
          AIRFLOW_RUNAS_WEBSERVER: 1
        depends_on:
          - scheduler
        entrypoint: /scripts/wait-for-it.sh scheduler:8793 -t 40 -- /scripts/docker-entrypoint.sh
      scheduler:
        image: apache/airflow
        container_name: airflow_scheduler
        restart: on-failure
        ports:
          - 8793:8793
        volumes:
          - ./dags:/opt/airflow/dags
          - ./plugins:/opt/airflow/plugins
          - ./scripts:/scripts
        environment:
          AIRFLOW_RUNAS_SCHEDULER: 1
        entrypoint: /scripts/docker-entrypoint.sh

where my docker-entrypoint.sh is:

    #!/bin/bash
    AIRFLOW_USERNAME=test
    AIRFLOW_PASSWORD=test
    if [ "$AIRFLOW_RUNAS_SCHEDULER" = "1" ]; then
        echo "initializing airflow database"
        airflow db init
        echo "running airflow scheduler"
        while true
        do
            airflow scheduler
            echo "restarting airflow scheduler"
            sleep 1
        done
    elif [ "$AIRFLOW_RUNAS_WEBSERVER" = "1" ]; then
        echo "running airflow webserver"
        airflow db upgrade
        airflow users create \
            --username=$AIRFLOW_USERNAME \
            --password=$AIRFLOW_PASSWORD \
            --firstname=test \
            --lastname=test \
            --email=test@test.com \
            --role=Admin
        exec airflow webserver -p 8080
    elif [ "$AIRFLOW_RUNAS_WORKER" = "1" ]; then
        echo "running airflow worker"
        exec airflow worker
    elif [ "$AIRFLOW_RUNAS_FLOWER" = "1" ]; then
        echo "running airflow flower"
        exec airflow flower
    else
        echo "ERROR: no AIRFLOW_RUNAS_* variable set"
        exit 1
    fi

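Now, from the logs I can clearly see that, when running docker-compose up, both airflow webserver and airflow scheduler start their services correctly, even though the Airflow webserver seems to have trouble finding the scheduler. In the UI, I am greeted with this message:

    The scheduler does not appear to be running. The DAGs list may not update, and new tasks will not be scheduled.

How should I go about finding out why these two services are not communicating with each other?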
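Strangely, if I tweak my docker-entrypoint.sh to run all the steps in the same service, everything works fine:

    exec airflow webserver -p 8080 & exec airflow scheduler

This makes the output hard to follow, though, because everything is logged by the same container.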
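
One way to start narrowing this down is to check which metadata database each container actually uses. As far as I understand, the webserver decides whether the scheduler is running from the scheduler's heartbeat in the metadata database, not from a network connection to it. The sketch below is only a starting point under several assumptions: the service names `scheduler` and `webserver` come from the docker-compose.yml above, `airflow config get-value` is the Airflow 2.x CLI command (on newer versions the option may live in the `database` section rather than `core`), and `db_scope` and `compare_metadata_dbs` are hypothetical helper names, not part of Airflow.

```shell
#!/bin/bash

# Hypothetical helper: classify a SQLAlchemy connection URI.
# A sqlite URI names a file in the container's own filesystem, so two
# containers only share it if that file sits on a shared volume.
db_scope() {
    case "$1" in
        "")       echo "unknown" ;;
        sqlite:*) echo "container-local" ;;
        *)        echo "network-server" ;;
    esac
}

# Hypothetical helper: print the metadata DB each service is configured with.
# Assumes both services from docker-compose.yml are up and running.
compare_metadata_dbs() {
    local svc uri
    for svc in scheduler webserver; do
        # -T disables the pseudo-TTY so the output can be captured cleanly
        uri=$(docker-compose exec -T "$svc" airflow config get-value core sql_alchemy_conn)
        echo "$svc: $uri ($(db_scope "$uri"))"
    done
}
```

After sourcing the script, calling `compare_metadata_dbs` with the stack running should print one line per service. If both report a `container-local` sqlite path and no volume in docker-compose.yml covers that path, each container would be working against its own copy of the database. That would be consistent with the single-container variant working, but it is an assumption to verify, not a confirmed diagnosis.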