Unable to run the Airflow scheduler

Date: 2019-05-30 19:14:12

Tags: postgresql amazon-ec2 ubuntu-16.04 airflow airflow-scheduler

I recently installed Airflow on an AWS server following this guide for Ubuntu 16.04. After a painful but ultimately successful installation, I started the web server and tried the following example DAG:

from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import timedelta
from airflow import DAG
import airflow


# Default arguments applied to every task in this DAG
default_args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
    'depends_on_past': False}


dag = DAG('init_run', default_args=default_args, description='DAG SAMPLE',
          schedule_interval='@daily')


def print_something():
    print("HELLO AIRFLOW!")


with dag:
    task_1 = PythonOperator(task_id='do_it', python_callable=print_something)
    task_2 = DummyOperator(task_id='dummy')

    # task_2 is upstream of task_1, so 'dummy' runs before 'do_it'
    task_1 << task_2

However, when I open the UI, the tasks in the DAG stay in the "No Status" state no matter how many times I trigger them manually or refresh the page.

I later found that the Airflow scheduler was not running and was showing the following error:

{celery_executor.py:228} ERROR - Error sending Celery task:No module named 'MySQLdb'
Celery Task ID: ('init_run', 'dummy', datetime.datetime(2019, 5, 30, 18, 0, 24, 902499, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/executors/celery_executor.py", line 118, in send_task_to_executor
    result = task.apply_async(args=[command], queue=queue)
  File "/usr/local/lib/python3.7/site-packages/celery/app/task.py", line 535, in apply_async
    **options
  File "/usr/local/lib/python3.7/site-packages/celery/app/base.py", line 728, in send_task
    amqp.send_task_message(P, name, message, **options)
  File "/usr/local/lib/python3.7/site-packages/celery/app/amqp.py", line 552, in send_task_message
    **properties
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/usr/local/lib/python3.7/site-packages/kombu/connection.py", line 510, in _ensured
    return fun(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in <listcomp>
    [maybe_declare(entity) for entity in declare]
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 121, in maybe_declare
    return _maybe_declare(entity, channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 145, in _maybe_declare
    entity.declare(channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 608, in declare
    self._create_queue(nowait=nowait, channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 617, in _create_queue
    self.queue_declare(nowait=nowait, passive=False, channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 652, in queue_declare
    nowait=nowait,
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 531, in queue_declare
    self._new_queue(queue, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 82, in _new_queue
    self._get_or_create(queue)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 70, in _get_or_create
    obj = self.session.query(self.queue_cls) \
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 65, in session
    _, Session = self._open()
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 56, in _open
    engine = self._engine_from_config()
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 51, in _engine_from_config
    return create_engine(conninfo.hostname, **transport_options)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 443, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 104, in dbapi
    return __import__("MySQLdb")
ModuleNotFoundError: No module named 'MySQLdb'

Here are the settings in my config file (airflow.cfg):

sql_alchemy_conn = postgresql+psycopg2://airflow@localhost:5432/airflow
broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
result_backend =  db+postgresql://airflow:airflow@localhost/airflow

I have been struggling with this issue for two days now, please help.

2 Answers:

Answer 0 (score: 0):

In your airflow.cfg there should also be a configuration option called celery_result_backend. Can you let us know what that value is set to? If it isn't in your config, set it to the same value as result_backend, i.e.:

celery_result_backend =  db+postgresql://airflow:airflow@localhost/airflow

Then restart the Airflow stack to make sure the configuration change is applied.
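For example, if the webserver, scheduler and worker were started directly from the command line (an assumption, since the question does not say whether they run under systemd or supervisord), restarting them might look like this sketch:

# Stop any components that are still running (assumes they were launched directly, not as services)
pkill -f "airflow webserver"
pkill -f "airflow scheduler"
pkill -f "airflow worker"

# Start them again as daemons so the updated airflow.cfg is picked up
airflow webserver -D
airflow scheduler -D
airflow worker -D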

(I would have left this as a comment, but I don't have enough reputation to do so.)

Answer 1 (score: 0):

I think the guide you followed didn't tell you to install MySQL, yet it looks like you are using it in your broker URL.

You can install the MySQL client library and keep that configuration (works with Python 3.5+):

pip install mysqlclient
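On Ubuntu 16.04, mysqlclient is compiled against the MySQL client headers, so the build usually fails unless the development packages are installed first. A minimal sketch, assuming the default Ubuntu package names:

# MySQL client headers and Python build headers needed to compile mysqlclient
sudo apt-get install libmysqlclient-dev python3-dev
pip install mysqlclient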

Alternatively, as a quick fix, you can use RabbitMQ instead (RabbitMQ is a message broker; you will need to re-run the Airflow workers with Celery) and log in with the default guest user.

Your broker_url would then be:

broker_url = amqp://guest:guest@localhost:5672//
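Combined with the Postgres settings already shown in the question, the relevant lines of airflow.cfg would then look roughly like this (a sketch, assuming the CeleryExecutor is in use, which the traceback suggests):

executor = CeleryExecutor
sql_alchemy_conn = postgresql+psycopg2://airflow@localhost:5432/airflow
broker_url = amqp://guest:guest@localhost:5672//
result_backend = db+postgresql://airflow:airflow@localhost/airflow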

If RabbitMQ is not installed yet, you can install it with the following command:

sudo apt install rabbitmq-server

Set NODE_IP_ADDRESS=0.0.0.0 in the configuration file located at:

/etc/rabbitmq/rabbitmq-env.conf

Start the RabbitMQ service:

sudo service rabbitmq-server start
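Once the service is up, you can verify the broker and then restart the scheduler and Celery workers so they connect to the new broker_url (a sketch, assuming the components are started from the command line):

# Confirm the RabbitMQ node is running and listening (default port 5672)
sudo rabbitmqctl status

# Restart the scheduler and worker so they pick up the new broker_url
airflow scheduler -D
airflow worker -D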