Here is my DAG code:
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime.now(),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('Python_call', default_args=default_args, schedule_interval='*/10 * * * *')

t1 = BashOperator(
    task_id='testairflow',
    bash_command='python /var/www/projects/python_airflow/airpy/hello.py',
    dag=dag)
And the scheduler log:
[2018-01-05 14:05:08,536] {jobs.py:351} DagFileProcessor484 INFO - Processing /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py took 2.278 seconds
[2018-01-05 14:05:09,712] {jobs.py:343} DagFileProcessor485 INFO - Started process (PID=29795) to work on /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py
[2018-01-05 14:05:09,715] {jobs.py:534} DagFileProcessor485 ERROR - Cannot use more than 1 thread when using sqlite. Setting max_threads to 1
[2018-01-05 14:05:09,717] {jobs.py:1521} DagFileProcessor485 INFO - Processing file /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py for tasks to queue
[2018-01-05 14:05:09,717] {models.py:167} DagFileProcessor485 INFO - Filling up the DagBag from /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py
[2018-01-05 14:05:10,057] {jobs.py:1535} DagFileProcessor485 INFO - DAG(s) dict_keys(['example_passing_params_via_test_command', 'latest_only_with_trigger', 'example_branch_operator', 'example_subdag_operator', 'latest_only', 'example_skip_dag', 'example_subdag_operator.section-1', 'example_subdag_operator.section-2', 'tutorial', 'example_http_operator', 'example_trigger_controller_dag', 'example_bash_operator', 'example_python_operator', 'test_utils', 'Python_call', 'example_trigger_target_dag', 'example_xcom', 'example_short_circuit_operator', 'example_branch_dop_operator_v3']) retrieved from /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py
[2018-01-05 14:05:12,039] {jobs.py:1169} DagFileProcessor485 INFO - Processing Python_call
[2018-01-05 14:05:12,048] {jobs.py:566} DagFileProcessor485 INFO - Skipping SLA check for <DAG: Python_call> because no tasks in DAG have SLAs
[2018-01-05 14:05:12,060] {models.py:322} DagFileProcessor485 INFO - Finding 'running' jobs without a recent heartbeat
[2018-01-05 14:05:12,061] {models.py:328} DagFileProcessor485 INFO - Failing jobs without heartbeat after 2018-01-05 14:00:12.061146
And the output of the airflow scheduler command line:
[2018-01-05 14:31:20,496] {dag_processing.py:627} INFO - Started a process (PID: 32222) to generate tasks for /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py - logging into /var/www/projects/python_airflow/airpy/airflow_home/logs/scheduler/2018-01-05/scheduler.py.log
[2018-01-05 14:31:23,122] {jobs.py:1002} INFO - No tasks to send to the executor
[2018-01-05 14:31:23,123] {jobs.py:1440} INFO - Heartbeating the executor
[2018-01-05 14:31:23,123] {jobs.py:1450} INFO - Heartbeating the scheduler
[2018-01-05 14:31:24,243] {jobs.py:1404} INFO - Heartbeating the process manager
[2018-01-05 14:31:24,244] {dag_processing.py:559} INFO - Processor for /var/www/projects/python_airflow/airpy/airflow_home/dags/scheduler.py finished
I have scheduled the DAG to run every 10 minutes in Airflow, but it never does anything.
Answer 0 (score: 2)
Airflow is an ETL / data-pipeline tool. That means it is meant to execute over time periods that have already passed. For example, using:

'start_date': datetime(2018, 1, 4)
schedule_interval='@daily'

means the DAG will not run until one whole schedule-interval unit (here, a day) has elapsed after the start date; that is, not until the time on the Airflow server reaches datetime(2018, 1, 5).
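This rule can be sketched with plain datetime arithmetic (an illustration, not Airflow's actual scheduler code):

```python
from datetime import datetime, timedelta

# The scheduler only triggers a DAG run after a full schedule interval
# has elapsed past start_date.
start_date = datetime(2018, 1, 4)
interval = timedelta(days=1)           # '@daily'

first_trigger = start_date + interval  # earliest moment the first run can start
print(first_trigger)                   # 2018-01-05 00:00:00
```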
Since your start_date is datetime.now(), the start date moves forward every time the scheduler parses the DAG file, so a full schedule interval never elapses after it and the condition above is never satisfied (see the official FAQ).
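A minimal sketch of why a dynamic start_date never triggers a run (again plain datetime arithmetic, not Airflow internals):

```python
from datetime import datetime, timedelta

interval = timedelta(minutes=10)

# Each time the scheduler re-parses the DAG file, datetime.now() yields a
# new, later start_date, so the check below is evaluated against a start
# date that has just been refreshed.
start_date = datetime.now()
now = datetime.now()  # effectively the same instant as start_date

print(now >= start_date + interval)  # False: a full interval never passes
```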
You could change the start_date argument to, say, yesterday, using a timedelta to build a relative start_date (although this is not recommended). For testing, I suggest using 'start_date': datetime(2018, 1, 1) and adding schedule_interval='@once' to the DAG arguments. That should get your DAG to run.
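Putting the suggestion together, the DAG file could look like this (a sketch for testing, keeping the same paths and task names as the question):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2018, 1, 1),  # fixed, static start date in the past
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

# '@once' runs the DAG a single time as soon as the scheduler picks it up,
# which is convenient for verifying that the task itself works.
dag = DAG('Python_call', default_args=default_args, schedule_interval='@once')

t1 = BashOperator(
    task_id='testairflow',
    bash_command='python /var/www/projects/python_airflow/airpy/hello.py',
    dag=dag)
```

Once a single run succeeds, you can switch schedule_interval back to the cron expression '*/10 * * * *' while keeping the static start_date.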