I'm quite new to Airflow, and I'm currently running Airflow v1.7.1.3 in Docker.
In one DAG, I have a BashOperator:
B1 = BashOperator(
    task_id="bz2",
    bash_command="""sshpass -p 'xxxx' ssh -o StrictHostKeyChecking=no hadoop@10.220.52.69 "\
spark2-submit \
--conf "spark.kryoserializer.buffer.max=1g" \
--conf "spark.yarn.executor.memoryOverhead=3g" \
--conf "spark.driver.memory=15g" \
--conf "spark.driver.cores=4" \
/home/hadoop/test/bz2.py \
--path_bz2 /ABC/{year}/{month}/{day}/single_{year}_{month}_{day}.tar.bz2"
""".format(year=YEAR, month=MONTH, day=DAY),
    retries=3,
    dag=dag
)
When I run the DAG, it fails. The log shows:
[2020-04-20 08:23:07,324] {bash_operator.py:73} INFO - Output:
[2020-04-20 08:23:07,325] {bash_operator.py:77} INFO - /tmp/airflowtmpXi0HH8/bz2_transformftkAnI: line 1: sshpass: command not found
[2020-04-20 08:23:07,326] {bash_operator.py:80} INFO - Command exited with return code 127
However, on the server hosting the Airflow Docker container, the sshpass command is installed and works fine:
(base) hadoop@OL-HADOOP-APP-01:~$ sshpass
Usage: sshpass [-f|-d|-p|-e] [-hV] command parameters
-f filename Take password to use from file
-d number Use number as file descriptor for getting password
-p password Provide password as argument (security unwise)
-e Password is passed as env-var "SSHPASS"
With no parameters - password will be taken from stdin
-h Show help (this screen)
-V Print version information
At most one of -f, -d, -p or -e should be used
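In case it helps: since the BashOperator executes its command inside whatever shell environment the worker runs in, here is how I checked for sshpass from outside versus inside the container. A minimal sketch; the container name `airflow` is an assumption, substitute the actual name from `docker ps`:

```shell
# List running containers to find the Airflow one
docker ps

# Check whether sshpass is visible inside the container's shell
# (container name "airflow" is an assumption -- use yours)
docker exec airflow sh -c 'command -v sshpass || echo "sshpass not found in container"'
```

On the host itself, `command -v sshpass` prints the path as expected.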
How can I fix this? Thanks!