I am using Airflow version 1.9, and there is a bug in their software that you can read about here on my previous Stackoverflow post, here on another one of my Stackoverflow posts, and here on Airflow's Github where the bug is reported and discussed.
Long story short, there are several places in Airflow's code where it needs to get the server's IP address. It does this by running the following command:
socket.getfqdn()
The problem is that on an Amazon EC2 instance (Amazon Linux 1) this command does not return the IP address; instead it returns a hostname like this:
IP-1-2-3-4
where an IP address like this is needed:
1.2.3.4
To get this IP value, I found here that I can use the following command:
socket.gethostbyname(socket.gethostname())
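For reference, the difference between the two calls on an EC2 instance looks roughly like this (a sketch; the exact strings depend on the instance's /etc/hosts and DNS configuration):
import socket
# On Amazon Linux, getfqdn() typically returns the internal hostname,
# e.g. 'ip-1-2-3-4' or 'ip-1-2-3-4.ec2.internal', rather than an address.
print(socket.getfqdn())
# gethostbyname(gethostname()) resolves that hostname back to its IP,
# e.g. '1.2.3.4', which is the value needed here.
print(socket.gethostbyname(socket.gethostname()))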
I have tested the command in a Python shell and it returns the correct value. So I searched the Airflow package for all occurrences of socket.getfqdn(), and this is what I got back:
[airflow@ip-1-2-3-4 site-packages]$ cd airflow/
[airflow@ip-1-2-3-4 airflow]$ grep -r "fqdn" .
./security/utils.py: fqdn = host
./security/utils.py: if not fqdn or fqdn == '0.0.0.0':
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: return '%s/%s@%s' % (components[0], fqdn.lower(), components[2])
./security/utils.py: return socket.getfqdn()
./security/utils.py:def get_fqdn(hostname_or_ip=None):
./security/utils.py: fqdn = socket.gethostbyaddr(hostname_or_ip)[0]
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: fqdn = hostname_or_ip
./security/utils.py: if fqdn == 'localhost':
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: return fqdn
Binary file ./security/__pycache__/utils.cpython-36.pyc matches
Binary file ./security/__pycache__/kerberos.cpython-36.pyc matches
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.getfqdn())
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.getfqdn())
Binary file ./contrib/auth/backends/__pycache__/kerberos_auth.cpython-36.pyc matches
./contrib/auth/backends/kerberos_auth.py: service_principal = "%s/%s" % (configuration.get('kerberos', 'principal'), utils.get_fqdn())
./www/views.py: 'airflow/circles.html', hostname=socket.getfqdn()), 404
./www/views.py: hostname=socket.getfqdn(),
Binary file ./www/__pycache__/app.cpython-36.pyc matches
Binary file ./www/__pycache__/views.cpython-36.pyc matches
./www/app.py: 'hostname': socket.getfqdn(),
Binary file ./__pycache__/jobs.cpython-36.pyc matches
Binary file ./__pycache__/models.cpython-36.pyc matches
./bin/cli.py: hostname = socket.getfqdn()
Binary file ./bin/__pycache__/cli.cpython-36.pyc matches
./config_templates/default_airflow.cfg:# gets augmented with fqdn
./jobs.py: self.hostname = socket.getfqdn()
./jobs.py: fqdn = socket.getfqdn()
./jobs.py: same_hostname = fqdn == ti.hostname
./jobs.py: "{fqdn}".format(**locals()))
Binary file ./api/auth/backend/__pycache__/kerberos_auth.cpython-36.pyc matches
./api/auth/backend/kerberos_auth.py:from socket import getfqdn
./api/auth/backend/kerberos_auth.py: hostname = getfqdn()
./models.py: self.hostname = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
I'm not sure if I should replace every occurrence of the socket.getfqdn() command with socket.gethostbyname(socket.gethostname()). For one, it would be cumbersome because I would no longer be using the Airflow package as installed from Pip. I tried upgrading to Airflow version 1.10, but it was very buggy and I couldn't get it up and running. So for now it looks like I'm stuck with Airflow version 1.9, but I need to correct this Airflow bug because it is causing my tasks to fail sporadically.
Answer 0 (score: 0)
Simply replace every occurrence of the failing function call with one that works. Here are the steps I performed. If you are running an Airflow cluster, make sure you do this on all of the Airflow servers (masters and workers).
[ec2-user@ip-1-2-3-4 ~]$ cd /usr/local/lib/python3.6/site-packages/airflow
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.getfqdn()" .
./security/utils.py: return socket.getfqdn()
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.getfqdn())
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.getfqdn())
./www/views.py: 'airflow/circles.html', hostname=socket.getfqdn()), 404
./www/views.py: hostname=socket.getfqdn(),
./www/app.py: 'hostname': socket.getfqdn(),
./bin/cli.py: hostname = socket.getfqdn()
./jobs.py: self.hostname = socket.getfqdn()
./jobs.py: fqdn = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
[ec2-user@ip-1-2-3-4 airflow]$ sudo find . -type f -exec sed -i 's/socket.getfqdn()/socket.gethostbyname(socket.gethostname())/g' {} +
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.getfqdn()" .
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.gethostbyname(socket.gethostname())" .
./security/utils.py: return socket.gethostbyname(socket.gethostname())
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.gethostbyname(socket.gethostname()))
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.gethostbyname(socket.gethostname()))
./www/views.py: 'airflow/circles.html', hostname=socket.gethostbyname(socket.gethostname())), 404
./www/views.py: hostname=socket.gethostbyname(socket.gethostname()),
./www/app.py: 'hostname': socket.gethostbyname(socket.gethostname()),
./bin/cli.py: hostname = socket.gethostbyname(socket.gethostname())
./jobs.py: self.hostname = socket.gethostbyname(socket.gethostname())
./jobs.py: fqdn = socket.gethostbyname(socket.gethostname())
./models.py: self.hostname = socket.gethostbyname(socket.gethostname())
./models.py: self.hostname = socket.gethostbyname(socket.gethostname())
Once this update is made, simply restart the Airflow Webserver, Scheduler, and Worker processes and you should be all set. Note that when I cd into the python packages to get to airflow I am on python 3.6; some of you may be on 3.7, so your path may have to be adjusted to /usr/local/lib/python3.7/site-packages/airflow. Just cd into /usr/local/lib and see which python folder you have to go into. I don't think Airflow gets installed to this location, but sometimes python packages are also located under /usr/local/lib64/python3.6/site-packages, so the difference in the path is lib64 instead of lib. Also, keep in mind that this issue is fixed in Airflow version 1.10, so you will not need to make these changes on the newest version of Airflow.
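If you're unsure which site-packages directory your installation actually uses, a quick check like this (a sketch; run it in a Python shell with the same interpreter that runs Airflow) prints the package location before you start editing files:
import os
import airflow
# Prints e.g. /usr/local/lib/python3.6/site-packages/airflow
#          or /usr/local/lib64/python3.6/site-packages/airflow
print(os.path.dirname(airflow.__file__))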