Most of my DAGs in Airflow (versions 1.10.6 and 1.10.7) successfully upload their logs to S3, and I can view those remote logs in the Airflow UI. However, a few DAGs never upload their task logs to S3 after a DAG run finishes (whether it succeeds or fails). It is always the same DAGs that behave this way, yet I can't find any difference in their code, compared with the DAGs that do upload logs to S3, that would explain why their logs aren't uploaded.
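For context, remote logging is enabled in the usual way for Airflow 1.10.x. The sketch below shows the relevant settings as their environment-variable equivalents (the same keys live under [core] in airflow.cfg); the bucket name and connection id are placeholders, not the real values from my environment:

import os

# Standard Airflow 1.10.x remote-logging keys, expressed as environment variables.
# <BUCKET> and "aws_default" are placeholders for illustration only.
os.environ["AIRFLOW__CORE__REMOTE_LOGGING"] = "True"
os.environ["AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER"] = "s3://<BUCKET>/airflow/logs"
os.environ["AIRFLOW__CORE__REMOTE_LOG_CONN_ID"] = "aws_default"  # Airflow connection holding the AWS credentials
os.environ["AIRFLOW__CORE__ENCRYPT_S3_LOGS"] = "False"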
I noticed that the DAGs that don't upload their logs to S3 always end their run with a record like this:
2020-01-29 20:12:06,763 - INFO - Task exited with return code 0 - None
The DAGs that do upload logs to S3 have no such log entry. In all DAGs I also see records like the one below, but in the DAGs that fail to upload logs to S3 I see roughly twice as many of these messages:
DEBUG - The s3 config key is not a dictionary type, ignoring its value of: None
In addition, the DAGs that upload to S3 have S3-related connection-pool / AWS entries in their logs, like these:
[2020-01-29 20:41:32,852] {{connectionpool.py:203}} INFO - Starting new HTTP connection (1): <IP>
[2020-01-29 20:41:32,895] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (1): <BUCKET>.s3.amazonaws.com
[2020-01-29 20:41:32,932] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (2): <BUCKET>.s3.amazonaws.com
[2020-01-29 20:41:32,949] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (1): <BUCKET>.s3.<REGION>.amazonaws.com
[2020-01-29 20:41:32,984] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (2): <BUCKET>.s3.<REGION>.amazonaws.com
[2020-01-29 20:41:33,040] {{connectionpool.py:203}} INFO - Starting new HTTP connection (1): <IP>
[2020-01-29 20:41:33,065] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (1): <BUCKET>.s3.amazonaws.com
[2020-01-29 20:41:33,084] {{connectionpool.py:735}} INFO - Starting new HTTPS connection (1): <BUCKET>.s3.<REGION>.amazonaws.com
The DAGs that don't upload to S3, however, have no such entries. Instead, they contain the following, which does not appear in the DAGs that do upload their logs to S3:
2020-01-29 20:12:06,763 - INFO - Task exited with return code 0 - None
2020-01-29 20:12:06,942 - ERROR - Exception during reset or similar - None
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 693, in _finalize_fairy
fairy._reset(pool)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 880, in _reset
pool._dialect.do_rollback(self)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 538, in do_rollback
dbapi_connection.rollback()
psycopg2.OperationalError: SSL error: decryption failed or bad record mac
Exception ignored in: <function _ConnectionRecord.checkout.<locals>.<lambda> at 0x7f94e04fa3b0>
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 503, in <lambda>
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 702, in _finalize_fairy
File "/usr/local/lib/python3.7/logging/__init__.py", line 1407, in error
File "/usr/local/lib/python3.7/logging/__init__.py", line 1514, in _log
File "/usr/local/lib/python3.7/logging/__init__.py", line 1524, in handle
File "/usr/local/lib/python3.7/logging/__init__.py", line 1586, in callHandlers
File "/usr/local/lib/python3.7/logging/__init__.py", line 894, in handle
File "/usr/local/lib/python3.7/logging/__init__.py", line 1126, in emit
File "/usr/local/lib/python3.7/logging/__init__.py", line 1116, in _open
NameError: name 'open' is not defined
I'm not sure how to fix this. I've tried a number of shots in the dark, but none of them had any effect on the problem. I'm reluctant to post a more detailed log dump because of the sensitive information, credentials, etc. that get dumped into the DEBUG logs, but if there is other information I can provide, let me know and I'll do my best.
Answer 0 (score: 0)
A logging library we use was not configuring its root and child loggers correctly, and I believe this was interfering with Airflow's logger configuration. Modifying the non-Airflow logging library so that it configures its own "root" as a child logger (rather than the actual root logger) resolved this issue, along with some other logging problems we were having that were unrelated to Airflow.
https://docs.python.org/3/howto/logging-cookbook.html#using-logging-in-multiple-modules
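As a rough illustration of the change (a simplified sketch, not our actual library code; "mylib" and get_logger are placeholder names): the library had been attaching handlers to the real root logger, which stomped on Airflow's own logging setup, and switching it to a namespaced child logger leaves Airflow's handlers alone.

import logging

# Problematic pattern the library used before: configuring the real root logger
# competes with Airflow's own logging configuration.
#   logging.basicConfig(level=logging.DEBUG)
#   logging.getLogger().addHandler(logging.StreamHandler())

# Simplified version of the fix: the library defines its own "root", which is
# just a named child of the real root logger, so Airflow's handlers on the
# actual root logger are left untouched.
LIB_ROOT = logging.getLogger("mylib")       # placeholder library namespace
LIB_ROOT.addHandler(logging.NullHandler())  # emit nothing unless the application configures handlers
LIB_ROOT.setLevel(logging.DEBUG)

def get_logger(name: str) -> logging.Logger:
    # Module loggers inside the library hang off the library's own "root".
    return LIB_ROOT.getChild(name)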