Airflow cannot write logs to S3 (v1.10.9)

Date: 2020-02-13 01:18:07

Tags: kubernetes airflow kubernetes-helm

I am trying to set up remote logging in the stable/airflow Helm chart with Airflow v1.10.9, using the Kubernetes executor and the puckel/docker-airflow image. Here is my values.yaml file.

airflow:
  image:
    repository: airflow-docker-local
    tag: 1.10.9
  executor: Kubernetes
  service:
    type: LoadBalancer
  config:
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.9
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
    AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
    AIRFLOW__KUBERNETES__NAMESPACE: airflow
    AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: "s3://xxx"
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: "s3://aws_access_key_id:aws_secret_access_key@bucket"
    AIRFLOW__CORE__ENCRYPT_S3_LOGS: False
persistence:
  enabled: true
  existingClaim: ''
postgresql:
  enabled: true
workers:
  enabled: false
redis:
  enabled: false
flower:
  enabled: false

But my logs are not being exported to S3, and all I get in the UI is

*** Log file does not exist: /usr/local/airflow/logs/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log
*** Fetching from: http://icpjobdagicpkubejob-f4144a374f7a4ac9b18c94f058bc7672:8793/log/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='icpjobdagicpkubejob-f4144a374f7a4ac9b18c94f058bc7672', port=8793): Max retries exceeded with url: /log/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f511c883710>: Failed to establish a new connection: [Errno -2] Name or service not known'))

Does anyone have more insight into what I might be missing?

Edit: following @trejas's suggestion below, I created a separate connection and used it. Here is my Airflow config in values.yaml:

airflow:
  image:
    repository: airflow-docker-local
    tag: 1.10.9
  executor: Kubernetes
  service:
    type: LoadBalancer
  connections:
  - id: my_aws
    type: aws
    extra: '{"aws_access_key_id": "xxxx", "aws_secret_access_key": "xxxx", "region_name":"us-west-2"}'
  config:
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.9
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
    AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
    AIRFLOW__KUBERNETES__NAMESPACE: airflow

    AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: s3://airflow.logs
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws
    AIRFLOW__CORE__ENCRYPT_S3_LOGS: False

I still have the same problem.

2 answers:

Answer 0 (score: 1)

I ran into the same problem and figured I'd share what finally worked for me. The connection is correct, but you need to make sure the worker containers get the same environment variables:

airflow:
  image:
    repository: airflow-docker-local
    tag: 1.10.9
  executor: Kubernetes
  service:
    type: LoadBalancer
  connections:
  - id: my_aws
    type: aws
    extra: '{"aws_access_key_id": "xxxx", "aws_secret_access_key": "xxxx", "region_name":"us-west-2"}'
  config:
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.9
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
    AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
    AIRFLOW__KUBERNETES__NAMESPACE: airflow

    AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: s3://airflow.logs
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws
    AIRFLOW__CORE__ENCRYPT_S3_LOGS: False
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: s3://airflow.logs
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__ENCRYPT_S3_LOGS: False
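To see why the doubled-up names above matter, here is a small sketch (not Airflow's actual code) of the naming convention: settings whose name carries the `AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__` prefix are handed to the worker pods with that prefix stripped, so the workers end up with their own `AIRFLOW__CORE__*` variables.

```python
# Sketch of the prefix-stripping convention; `worker_env` is a hypothetical
# helper, not an Airflow API.
PREFIX = "AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__"

def worker_env(scheduler_env: dict) -> dict:
    """Return the env vars a worker pod would receive (prefix stripped)."""
    return {k[len(PREFIX):]: v
            for k, v in scheduler_env.items()
            if k.startswith(PREFIX)}

env = {
    PREFIX + "AIRFLOW__CORE__REMOTE_LOGGING": "True",
    PREFIX + "AIRFLOW__CORE__REMOTE_LOG_CONN_ID": "my_aws",
    "AIRFLOW__KUBERNETES__NAMESPACE": "airflow",  # scheduler-side only, not forwarded
}
print(worker_env(env))
# {'AIRFLOW__CORE__REMOTE_LOGGING': 'True', 'AIRFLOW__CORE__REMOTE_LOG_CONN_ID': 'my_aws'}
```

Without the prefixed copies, only the scheduler/webserver see the remote-logging settings, which is why the tasks run fine but the worker never uploads its log.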

I also had to set the Fernet key for the workers (as usual), otherwise I got an invalid token error:

airflow:
  fernet_key: "abcdefghijkl1234567890zxcvbnmasdfghyrewsdsddfd="

  config:
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__FERNET_KEY: "abcdefghijkl1234567890zxcvbnmasdfghyrewsdsddfd="
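The key shown above is a placeholder. A valid Fernet key is 32 random bytes, URL-safe base64 encoded; Airflow's docs suggest `Fernet.generate_key()` from the `cryptography` package, and this stdlib-only sketch produces an equivalent value:

```python
# Generate a Fernet-compatible key: 32 random bytes, URL-safe base64 encoded
# (the same format cryptography's Fernet.generate_key() returns).
import base64
import os

fernet_key = base64.urlsafe_b64encode(os.urandom(32)).decode()
print(fernet_key)  # 44-character string to paste into airflow.fernet_key
```

The same key must reach both the scheduler and the workers; if they differ, decrypting the connection's credentials fails with the invalid token error mentioned above.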

Answer 1 (score: 0)

Your remote log connection ID needs to be the ID of a connection from the connections form/list, not a connection string.

https://airflow.apache.org/docs/stable/howto/write-logs.html

https://airflow.apache.org/docs/stable/howto/connection/index.html
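To make the distinction concrete, here is a minimal sketch mirroring the `connections:` entry from the values.yaml above: `my_aws` is only a lookup key, while the actual credentials live in the connection's Extra field as JSON.

```python
# `conn_id` is what AIRFLOW__CORE__REMOTE_LOG_CONN_ID must be set to;
# the credentials themselves go into the connection's Extra JSON
# (values here are placeholders, as in the question).
import json

conn_id = "my_aws"
extra = ('{"aws_access_key_id": "xxxx", '
         '"aws_secret_access_key": "xxxx", '
         '"region_name": "us-west-2"}')

creds = json.loads(extra)
print(sorted(creds))
# ['aws_access_key_id', 'aws_secret_access_key', 'region_name']
```

Setting `AIRFLOW__CORE__REMOTE_LOG_CONN_ID` to a URI like `s3://key:secret@bucket` (as in the original question) is not resolved as a connection lookup by this setting, which is why the first attempt failed.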