Deleting S3 Bucket Objects Using Airflow

Time: 2019-09-16 09:59:42

Tags: amazon-s3 airflow

Using S3DeleteObjectsOperator in Airflow does not delete the objects in the specified S3 bucket, even though the task reports that it deleted them successfully:

delete_s3bucket_files = S3DeleteObjectsOperator(
  task_id='delete_s3bucket_files',
  start_date=start_date,
  bucket='*********************',
  keys='*********************',
  aws_conn_id='aws_default',
)

The task run shows that the key was deleted, but it still exists in my S3 bucket.

[2019-09-16 11:39:25,775] {base_task_runner.py:101} INFO - Job 1346: Subtask delete_s3bucket_files [2019-09-16 11:39:25,775] {cli.py:517} INFO - Running <TaskInstance: daily_database_transfer.delete_s3bucket_files 2019-09-16T09:39:16.873030+00:00 [running]> on host Saurav-macbook.local
[2019-09-16 11:39:25,971] {s3_delete_objects_operator.py:83} INFO - Deleted: ['*********************']

Is there something I'm missing here, or is there a way to find out why it isn't deleting the objects?
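One way to check, independently of the operator's log output, whether the key is really gone is to query S3 through the same Airflow connection with S3Hook. This is a minimal sketch, assuming the 'aws_default' connection from the task above; the bucket and key names below are placeholders, not the masked values from the question:

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

# Placeholder bucket/key names; substitute the real values used in the task above.
hook = S3Hook(aws_conn_id='aws_default')
if hook.check_for_key(key='path/to/object', bucket_name='my-bucket'):
    print('Key still exists in the bucket')
else:
    print('Key was deleted')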

1 Answer:

Answer 0 (score: 0):

You can use S3DeleteBucketOperator with force_delete=True to forcibly delete all objects in the bucket along with the bucket itself.

So you can do something like this:

from airflow.providers.amazon.aws.operators.s3_bucket import S3DeleteBucketOperator

delete_s3bucket = S3DeleteBucketOperator(
  task_id='delete_s3bucket_task',
  force_delete=True,
  start_date=start_date,
  bucket_name='*********************',
  aws_conn_id='aws_default',
)

If you prefer to delete the files and delete the bucket as separate steps, you can do the following:

from airflow.providers.amazon.aws.operators.s3_bucket import S3DeleteBucketOperator
from airflow.providers.amazon.aws.operators.s3_delete_objects import S3DeleteObjectsOperator

delete_s3bucket_files = S3DeleteObjectsOperator(
  task_id='delete_s3bucket_files',
  start_date=start_date,
  bucket='*********************',
  keys='*********************',
  aws_conn_id='aws_default',
)

delete_s3bucket = S3DeleteBucketOperator(
  task_id='delete_s3bucket_task',
  force_delete=False,  # the bucket will be deleted only if it's empty
  start_date=start_date,
  bucket_name='*********************',
  aws_conn_id='aws_default',
)

delete_s3bucket_files >> delete_s3bucket
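
For completeness, here is a minimal sketch of how these two tasks could be wired into a DAG. The dag_id is taken from the log output in the question; the start date and schedule are placeholder assumptions, not part of the original answer:

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.s3_bucket import S3DeleteBucketOperator
from airflow.providers.amazon.aws.operators.s3_delete_objects import S3DeleteObjectsOperator

with DAG(
    dag_id='daily_database_transfer',   # dag_id seen in the question's log
    start_date=datetime(2019, 9, 16),   # placeholder start date
    schedule_interval='@daily',         # placeholder schedule
) as dag:
    delete_s3bucket_files = S3DeleteObjectsOperator(
        task_id='delete_s3bucket_files',
        bucket='*********************',
        keys='*********************',
        aws_conn_id='aws_default',
    )

    delete_s3bucket = S3DeleteBucketOperator(
        task_id='delete_s3bucket_task',
        force_delete=False,  # delete the bucket only once it is empty
        bucket_name='*********************',
        aws_conn_id='aws_default',
    )

    # Empty the bucket first, then remove the (now empty) bucket.
    delete_s3bucket_files >> delete_s3bucket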