Question

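I am trying to activate a data pipeline based on the existence of *.tar files in S3. I created a Lambda function and wrote Python Boto 3 code to activate the data pipeline. I have tested the Lambda function: when a .tar file exists the data pipeline is activated, and when none exists it is not activated.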

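I am trying to understand the cause of these two issues:

If there is no tar file in the S3 location, print ("datapipeline not activated") is never written to the logs.
If I interrupted the data pipeline in a previous run and marked it as finished before it completed, then the next time the Lambda function is triggered I get the error shown below.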

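Error: Field 'maxActiveInstances' can be set only on the Default object for on-demand pipelines
(this is what I get when I try to set 'maxActiveInstances' under the EMR resource in the data pipeline)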

{
  "errorMessage": "An error occurred (InvalidRequestException) when calling the ActivatePipeline operation: Web service limit exceeded: Exceeded number of concurrent executions. Please set the field 'maxActiveInstances' to a higher value in your pipeline or wait for the currently running executions to complete before trying again",
  "errorType": "InvalidRequestException",
  "stackTrace": [
    ["/var/task/lambda_function.py", 21, "lambda_handler", "activate = client.activate_pipeline(pipelineId=data_pipeline_id,parameterValues=[])"],
    ["/var/runtime/botocore/client.py", 314, "_api_call", "return self._make_api_call(operation_name, kwargs)"],
    ["/var/runtime/botocore/client.py", 612, "_make_api_call", "raise error_class(parsed_response, operation_name)"]
  ]
}

Here is the Python script. Please provide guidance on resolving these issues.

import boto3
import logging
logger = logging.getLogger()

def lambda_handler(event, context):
    client = boto3.client('datapipeline')
    s3_client = boto3.client('s3')
    #client = boto3.client('datapipeline')
    data_pipeline_id="df-xxxxxxxx"
    bucket = 'xxxxx'
    prefix = 'xxxx/xxxxx/'
    paginator = s3_client.get_paginator('list_objects_v2')
    response_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix)
    response_pipeline = client.describe_pipelines(pipelineIds=[data_pipeline_id])
    for response in response_iterator:
        for object_data in response['Contents']:
            key = object_data['Key']
            #print (key)
            if key.endswith('.tar'):
                if(response_pipeline):
                    activate = client.activate_pipeline(pipelineId=data_pipeline_id,parameterValues=[])
                    print ("activated")
                else:
                    print ("datapipeline not activated")
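Two things in the script above could plausibly explain the symptoms, though the original indentation is not certain. First, list_objects_v2 omits the 'Contents' key from a page when no keys match the prefix, so response['Contents'] would raise a KeyError before any else branch is reached. Second, activate_pipeline sits inside the loop, so it would be called once per .tar file, which may contribute to the concurrent-executions error. The following is a minimal sketch of one way to restructure the check so that it activates at most once. The helper names has_tar_object and activate_if_tar_present are hypothetical, and both clients are passed in so the logic can be exercised without AWS access.

```python
def has_tar_object(pages):
    """Return True if any listed key ends with '.tar'.

    `pages` is any iterable of list_objects_v2-style response dicts.
    A page for an empty prefix has no 'Contents' key, hence the default.
    """
    for page in pages:
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".tar"):
                return True
    return False


def activate_if_tar_present(s3_client, dp_client, bucket, prefix, pipeline_id):
    """Activate the pipeline at most once; return True if it was activated."""
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
    if has_tar_object(pages):
        # Single activation, regardless of how many .tar files were found.
        dp_client.activate_pipeline(pipelineId=pipeline_id, parameterValues=[])
        print("activated")
        return True
    # Reached both for an empty prefix and for a prefix with no .tar keys.
    print("datapipeline not activated")
    return False
```

Inside lambda_handler this could be called as activate_if_tar_present(boto3.client('s3'), boto3.client('datapipeline'), bucket, prefix, data_pipeline_id).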

Answer 1

I think I have seen the same symptoms, and I am sharing our fix in case it helps.

We had cancelled instances of the pipeline, and we needed to re-activate the pipeline to get past this error.
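If it helps to see where a pipeline stands before re-activating it, here is a rough sketch under a few assumptions. It reads the '@pipelineState' field from a describe_pipelines response, and it catches the InvalidRequestException seen in the traceback so the Lambda invocation returns instead of failing. The helper names get_pipeline_state and try_activate are hypothetical, and which state values are safe to activate from is left to the caller because the thread does not say.

```python
def get_pipeline_state(describe_response, pipeline_id):
    """Return the '@pipelineState' string for pipeline_id, or None if absent.

    `describe_response` is the dict returned by describe_pipelines.
    """
    for desc in describe_response.get("pipelineDescriptionList", []):
        if desc.get("pipelineId") != pipeline_id:
            continue
        for field in desc.get("fields", []):
            if field.get("key") == "@pipelineState":
                return field.get("stringValue")
    return None


def try_activate(dp_client, pipeline_id):
    """Activate the pipeline; return False instead of raising when the
    service rejects the request (for example, too many concurrent runs)."""
    try:
        dp_client.activate_pipeline(pipelineId=pipeline_id, parameterValues=[])
    except dp_client.exceptions.InvalidRequestException as exc:
        # Log the rejection so it appears in the function's logs.
        print(f"activation rejected: {exc}")
        return False
    return True
```

With a real client, get_pipeline_state(client.describe_pipelines(pipelineIds=[data_pipeline_id]), data_pipeline_id) would give the state to log or branch on before calling try_activate(client, data_pipeline_id).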

