Boto3 / Jenkins client throws an error when running code

Time: 2019-04-19 20:10:25

Tags: python jenkins aws-glue glue

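I am running a daily Glue script on one of our AWS machines, and I have scheduled it using Jenkins.

For the last 15 days I have been getting the error below. (This daily job had been running for almost 6 months, and the failures started suddenly about 15 days ago.)

The Jenkins console output looks like this: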

Started by timer
Building in workspace /var/lib/jenkins/workspace/build_name_xyz
[build_name_xyz] $ /bin/sh -xe /tmp/jenkins8188702635955396537.sh
+ /usr/bin/python3 /var/lib/jenkins/path_to_script/glue_crawler.py
Traceback (most recent call last):
  File "/var/lib/jenkins/path_to_script/glue_crawler.py", line 10, in <module>
    response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidInputException: An error occurred (InvalidInputException) when calling the UpdateCrawler operation: Cannot update Crawler while running. Please stop crawl or wait until it completes to update.
Build step 'Execute shell' marked build as failure
Finished: FAILURE

So I went ahead and looked at line 10 of this file:

/var/lib/jenkins/path_to_script/glue_crawler.py

It looks like this:

import boto3
import datetime

glue_client = boto3.client('glue', region_name='region_name')


crawler_name = 'xyz_abc'
today = (datetime.datetime.now()).strftime("%Y_%m_%d")
update_path = 's3://path-to-respective-aws-s3-bucket/%s' % (today)
response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
response_crawler = glue_client.start_crawler(
    Name=crawler_name
)
print(response_crawler)

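The code above throws an error at line 10 (the update_crawler call). I don't understand what exactly is wrong with line 10, which is why Jenkins marks the build as failed with a red ball, so I am looking for help here. I tried googling this but couldn't find anything.

FYI: if I run the same build from the Jenkins UI some time later (by clicking "Build Now"), the job runs absolutely fine.

I'm not sure what exactly is going wrong here; any help is appreciated.

Thanks in advance!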

1 Answer:

Answer 0: (score: 1)

The error is self-explanatory:

Cannot update Crawler while running. Please stop crawl or wait until it completes to update.

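So somehow the crawler was already started at around the same time, and Glue does not allow a crawler's properties to be updated while it is running. Please check whether any other task starts the crawler named xyz_abc. Also, in the AWS console, make sure the crawler is configured to run on demand rather than on a schedule.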
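
If the overlap cannot be removed entirely, one option is to have the script wait until the crawler is idle before updating it. The following is only a minimal sketch, not a definitive fix: the helper names (wait_until_ready, get_schedule_expression, update_and_start) and the polling interval and timeout values are assumptions made up for illustration. It relies on the boto3 Glue client's get_crawler call, whose response reports the crawler state (READY, RUNNING or STOPPING) and, as far as I know, a Schedule entry only when the crawler has a schedule attached.

```python
import time


def wait_until_ready(glue_client, crawler_name, poll_seconds=30, timeout_seconds=3600):
    """Poll get_crawler until the crawler state is READY, or raise TimeoutError.

    poll_seconds and timeout_seconds are illustrative defaults, not recommendations.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = glue_client.get_crawler(Name=crawler_name)["Crawler"]["State"]
        if state == "READY":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Crawler %s is still %s after %s seconds"
                % (crawler_name, state, timeout_seconds)
            )
        time.sleep(poll_seconds)


def get_schedule_expression(glue_client, crawler_name):
    """Return the crawler's schedule expression, or None if no schedule is reported."""
    crawler = glue_client.get_crawler(Name=crawler_name)["Crawler"]
    # Assumption: the Schedule entry is missing or empty for on-demand crawlers.
    return (crawler.get("Schedule") or {}).get("ScheduleExpression")


def update_and_start(glue_client, crawler_name, update_path, **wait_kwargs):
    """Wait for the crawler to be idle, then point it at update_path and start it."""
    wait_until_ready(glue_client, crawler_name, **wait_kwargs)
    glue_client.update_crawler(
        Name=crawler_name,
        Targets={"S3Targets": [{"Path": update_path}]},
    )
    return glue_client.start_crawler(Name=crawler_name)
```

In the script from the question this would replace the direct update_crawler and start_crawler calls, for example update_and_start(glue_client, crawler_name, update_path). If get_schedule_expression(glue_client, crawler_name) returns something other than None, the crawler also has its own schedule, which could explain why it is already running when the Jenkins job fires.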