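I am new to AWS Glue and am trying to trigger a Glue workflow from a Lambda function.
I am using boto3.client('glue'), but I get this error:
'Glue' object has no attribute 'start_workflow_run'
This is the code snippet I am trying to run: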
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('glue')
    client.start_workflow_run(Name='Workflow_New', Arguments={})
Is there another way to achieve what I want to do?
Answer 0 (score: 1)
For how to invoke AWS Glue from Lambda, see this SO question (code snippet below):
How to Trigger Glue ETL Pyspark job through S3 Events or AWS Lambda?
import boto3

print('Loading function')

def lambda_handler(event, context):
    # Name of the bucket that raised the S3 event
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    s3 = boto3.client('s3')  # not used below
    glue = boto3.client('glue')
    gluejobname = "YOUR GLUE JOB NAME"
    try:
        # Start the job, then look up the state of the run just created
        runId = glue.start_job_run(JobName=gluejobname)
        status = glue.get_job_run(JobName=gluejobname, RunId=runId['JobRunId'])
        print("Job Status : ", status['JobRun']['JobRunState'])
    except Exception as e:
        print(e)
        print('Error starting Glue job {} for bucket {}. Make sure the job '
              'exists and is in the same region as this '
              'function.'.format(gluejobname, source_bucket))
        raise
Thanks,
Yuva
Answer 1 (score: 0)
The link below has AWS examples, including how to pass Glue parameters.
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-calling.html
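As a rough sketch of what passing parameters can look like: job arguments go to `start_job_run` through its `Arguments` parameter as a string-to-string mapping, with keys conventionally prefixed by `--`. The helper names `build_job_arguments` and `start_job_with_arguments` are hypothetical and used only for illustration. The Glue client is passed in rather than created here, so the sketch assumes nothing about region or credentials.

```python
def build_job_arguments(params):
    """Convert a plain dict into Glue-style job arguments.

    Glue expects string keys and string values; keys are conventionally
    prefixed with '--' so the job script can read them by name.
    """
    return {
        key if key.startswith("--") else f"--{key}": str(value)
        for key, value in params.items()
    }


def start_job_with_arguments(glue_client, job_name, params):
    """Start a Glue job run and return its JobRunId.

    `glue_client` is assumed to be a boto3 Glue client, for example
    boto3.client('glue'); `job_name` must be an existing Glue job.
    """
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(params),
    )
    return response["JobRunId"]
```

Inside the Glue job script, arguments passed this way can be read with `getResolvedOptions` from `awsglue.utils`.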
Answer 2 (score: 0)
I don't think Glue has a function named "start_workflow_run". Try "start_job_run" instead:
response = client.start_job_run(JobName='Workflow_New', Arguments={})
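Whether the method exists can depend on the boto3 version bundled with the Lambda runtime; an older bundled version is one plausible cause of the AttributeError in the question. A minimal sketch for checking this at runtime follows. `supports_workflows` is a hypothetical helper name, and the client is assumed to be a boto3 Glue client.

```python
def supports_workflows(glue_client):
    """Return True if the client exposes a callable start_workflow_run."""
    return callable(getattr(glue_client, "start_workflow_run", None))


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be exercised without it.
    import boto3

    client = boto3.client("glue")
    supported = supports_workflows(client)
    print("boto3 version:", boto3.__version__)
    print("start_workflow_run available:", supported)
    return {"workflows_supported": supported}
```

If the check reports that the method is unavailable, shipping a newer boto3 with the function (in the deployment package or a layer) is a common remedy.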
Answer 3 (score: 0)
This starts a Glue workflow from Lambda (Python):
import json
import boto3

def lambda_handler(event, context):
    # add your region_name
    glue = boto3.client(service_name='glue', region_name='eu-west-2')
    # 'Name' is the only required parameter
    response = glue.start_workflow_run(Name='Your_Workflow')
    # the response is a dict; the run ID is under 'RunId'
    workflow_run_id = response['RunId']
    print(f'workflow_run_id: {workflow_run_id}')
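The call returns a dictionary with the run identifier under the `RunId` key, and that identifier can be passed to `get_workflow_run` to look up the run's status. A hedged sketch, with `start_and_check_workflow` as a hypothetical helper name and the Glue client passed in by the caller:

```python
def start_and_check_workflow(glue_client, workflow_name):
    """Start a Glue workflow and return (run_id, status).

    `glue_client` is assumed to be a boto3 Glue client whose boto3
    version supports workflows.
    """
    run_id = glue_client.start_workflow_run(Name=workflow_name)["RunId"]
    # Look up the run that was just created to read its current status.
    run = glue_client.get_workflow_run(Name=workflow_name, RunId=run_id)
    return run_id, run["Run"]["Status"]
```

Because the workflow runs asynchronously, a status read right after the start will usually not be final; polling or an event-based notification would be needed to learn the outcome.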
Answer 4 (score: 0)
Try the following snippet:
import boto3
glueClient = boto3.client('glue')
response = glueClient.start_workflow_run(Name = 'wf_name')
Answer 5 (score: 0)
Try:
import json
import boto3

def lambda_handler(event, context):
    # placeholder: replace with the name of your workflow
    workflow_name = 'YOUR_WORKFLOW_NAME'
    glueClient = boto3.client('glue', region_name='us-west-2')
    response = glueClient.start_workflow_run(Name=workflow_name)
Also, you will probably want to add error handling around the response!
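One possible shape for that error handling, as a sketch only: boto3 clients expose their modeled exceptions on `client.exceptions`, and the example catches two that `start_workflow_run` can raise. The helper name `start_workflow_safely` and the returned dictionary layout are assumptions made for illustration.

```python
def start_workflow_safely(glue_client, workflow_name):
    """Start a Glue workflow and report the outcome instead of raising.

    `glue_client` is assumed to be a boto3 Glue client. Only two known
    failure modes are handled; any other exception propagates.
    """
    try:
        response = glue_client.start_workflow_run(Name=workflow_name)
    except glue_client.exceptions.EntityNotFoundException:
        # No workflow with this name exists in the client's region.
        return {"started": False, "reason": "workflow not found"}
    except glue_client.exceptions.ConcurrentRunsExceededException:
        # The workflow already has its maximum number of runs in progress.
        return {"started": False, "reason": "too many concurrent runs"}
    return {"started": True, "run_id": response["RunId"]}
```

Returning a small result dictionary keeps the Lambda handler free to decide whether a failure should be logged, retried, or re-raised.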