I want to edit my Lambda so that it deletes the transcription job once the job status reads "COMPLETED". I have the following code:
import json
import time
import boto3
from urllib.request import urlopen

def lambda_handler(event, context):
    transcribe = boto3.client("transcribe")
    s3 = boto3.client("s3")
    if event:
        file_obj = event["Records"][0]
        bucket_name = str(file_obj["s3"]["bucket"]["name"])
        file_name = str(file_obj["s3"]["object"]["key"])
        s3_uri = create_uri(bucket_name, file_name)
        file_type = file_name.split("2019.")[1]
        job_name = file_name
        transcribe.start_transcription_job(TranscriptionJobName=job_name,
                                           Media={"MediaFileUri": s3_uri},
                                           MediaFormat=file_type,
                                           LanguageCode="en-US",
                                           Settings={
                                               "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                                               "ShowSpeakerLabels": True,
                                               "MaxSpeakerLabels": 4
                                           })
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
                break
            print("It's in progress")
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
                transcribe.delete_transcription_job(TranscriptionJobName=job_name)
            time.sleep(5)
        load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
        load_json = json.dumps(json.load(load_url))
        s3.put_object(Bucket=bucket_name, Key="transcribeFile/{}.json".format(job_name), Body=load_json)
    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def create_uri(bucket_name, file_name):
    return "s3://" + bucket_name + "/" + file_name
The part that handles the job is:
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
        break
    print("It's in progress")
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
        transcribe.delete_transcription_job(TranscriptionJobName=job_name)
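While the job is running, it should print "It's in progress"; once the status reads "COMPLETED", the job should be deleted.
Any idea why my current code doesn't work? It completes the transcription job but does not delete it.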
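Answer 0 (score: 2)
Don't poll for information if you can avoid it, especially in Lambda.
The right way to react to a change in transcription job status is to use CloudWatch Events. For example, you can configure a rule that routes an event to an AWS Lambda function when a transcription job completes successfully.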
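As a rough illustration of such a rule, the sketch below builds an event pattern for Transcribe job state changes and registers it with `put_rule` and `put_targets`. The function names, the rule name, the target `Id`, and the Lambda ARN are placeholders invented for this example; `events_client` is assumed to be `boto3.client("events")`. The Lambda function also needs a resource policy that lets `events.amazonaws.com` invoke it, which is not shown here.

```python
import json


def build_transcribe_event_pattern(statuses=("COMPLETED", "FAILED")):
    """Return an event pattern matching Transcribe job state changes."""
    return {
        "source": ["aws.transcribe"],
        "detail-type": ["Transcribe Job State Change"],
        "detail": {"TranscriptionJobStatus": list(statuses)},
    }


def create_cleanup_rule(events_client, rule_name, lambda_arn):
    """Create the rule and point it at the Lambda function.

    events_client is assumed to be boto3.client("events"); rule_name and
    lambda_arn are hypothetical values supplied by the caller.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_transcribe_event_pattern()),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "transcribe-cleanup" is an arbitrary target Id chosen for this sketch.
        Targets=[{"Id": "transcribe-cleanup", "Arn": lambda_arn}],
    )
```

A call might look like `create_cleanup_rule(boto3.client("events"), "transcribe-job-done", "<your Lambda ARN>")`. Passing the client in, rather than creating it inside the function, keeps the helper easy to exercise without AWS access.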
When the Lambda function is invoked because of a state change in a transcription job, it receives event data such as:
{
  "version": "0",
  "id": "1a234567-1a6d-3ab4-1234-abf8b19be1234",
  "detail-type": "Transcribe Job State Change",
  "source": "aws.transcribe",
  "account": "123456789012",
  "time": "2019-11-19T10:00:05Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "TranscriptionJobName": "my-transcribe-test",
    "TranscriptionJobStatus": "COMPLETED"
  }
}
Use TranscriptionJobName to correlate the state change with the original job.
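A minimal sketch of a handler for that event might look like the following. It would be a separate Lambda function from the one triggered by the S3 upload. `handle_state_change` and the shape of its return value are made up for this example, and `transcribe_client` is assumed to be `boto3.client("transcribe")`. If the transcript is still needed, it should be copied somewhere before the job is deleted; that step is left out here.

```python
def handle_state_change(event, transcribe_client):
    """Delete the transcription job named in a Transcribe state-change event.

    Returns a small dict describing what was done (a shape invented for
    this sketch, not anything AWS requires).
    """
    detail = event.get("detail", {})
    job_name = detail.get("TranscriptionJobName")
    status = detail.get("TranscriptionJobStatus")

    # Ignore events that are not about a finished job.
    if not job_name or status not in ("COMPLETED", "FAILED"):
        return {"deleted": False, "job": job_name, "status": status}

    transcribe_client.delete_transcription_job(TranscriptionJobName=job_name)
    return {"deleted": True, "job": job_name, "status": status}


def lambda_handler(event, context):
    # Imported lazily so handle_state_change can be tested without boto3.
    import boto3

    return handle_state_change(event, boto3.client("transcribe"))
```

Because the event already carries the final status, the function runs once per job and exits immediately instead of sleeping in a loop.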
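Answer 1 (score: 1)
Sorry, I took another look and realised I had made a very, very silly mistake. I had transcribe.delete_transcription_job(TranscriptionJobName=job_name) in completely the wrong place.
Please find the corrected, working code below: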
import json
import time
import boto3
from urllib.request import urlopen

def lambda_handler(event, context):
    transcribe = boto3.client("transcribe")
    s3 = boto3.client("s3")
    if event:
        file_obj = event["Records"][0]
        bucket_name = str(file_obj["s3"]["bucket"]["name"])
        file_name = str(file_obj["s3"]["object"]["key"])
        s3_uri = create_uri(bucket_name, file_name)
        # Everything after "2019." in the key is treated as the media format.
        file_type = file_name.split("2019.")[1]
        job_name = file_name
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={"MediaFileUri": s3_uri},
            MediaFormat=file_type,
            LanguageCode="en-US",
            Settings={
                "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                "ShowSpeakerLabels": True,
                "MaxSpeakerLabels": 4,
            },
        )
        # Poll until the job reaches a terminal state.
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            job_status = status["TranscriptionJob"]["TranscriptionJobStatus"]
            if job_status in ["COMPLETED", "FAILED"]:
                break
            print("It's in progress")
            time.sleep(5)
        # A FAILED job has no transcript, so only copy the result on success.
        if job_status == "COMPLETED":
            load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
            load_json = json.dumps(json.load(load_url))
            s3.put_object(Bucket=bucket_name, Key="transcribeFile/{}.json".format(job_name), Body=load_json)
        # Delete last: removing the job can also remove its generated transcript.
        transcribe.delete_transcription_job(TranscriptionJobName=job_name)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def create_uri(bucket_name, file_name):
    return "s3://" + bucket_name + "/" + file_name
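If polling inside the Lambda cannot be avoided, it may be worth bounding the loop so the function fails cleanly rather than running into the Lambda timeout (at most 15 minutes). The helper below is only a sketch: `wait_for_job`, its parameters, and the default timeout are invented for illustration, and `transcribe_client` is assumed to be `boto3.client("transcribe")`.

```python
import time

TERMINAL_STATUSES = ("COMPLETED", "FAILED")


def wait_for_job(transcribe_client, job_name, timeout_s=600, interval_s=5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is COMPLETED or FAILED, or raise TimeoutError.

    Returns the "TranscriptionJob" dict from the last response. The sleep
    and clock parameters exist only so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        job = response["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                "Job {} did not finish within {} seconds".format(job_name, timeout_s)
            )
        sleep(interval_s)
```

The caller can then branch on `job["TranscriptionJobStatus"]`, copy the transcript only when it is "COMPLETED", and delete the job afterwards.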