Code error when doing channel separation with AWS Transcribe

Time: 2019-10-28 15:38:08

Tags: python amazon-web-services aws-lambda aws-transcribe

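I'm trying to develop a Lambda that sorts through the JSON output file from AWS Transcribe. The Lambda strips out all the unnecessary data and merges the words of each channel together.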

I've been following a tutorial by someone called Srce Cde, who built a Lambda like this, but using speaker separation rather than channel separation.

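This is for an audio analytics project I'm building on AWS. Calls are uploaded to an S3 bucket, a Lambda starts a Transcribe job, and the Transcribe output goes into another S3 bucket. That second bucket then triggers the final Lambda, which splits the call transcript by channel.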
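For context, the job-starting step of a pipeline like this might look roughly as follows. This is only a sketch: the helper name `build_transcription_request`, the bucket names, and the way the job name is derived from the object key are all hypothetical. The part that matters for channel separation is `Settings={"ChannelIdentification": True}`, which is a real parameter of boto3's `start_transcription_job`.

```python
def build_transcription_request(bucket, key, output_bucket, language_code="en-US"):
    """Build keyword arguments for a Transcribe job with channel identification.

    Hypothetical helper: the job-name scheme below is an assumption,
    not something taken from the original pipeline.
    """
    # Job names may not contain "/", so flatten the key and drop the extension.
    job_name = key.replace("/", "-").rsplit(".", 1)[0]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": language_code,
        "OutputBucketName": output_bucket,
        # Ask Transcribe to transcribe each audio channel separately.
        "Settings": {"ChannelIdentification": True},
    }


# Example with made-up bucket and key names.
request = build_transcription_request("my-calls", "calls/call-001.wav", "my-transcripts")

# Inside the Lambda, the request would then be submitted with:
#   boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request construction in a plain function means it can be checked locally without AWS credentials.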

I tried the following code:

import json
import boto3

def lambda_handler(event, context):
    if event:
        s3 = boto3.client("s3")
        s3_object = event["Records"][0]["s3"]
        bucket_name = s3_object["bucket"]["name"]
        file_name = s3_object["object"]["key"]
        file_obj = s3.get_object(Bucket=bucket_name, Key=file_name)
        transcript_result = json.loads(file_obj["Body"].read())

        segments = transcript_result["results"]["channel_labels"]
        items = transcript_result["results"]["items"]

        speaker_text = []
        flag = False
        speaker_json = {}
        for no_of_speaker in range(segments["channels"]):
            for word in items:
                for seg in segments["items"]:
                    if seg["channel_label"] == "ch_"+str(no_of_speaker):
                        end_time = seg["end_time"]
                        if "start_time" in word:
                            if seg["items"]:
                                for seg_item in seg["items"]:
                                    if word["end_time"] == seg_item["end_time"] and word["start_time"] == seg_item["start_time"]:
                                        speaker_text.append(word["alternatives"][0]["content"])
                                        flag = True
                        elif word["type"] == "punctuation":
                            if flag and speaker_text:
                                temp = speaker_text[-1]
                                temp += word["alternatives"][0]["content"]
                                speaker_text[-1] = temp
                                flag = False
                                break

            speaker_json["ch_"+str(no_of_speaker)] = ' '.join(speaker_text)
            speaker_text = []
    print(speaker_json)
    s3.put_object(Bucket="aws-mrp-speaker-separation", Key=file_name, Body=json.dumps(speaker_json))

    return {
        'statusCode': 200,
        'body': json.dumps('Speaker transcript seperated successfully!')
    }

I expected the output to be the transcript split by channel. However, I get the following error:

Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 19, in lambda_handler
    for no_of_speaker in range(segments["channels"]):
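The traceback is cut off before the exception type, so the exact error is not shown. One plausible cause, assuming the channel-identification output has the usual shape: `results["channel_labels"]["channels"]` is a list of per-channel objects (each with a `channel_label` and its own `items`), while the channel count lives in a separate `number_of_channels` field. Passing a list to `range()` would fail on exactly this line. Under that assumption, a minimal sketch of the merge step iterates over the channel list directly. The function name `merge_channels` and the sample data are made up for illustration.

```python
import json


def merge_channels(transcript):
    """Return {channel_label: text} for a channel-identification transcript.

    Assumes transcript["results"]["channel_labels"]["channels"] is a list of
    {"channel_label": ..., "items": [...]} objects.
    """
    channels = transcript["results"]["channel_labels"]["channels"]
    merged = {}
    for channel in channels:
        words = []
        for item in channel.get("items", []):
            content = item["alternatives"][0]["content"]
            if item.get("type") == "punctuation" and words:
                # Attach punctuation to the preceding word instead of
                # emitting it as a separate token.
                words[-1] += content
            else:
                words.append(content)
        merged[channel["channel_label"]] = " ".join(words)
    return merged


# Tiny hand-written sample in the assumed format (not real Transcribe output).
sample = {
    "results": {
        "channel_labels": {
            "number_of_channels": 2,
            "channels": [
                {"channel_label": "ch_0", "items": [
                    {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
                     "alternatives": [{"content": "Hello"}]},
                    {"type": "punctuation", "alternatives": [{"content": ","}]},
                    {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
                     "alternatives": [{"content": "world"}]},
                    {"type": "punctuation", "alternatives": [{"content": "."}]},
                ]},
                {"channel_label": "ch_1", "items": [
                    {"type": "pronunciation", "start_time": "1.0", "end_time": "1.2",
                     "alternatives": [{"content": "Hi"}]},
                    {"type": "pronunciation", "start_time": "1.3", "end_time": "1.6",
                     "alternatives": [{"content": "there"}]},
                ]},
            ],
        }
    }
}

merged = merge_channels(sample)
print(json.dumps(merged))
```

Because each channel already carries its own `items`, there is no need to match words against the top-level `results["items"]` by start and end time, as the speaker-separation version does.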

0 answers:

No answers