Question

我正在python中构建Amazon Echo的应用程序。当我讲一个亚马逊Echo不认识的坏话语时，我的技能退出并将我带回主屏幕。我希望防止这种情况并重复Amazon Echo刚刚发出的声音。

为了尝试在某种程度上实现这一点，我尝试在会话结束或检测到错误输入时调用函数来说出某些内容。

def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

但是，我刚收到Echo的错误 - 此函数永远不会输入on_session_ended。

那么我如何在Amazon Echo上进行错误捕获和处理？

更新1：我将自定义插槽的话语数和意图数减少为1。现在用户应该只说A，B，C或D.如果他们在此之外说任何话，那么意图仍然被触发但没有插槽值。因此，我可以根据槽值是否存在来进行一些错误检查。但是，这似乎不是最好的方法。当我尝试添加没有插槽和相应话语的意图时，任何与我的意图都不匹配的内容都默认为这个新意图。我该如何解决这些问题？

更新2：以下是我的代码的一些相关部分。

意图处理程序：

def lambda_handler(event, context):
    print("Python START -------------------------------")
    print("event.session.application.applicationId=" +
          event['session']['application']['applicationId'])

    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])

    if event['request']['type'] == "LaunchRequest":
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == "IntentRequest":
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == "SessionEndedRequest":
        return on_session_ended(event['request'], event['session'])


def on_session_started(session_started_request, session):
    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    """ Called when the user launches the skill without specifying what they want """
    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    # Dispatch to your skill's launch
    return create_new_user()


def on_intent(intent_request, session):
    """ Called when the user specifies an intent for this skill """

    print("on_intent requestId=" + intent_request['requestId'] +
          ", sessionId=" + session['sessionId'])

    intent = intent_request['intent']
    intent_name = intent['name']
    attributes = session["attributes"] if 'attributes' in session else None
    intent_slots = intent['slots'] if 'slots' in intent else None

    # Dispatch to skill's intent handlers

    # TODO : Authenticate users
    #   TODO : Start session in a different spot depending on where user left off

    if intent_name == "StartQuizIntent":
        return create_new_user()

    elif intent_name == "AnswerIntent":
        return get_answer_response(intent_slots, attributes)

    elif intent_name == "TestAudioIntent":
        return get_audio_response()

    elif intent_name == "AMAZON.HelpIntent":
        return get_help_response()

    elif intent_name == "AMAZON.CancelIntent":
        return get_session_end_response()

    elif intent_name == "AMAZON.StopIntent":
        return get_session_end_response()

    else:
        return get_session_end_response()


def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

然后我们有实际被调用的函数和响应构建器。我已经编辑了一些隐私代码。我还没有构建所有显示响应文本字段并且有一些硬编码的uid所以我不必担心身份验证。

# --------------- Functions that control the skill's behavior ------------------

####### GLOBAL SETTINGS ########
utility_background_image = "https://i.imgur.com/XXXX.png"


def get_welcome_response():
    """ Returns the welcome message if a user invokes the skill without specifying an intent """
    session_attributes = {}
    card_title = ""
    speech_output = ("Hello and welcome ... quiz .... blah blah ...")
    reprompt_text = "Ask me to start and we will begin the test!"
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_session_end_response():
    """ Returns the ending message if a user errs or exits the skill """
    session_attributes = {}
    card_title = ""
    speech_output = "Thank you for your time!"
    reprompt_text = ''
    should_end_session = True

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_audio_response():
    """ Tests the audio capabilities of the echo """
    session_attributes = {}
    card_title = ""  # TODO : keep no 'welcome'?
    speech_output = ""
    reprompt_text = ""
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session, build_audio_response()))


def create_new_user():
    """ Creates a new user that the server will recognize and whose action will be stored in db """
    url = "http://XXXXXX:XXXX/create_user"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    uuid = data["uuid"]
    return ask_question(uuid)


def query_server(uuid):
    """ Requests to get num_questions number of questions from the server """
    url = "http://XXXXXXXX:XXXX/get_question_json?uuid=%s" % (uuid)  # TODO : change needs to to be uuid
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))

    if data["status"]:
        question = data["data"]["question"]
        quid = data["data"]["quid"]
        next_quid = data["data"]["next_quid"]  # TODO : will we need any of this?
        topic = data["data"]["topic"]
        type = data["data"]["type"]
        media_type = data["data"]["media_type"]  # either 'IMAGE', 'AUDIO', or 'VIDEO'
        answers = data["data"]["answer"]  # list of answers stored in order they should be spoken
        images = data["data"]["image"]  # list of images that correspond to order of answers list
        audio = data["data"]["audio"]
        video = data["data"]["video"]

        question_data = {"status": True, "data":{"question": question, "quid": quid, "answers": answers,
                         "media_type": media_type, "images": images, "audio": audio, "video": video}}
        if next_quid is "None":
            return None
        return question_data
    else:
        return {"status": False}


def ask_question(uuid):
    """ Returns a quiz question to the user since they specified a QuizIntent """
    question_data = query_server(uuid)

    if question_data is None:
        return get_session_end_response()

    card_title = "Ask Question"
    speech_output = ""
    session_attributes = {}
    should_end_session = False
    reprompt_text = ""

    # visual responses
    display_title = ""
    primary_text = ""
    secondary_text = ""

    images = []
    answers = []

    if question_data["status"]:
        session_attributes = {
            "quid": question_data["data"]["quid"],
            "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24",
            "question_start_time": time.time()
        }

        question = question_data["data"]["question"]
        answers = question_data["data"]["answers"]  # answers are shuffled when pulled from server
        images = question_data["data"]["images"]
        # TODO : consider different media types

        speech_output += question
        reprompt_text += ("Please choose an answer using the official NATO alphabet. For example," +
                          " A is alpha, B is bravo, and C is charlie.")

    else:
        speech_output += "Oops! This is embarrassing. There seems to be a problem with the server."
        reprompt_text += "I don't exactly know where to go from here. I suggest restarting this skill."

    return build_response(session_attributes, build_speechlet_response(card_title, speech_output,
            reprompt_text, should_end_session,
             build_display_response_list_template2(title=question, image_urls=images, answers=answers)))


def send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given):
    """ Sends the users responses back to the server to be stored in the database """
    url = ("http://XXXXXXXX:XXXX/send_answers?uuid=%s&quid=%s&time=%s&answer_given=%s" %
          (uuid, quid, time_used_for_question, answer_given))
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    return data["status"]


def get_answer_response(slots, attributes):
    """ Returns a correct/incorrect message to the user depending on their AnswerIntent """

    # get time, quid, and uuid from attributes
    question_start_time = attributes["question_start_time"]
    quid = attributes["quid"]
    uuid = attributes["uuid"]

    # get answer from slots
    try:
        answer_given = slots["Answer"]["value"].lower()
    except KeyError:
        return get_session_end_response()

    # calculate a rough estimate of the time it took to answer question
    time_used_for_question = str(int(time.time() - question_start_time))

    # record response data by sending it to the server
    send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given)

    return ask_question(uuid)


def get_help_response():
    """ Returns a help message to the user since they called AMAZON.HelpIntent """
    session_attributes = {}
    card_title = ""
    speech_output = "" # TODO
    reprompt_text = "" # TODO
    should_end_session = False

    return build_response(session_attributes,
            build_speechlet_response(card_title, speech_output, reprompt_text, should_end_session,
             build_display_response(utility_background_image, card_title)))


# --------------- Helpers that build all of the responses ----------------------


def build_hint_response(hint):
    """
    Builds the hint response for a display.

    For example, Try "Alexa, play number 1" where "play number 1" is the hint.
    """
    return {
        "type": "Hint",
        "hint": {
            "type": "RichText",
            "text": hint
        }
    }


def build_display_response(url='', title='', primary_text='', secondary_text='', tertiary_text=''):
    """
    Builds the display template for the echo show to display.

    Echo show screen is 1024px x 600px

    For additional image size requirements, see the display interface reference.
    """
    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "BodyTemplate1",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question",
                "sources": [
                    {
                        "url": url
                    }
                ]
            },
            "textContent": {
                "primaryText": {
                    "type": "RichText",
                    "text": primary_text
                },
                "secondaryText": {
                    "type": "RichText",
                    "text": secondary_text
                },
                "tertiaryText": {
                    "type": "RichText",
                    "text": tertiary_text
                }
            }
        }
    }]


def build_list_item(url='', primary_text='', secondary_text='', tertiary_text=''):
    return {
        "token": "question_item",
        "image": {
            "sources": [
                {
                    "url": url
                }
            ],
            "contentDescription": "Question Image"
        },
        "textContent": {
            "primaryText": {
                "type": "RichText",
                "text": primary_text
            },
            "secondaryText": {
                "text": secondary_text,
                "type": "PlainText"
            },
            "tertiaryText": {
                "text": tertiary_text,
                "type": "PlainText"
            }
        }
    }


def build_display_response_list_template2(title='', image_urls=[], answers=[]):
    list_items = []
    for image, answer in zip(image_urls, answers):
        list_items.append(build_list_item(url=image, primary_text=answer))

    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "ListTemplate2",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question Background",
                "sources": [
                    {
                        "url": "https://i.imgur.com/HkaPLrK.png"
                    }
                ]
            },
            "listItems": list_items
        }
    }]


def build_audio_response(url): # TODO add a display repsonse here as well
    """ Builds audio response. I.e. plays back an audio file with zero offset """
    return [{
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                "token": "audio_clip",
                "url": url,
                "offsetInMilliseconds": 0
            }
        }
    }]


def build_speechlet_response(title, output, reprompt_text, should_end_session, directive=None):
    """ Builds speechlet response and puts display response inside """
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': title,
            'content': output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session,
        'directives': directive
    }


def build_response(session_attributes, speechlet_response):
    """ Builds the complete response to send back to Alexa """
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }

更新3：我更新了意图，因此现在有一个自定义插槽需要一个自定义插槽，然后我有另一个不带插槽的自定义意图。这些自定义意图也有自己的样本话语。下面列出了意图和他们的话语。当我开始技能时，它工作正常。然后，当我说/＃＆＃34;动物园动物园动物园＆＃34;为了测试错误的输入，我收到一个错误。对动物园动物园动物园的要求和＃34;并且响应列在下面。我正在寻找一种很好的方法来捕获这个错误的输入错误并恢复/恢复技能回到以前的状态。

意图：

...
{
      "intent": "TestAudioIntent"
},
{
      "slots": [
        {
          "name": "Answer",
          "type": "LETTER"
        }
      ],
      "intent": "AnswerIntent"
},
...

示例话语：

AnswerIntent {Answer}
AnswerIntent I think it is {Answer}
TestAudioIntent test the audio

示例JSON请求：

{
  "session": {
    "new": false,
    "sessionId": "SessionId.574f0b74-be17-4f79-bbd6-ce926a1bf856",
    "application": {
      "applicationId": "XXXXXXXX"
    },
    "attributes": {
      "quid": "7fa9fcbf-35db-4bbd-ac73-37977bcef563",
      "question_start_time": 1515691612.7381804,
      "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24"
    },
    "user": {
      "userId": "XXXXXXXX"
    }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId.23765cb0-f327-4f52-a9a3-b9f92a375a5f",
    "intent": {
      "name": "TestAudioIntent",
      "slots": {}
    },
    "locale": "en-US",
    "timestamp": "2018-01-11T17:26:57Z"
  },
  "context": {
    "AudioPlayer": {
      "playerActivity": "IDLE"
    },
    "System": {
      "application": {
        "applicationId": "XXXXXXXX"
      },
      "user": {
        "userId": "XXXXXXXX"
      },
      "device": {
        "supportedInterfaces": {
          "Display": {
            "templateVersion": "1",
            "markupVersion": "1"
          }
        }
      }
    }
  },
  "version": "1.0"
}

我得到以下测试错误作为回复：

The remote endpoint could not be called, or the response it returned was invalid.

Answer 1

我最终做的是使用与Amazon's dialogue管理系统类似的东西。如果用户说某些内容没有填充插槽，我会使用该问题重新提示他们。我的目标是在每次讲话后记录用户的陈述/答案，因此我没有使用内置的对话管理。另外，我使用Amazon's slot synonyms用于我的所有插槽，以使我的模态更加健壮。

我仍然不知道这是最好的方式，但它是一个起点，似乎是在工作O.K ....

Amazon Echo如何捕获错误？

1 个答案: