Question

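With the Watson-Unity-SDK it seems possible to distinguish between speakers/users, since the service can return an array identifying which speaker said which words in a multi-person exchange. However, I cannot figure out how to do this, especially when I am sending different utterances (spoken by different people) to the Assistant service.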

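There are code snippets for parsing Assistant's JSON output/response, as well as for OnRecognize, OnRecognizeSpeaker, SpeechRecognitionResult and SpeakerLabelsResult, but how do I get Watson to return this information from the server once it has recognized an utterance and extracted its intent?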

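OnRecognize and OnRecognizeSpeaker are each used only once, in the Active property, so both are passed to the service, but only OnRecognize does the speech-to-text (transcription); OnRecognizeSpeaker is never fired...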

    public bool Active
    {
        get
        {
            return _service.IsListening;
        }
        set
        {
            if (value && !_service.IsListening)
            {
                _service.RecognizeModel = (string.IsNullOrEmpty(_recognizeModel) ? "en-US_BroadbandModel" : _recognizeModel);
                _service.DetectSilence = true;
                _service.EnableWordConfidence = true;
                _service.EnableTimestamps = true;
                _service.SilenceThreshold = 0.01f;
                _service.MaxAlternatives = 0;
                _service.EnableInterimResults = true;
                _service.OnError = OnError;
                _service.InactivityTimeout = -1;
                _service.ProfanityFilter = false;
                _service.SmartFormatting = true;
                _service.SpeakerLabels = false;
                _service.WordAlternativesThreshold = null;
                _service.StartListening(OnRecognize, OnRecognizeSpeaker);
            }
            else if (!value && _service.IsListening)
            {
                _service.StopListening();
            }
        }
    }

Typically, the output of Assistant (i.e. its result) looks something like this:

Response: {"intents":[{"intent":"General_Greetings","confidence":0.9962662220001222}],"entities":[],"input":{"text":"hello eva"},"output":{"generic":[{"response_type":"text","text":"Hey!"}],"text":["Hey!"],"nodes_visited":["node_1_1545671354384"],"log_messages":[]},"context":{"conversation_id":"f922f2f0-0c71-4188-9331-09975f82255a","system":{"initialized":true,"dialog_stack":[{"dialog_node":"root"}],"dialog_turn_counter":1,"dialog_request_counter":1,"_node_output_map":{"node_1_1545671354384":{"0":[0,0,1]}},"branch_exited":true,"branch_exited_reason":"completed"}}}

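I have set up intents and entities, and this list is returned by the Assistant service, but I am not sure how to make it also take my entities into account, or how to make it respond accordingly when STT recognizes different speakers.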

I would appreciate some help, particularly on how to do this via Unity scripting.

Answer 1

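I had exactly the same question about handling the Assistant's messages, so I looked into the Assistant.OnMessage() method. It returns a string similar to the one you posted; logging it with "Response: {0}", customData["json"].ToString() gives output like this:

[Assistant.OnMessage()][DEBUG] Response: {"intents":[{"intent":"General_Greetings","confidence":1}],"entities":[],"input":{"text":"hello"},"output":{"text":["good evening"],"nodes_visited": etc...}

I personally parse the JSON in order to extract the content from messageResponse.Entities. In the example above you can see that this array is empty, but once it is populated, that is where you need to extract the values from, and then you can do whatever you need with them in your code.

As for recognizing different speakers: in the Active property whose code you included, the _service.StartListening(OnRecognize, OnRecognizeSpeaker) line handles both callbacks, so maybe put some logging statements (for example Debug.Log) inside their code blocks to see whether they are called.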
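The parsing step described above could look roughly like the following sketch. It is not the SDK's own deserialization: it assumes System.Text.Json is available in the project (Unity does not bundle it by default, so another JSON library may be needed), and the class name AssistantResponseReader and its methods are hypothetical. Only the field names ("intents", "intent", "confidence", "entities", "entity", "value") come from the Assistant response format.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the Watson SDK.
public static class AssistantResponseReader
{
    // Returns the name of the highest-confidence intent, or null if there is none.
    public static string TopIntent(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            string best = null;
            double bestConfidence = double.NegativeInfinity;
            JsonElement intents;
            if (doc.RootElement.TryGetProperty("intents", out intents))
            {
                foreach (JsonElement intent in intents.EnumerateArray())
                {
                    double confidence = intent.GetProperty("confidence").GetDouble();
                    if (confidence > bestConfidence)
                    {
                        bestConfidence = confidence;
                        best = intent.GetProperty("intent").GetString();
                    }
                }
            }
            return best;
        }
    }

    // Returns (entity, value) pairs; the list is empty when "entities" is empty or missing.
    public static List<KeyValuePair<string, string>> Entities(string json)
    {
        var result = new List<KeyValuePair<string, string>>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement entities;
            if (doc.RootElement.TryGetProperty("entities", out entities))
            {
                foreach (JsonElement e in entities.EnumerateArray())
                {
                    result.Add(new KeyValuePair<string, string>(
                        e.GetProperty("entity").GetString(),
                        e.GetProperty("value").GetString()));
                }
            }
        }
        return result;
    }
}
```

The string to pass in would be the JSON part of the logged response, i.e. what customData["json"].ToString() returns, without the "Response: " prefix.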

Answer 2

Please set SpeakerLabels to true:

_service.SpeakerLabels = true;
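Once speaker labels are enabled, the Speech to Text service reports them as time ranges (from, to, speaker) next to the word timestamps, so words still have to be matched to speakers by time. The following is a minimal sketch of that matching, assuming the timestamps and labels have already been copied out of the SDK result objects. The types WordTiming, SpeakerLabel and SpeakerUtterance and the class SpeakerAttribution are hypothetical and not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plain data holders, filled from the STT results by the caller.
public class WordTiming
{
    public string Word; public double Start; public double End;
    public WordTiming(string word, double start, double end) { Word = word; Start = start; End = end; }
}

public class SpeakerLabel
{
    public double From; public double To; public int Speaker;
    public SpeakerLabel(double from, double to, int speaker) { From = from; To = to; Speaker = speaker; }
}

public class SpeakerUtterance
{
    public int Speaker; public string Text;
    public SpeakerUtterance(int speaker, string text) { Speaker = speaker; Text = text; }
}

public static class SpeakerAttribution
{
    // Joins consecutive words of the same speaker into one utterance per turn.
    public static List<SpeakerUtterance> GroupBySpeaker(IList<WordTiming> words, IList<SpeakerLabel> labels)
    {
        var result = new List<SpeakerUtterance>();
        foreach (WordTiming word in words)
        {
            int speaker = FindSpeaker(word, labels);
            if (result.Count > 0 && result[result.Count - 1].Speaker == speaker)
            {
                result[result.Count - 1].Text += " " + word.Word;
            }
            else
            {
                result.Add(new SpeakerUtterance(speaker, word.Word));
            }
        }
        return result;
    }

    // Picks the label whose time range overlaps the word the most; -1 if none overlaps.
    private static int FindSpeaker(WordTiming word, IList<SpeakerLabel> labels)
    {
        int best = -1;
        double bestOverlap = 0.0;
        foreach (SpeakerLabel label in labels)
        {
            double overlap = Math.Min(word.End, label.To) - Math.Max(word.Start, label.From);
            if (overlap > bestOverlap)
            {
                bestOverlap = overlap;
                best = label.Speaker;
            }
        }
        return best;
    }
}
```

Each resulting utterance could then be sent to the Assistant service as its own message, with the speaker number kept alongside it so that the reply can be tied back to the person who spoke.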
