Watson STT Java - Different results between WebSockets (Java) and HTTP POST

Time: 2016-04-08 16:37:54

Tags: java api ibm-cloud speech-to-text ibm-watson

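I am trying to build an application that takes streamed audio input (for example, the line in from a microphone) and runs speech-to-text on it using IBM Bluemix (Watson).

I lightly modified the example Java code found here. That example sends a WAV file, but I am sending a FLAC file instead; this [should] be irrelevant.

The results are bad, really bad. This is what I get when using the Java WebSockets code: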

{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "it was six weeks ago today the terror ",
          "confidence": 0.92
        }
      ]
    }
  ]
}

Now compare the result above with the one below. This is what I get when sending the same audio using cURL (HTTP POST):

{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.945,
          "transcript": "it was six weeks ago today the terrorists attacked the U. S. consulate in Benghazi Libya now we've obtained email alerts that were put out by the state department as the attack unfolded as you know four Americans were killed including ambassador Christopher Stevens "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.942,
          "transcript": "sharyl Attkisson has our story "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

That is a nearly perfect result.

Why is the result different when using WebSockets?

1 Answer:

Answer 0: (score: 2)

This issue has been fixed in release 3.0.0-RC1.

You can get the new jar from:

  1. Maven

    <dependency>
        <groupId>com.ibm.watson.developer_cloud</groupId>
        <artifactId>java-sdk</artifactId>
        <version>3.0.0-RC1</version>
    </dependency>
    
  2. Gradle

    'com.ibm.watson.developer_cloud:java-sdk:3.0.0-RC1'
    
  3. JAR

    Download the jar-with-dependencies (~1.4MB)

  4. Here is an example of how to recognize a FLAC audio file using WebSockets

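    // Create the service client and set the service credentials.
    SpeechToText service = new SpeechToText();
    service.setUsernameAndPassword("<username>", "<password>");

    // Open the FLAC file that will be streamed to the service.
    FileInputStream audio = new FileInputStream("path-to-audio-file.flac");

    // continuous(true) keeps transcribing across pauses; interimResults(true)
    // also delivers partial hypotheses before each final result.
    RecognizeOptions options = new RecognizeOptions.Builder()
      .continuous(true)
      .interimResults(true)
      .contentType(HttpMediaType.AUDIO_FLAC)
      .build();

    // Results arrive asynchronously through the callback.
    service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() {
      @Override
      public void onTranscription(SpeechResults speechResults) {
        System.out.println(speechResults);
      }
    });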

    FLAC file to test with: https://s3.amazonaws.com/mozart-company/tmp/4.flac

    Note: 3.0.0-RC1 is a release candidate. We will do a production release (3.0.1) next week.
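
With `interimResults(true)`, the callback is typically invoked several times: first with partial hypotheses (`"final": false`), then with a final result for each phrase. To compare the WebSocket output against the full cURL transcript, it can help to keep only the final segments and join them. Below is a minimal, SDK-independent sketch of that idea. `FinalTranscriptCollector` and its methods are hypothetical names, not part of the Watson Java SDK, and the strings in `main` are made-up inputs for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Collects only the final segments of a streamed transcription.
 * Hypothetical helper: it is NOT part of the Watson Java SDK.
 */
public class FinalTranscriptCollector {

  // Final segments in the order they were received.
  private final List<String> finalSegments = new ArrayList<>();

  /**
   * Call once per result received from the service.
   *
   * @param isFinal    value of the result's "final" field
   * @param transcript value of the best alternative's "transcript" field
   */
  public synchronized void onResult(boolean isFinal, String transcript) {
    // Interim hypotheses are skipped because a later final result supersedes them.
    if (isFinal && transcript != null) {
      finalSegments.add(transcript.trim());
    }
  }

  /** Returns all final segments received so far, joined with single spaces. */
  public synchronized String fullTranscript() {
    return String.join(" ", finalSegments);
  }

  public static void main(String[] args) {
    FinalTranscriptCollector collector = new FinalTranscriptCollector();
    // Made-up sequence of interim and final results, for illustration only.
    collector.onResult(false, "it was six ");
    collector.onResult(true, "it was six weeks ago today ");
    collector.onResult(false, "sharyl ");
    collector.onResult(true, "sharyl Attkisson has our story ");
    System.out.println(collector.fullTranscript());
  }
}
```

In a real callback you would read the `final` flag and the best transcript from each result in `SpeechResults` and pass them to `onResult`. The methods are synchronized on the assumption that results may arrive on a different thread than the one reading the transcript.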