Question

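A NodeJS app uses ffmpeg to create ogg files from mp3 and mp4 files. If the source file is broadband, Watson Speech to Text accepts the file without problems. If the source file is narrowband, Watson Speech to Text fails to read the ogg file. I have tested the output from ffmpeg, and the narrowband ogg file has the same audio content as the mp3 file (for example, I can listen to it and hear the same people). And yes, before the call I am changing the request to Watson so that it specifies the matching model and content_type. The code follows: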

exports.createTranscript = function(req, res, next)
{ var _name = getNameBase(req.body.movie);
  var _type = getType(req.body.movie);
  var _voice = (_type == "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel" ;
  var _contentType = (_type == "mp4") ? "audio/ogg" : "audio/basic" ;
  var _audio = process.cwd()+"/HTML/movies/"+_name+'ogg';
  var transcriptFile = process.cwd()+"/HTML/movies/"+_name+'json';

  speech_to_text.createSession({model: _voice}, function(error, session) {
    if (error) {console.log('error:', error);}
    else
      {
        var params = { content_type: _contentType, continuous: true,
         audio: fs.createReadStream(_audio),
          session_id: session.session_id
          };
          speech_to_text.recognize(params, function(error, transcript) {
            if (error) {console.log('error:', error);}
            else
              { fs.writeFile(transcriptFile, JSON.stringify(transcript), function(err) {if (err) {console.log(err);}});
                res.send(transcript);
              }
          });
      }
  });
}

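_type is either mp3 (narrowband, from a phone recording) or mp4 (broadband).
I have traced model: _voice to confirm it is set correctly.
I have traced content_type: _contentType to confirm it is set correctly.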

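Any ogg file submitted to Speech to Text with the narrowband settings fails with Error: No speech detected for 30s. I tested with two genuinely narrowband files, and also by asking Watson to read a broadband ogg file (created from an mp4) as narrowband. Same error message each time. What am I missing?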

Answer 1

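The Watson Speech to Text documentation is confusing on this point. It states that when using the narrowband model, content_type should be set to audio/basic. That is incorrect. In this example the inbound audio file is narrowband, but it is still an ogg file, so content_type should remain audio/ogg. Making that one change resolved the problem.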
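
A minimal sketch of that change, assuming the same _type values ("mp3" or "mp4") as in the question. The helper name recognizeSettings is hypothetical and not part of the Watson SDK; it only shows that the model follows the source's bandwidth while content_type follows the format of the file actually sent.

```javascript
// Hypothetical helper (not part of the Watson SDK): pick the settings
// for a recognize call from the original source file type.
function recognizeSettings(type) {
  return {
    // The model depends on the bandwidth of the source recording:
    // mp4 sources are broadband, mp3 phone recordings are narrowband.
    model: (type === "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel",
    // content_type describes the file that is uploaded. ffmpeg always
    // produces ogg here, so this does not vary with the model.
    content_type: "audio/ogg"
  };
}

var settings = recognizeSettings("mp3");
console.log(settings.model, settings.content_type);
```

In the question's code this corresponds to replacing the conditional on the _contentType line with the constant "audio/ogg" and leaving the _voice line unchanged.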

Watson NarrowBand Speech to Text not accepting ogg file