Tensorflow Android语音识别示例

时间:2018-03-14 03:58:23

标签: android tensorflow speech-recognition

我正在研究tensorflow语音命令示例。 我使用的Android代码库在tensorflow GitHub android sample上是相同的,主要关注SpeechActivity.javaRecognizeCommands.java。除了记录消息之外,我没有更改任何内容。

据我所知,

(1)SpeechActivity.java会将模型引用结果(outputScores)和currentTime传递给recognizeCommands.processLatestResults以做后顺畅。

// Run the model.
inferenceInterface.feed(SAMPLE_RATE_NAME, sampleRateList);
inferenceInterface.feed(INPUT_DATA_NAME, floatInputBuffer, RECORDING_LENGTH, 1);
inferenceInterface.run(outputScoresNames);
inferenceInterface.fetch(OUTPUT_SCORES_NAME, outputScores);

// Use the smoother to figure out if we've had a real recognition event.
long currentTime = System.currentTimeMillis();
final RecognizeCommands.RecognitionResult result =
recognizeCommands.processLatestResults(outputScores, currentTime);

(2)在ProcessLatestResults()中,previousResults用于存储最近500毫秒(averageWindowDurationMs == 500)推断的输出分数,averageScores将是最终分数我们希望/将来使用。

// Add the latest results to the head of the queue.
previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

// Prune any earlier results that are too old for the averaging window.
final long timeLimit = currentTimeMS - averageWindowDurationMs;
while (previousResults.getFirst().first < timeLimit) {
  previousResults.removeFirst();
}

...

// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}

我的问题/问题是

(1)当for循环计算平均值时,从每个项目读取的previousResult.second值是相同的。但是,这是不可能的。我的问题是我是否遗漏了记录信息的内容,以便打印出错误的previousResult.second值?或那些得分阵列真的相同? 这是我记录的方式:

Log.d("tmp", "start average");
// Calculate the average score across all the results in the window.
float[] averageScores = new float[labelsCount];
for (Pair<Long, float[]> previousResult : previousResults) {
  final float[] scoresTensor = previousResult.second;
  Log.d("tmp", "previousResult("+previousResult.first+"): ["+Arrays.toString(previousResult.second)+"]" );
  int i = 0;
  while (i < scoresTensor.length) {
    averageScores[i] += scoresTensor[i] / howManyResults;
    ++i;
  }
}

这是两次平均循环过程中的日志消息。第一次,previousResults中的得分数组是相同的[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225],这是不可能的。

start avarage
previousResult(1520998400247): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400301): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400354): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400408): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400466): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400520): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400574): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400629): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400683): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
previousResult(1520998400737): [[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225]]
....
....
start average
previousResult(1520998400301): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400354): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400408): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400466): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400520): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400574): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400629): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400683): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400737): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
previousResult(1520998400791): [[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786]]
...
...

(2)根据消息日志,您可以看到第一次与1520998400301对应的分数数组为[0.16993265, 0.15456167, 0.027866788, 0.107177936, 0.12646474, 0.053816866, 0.082612425, 0.059116375, 0.038425073, 0.06992877, 0.033074524, 0.07702225],但在下一次,分数数组变为[0.14775836, 0.18298364, 0.026629224, 0.12195902, 0.111195154, 0.058891248, 0.07295453, 0.05453651, 0.04063993, 0.06559348, 0.032576166, 0.084282786] }

我的第二个问题是我不知道这怎么可能发生。我的代码与RecognizeCommands.java相同。任何线索或建议都会非常有帮助,谢谢。

1 个答案:

答案 0 :(得分:0)

尝试更改此代码:

  // Add the latest results to the head of the queue.
    previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, currentResults));

如下:

  // Add the latest results to the head of the queue.
    previousResults.addLast(new Pair<Long, float[]>(currentTimeMS, 
                            Arrays.copyOf(currentResults, currentResults.length)));