Question

I'm trying to align a simple text. Here are links to the text and audio files:
http://s000.tinyupload.com/?file_id=48044768133759453374
http://s000.tinyupload.com/?file_id=99891199139563396901

Here are the configuration settings:

private static final String ACOUSTIC_MODEL_PATH =
        "resource:/edu/cmu/sphinx/models/en-us/en-us";
private static final String DICTIONARY_PATH =
        "resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict";

The output I get is the following (ellipses added by me):

- ï
- ¿in
  a                         [11250:11330]
  standard                  [11330:11920]
  shopping                  [11920:12440]
  centre                    [12440:13020]
- you
  can                       [13380:13730]
  ...
  shops                     [15170:15790]
- you
  can                       [16620:16890]
  buy                       [16890:17140]
  ...
  and                       [26920:27230]
  suits                     [27190:27220]
- thereâ€™s
  a                         [29160:29210]
  sportswear                [29210:29980]
  ...
  clothes                   [33330:33360]
- t-shirts
  shorts                    [35560:36320]
  jumpers                   [36630:37410]
  ...
  for                       [41860:42010]

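As you can see:

the first `in` is not recognized
`you` gets no timing
`there's` is not recognized; it comes out as `thereâ€™s` instead
words with dashes, such as `t-shirts`, get no timing

Is there any way to configure sphinx to provide timings for these occurrences?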

Answer 1

A few comments.

Regarding "the first `in` is not recognized":

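Your text file starts with a BOM (byte order mark), which the aligner does not know how to handle. It is best to remove it before alignment.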
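A minimal sketch of removing the BOM, assuming the transcript is read as UTF-8 before being passed to the aligner. The class and method names (`TranscriptBom`, `stripBom`, `decodeTranscript`) are hypothetical helpers, not part of sphinx-4.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of sphinx-4.
final class TranscriptBom {
    // U+FEFF is the byte order mark; encoded as UTF-8 it is the bytes EF BB BF.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM, or unchanged if there is none. */
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    /** Decodes UTF-8 bytes (for example the transcript file contents) and drops a leading BOM. */
    static String decodeTranscript(byte[] utf8Bytes) {
        // Java's UTF-8 decoder keeps the BOM as U+FEFF, so it has to be removed explicitly.
        return stripBom(new String(utf8Bytes, StandardCharsets.UTF_8));
    }
}
```

The bytes could come from something like `Files.readAllBytes(path)`; the resulting string is what would then be handed to the aligner as the transcript.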

Regarding "`there's` comes out as `thereâ€™s`":

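Your text uses a UTF-8 (typographic) apostrophe, which the aligner does not know. You should convert such characters to their ASCII equivalents.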
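A small sketch of that conversion, assuming the transcript is already available as a Java string. The character behind `â€™` is U+2019 (right single quotation mark); U+2018 is handled too as a precaution. `TranscriptQuotes` is a hypothetical helper name, not a sphinx-4 class.

```java
// Hypothetical helper, not part of sphinx-4.
final class TranscriptQuotes {
    /** Replaces typographic single quotes with the plain ASCII apostrophe. */
    static String toAsciiApostrophes(String text) {
        return text
                .replace('\u2019', '\'')   // right single quotation mark
                .replace('\u2018', '\'');  // left single quotation mark
    }
}
```

After this step a word like `there’s` becomes `there's`, which can then be matched against an ASCII dictionary entry if one exists.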

Regarding "no timing for words with dashes, such as `t-shirts`":

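These words are missing from the dictionary. You can add them to the dictionary before alignment, or specify a g2p model to convert them to phones.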
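To find out which words need to be added, one option is to compare the transcript against the dictionary up front. The sketch below assumes the dictionary is a plain-text file with one entry per line in the form `word PH ON ES`, with alternate pronunciations written as `word(2) ...`. `DictionaryCheck` and its methods are hypothetical names, and the tokenization is deliberately simple and may differ from what the aligner does internally.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of sphinx-4.
final class DictionaryCheck {
    /** Collects headwords from lines such as "word PH ON ES" or "word(2) PH ON ES". */
    static Set<String> headwords(List<String> dictionaryLines) {
        Set<String> words = new HashSet<>();
        for (String line : dictionaryLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String head = trimmed.split("\\s+", 2)[0];
            int paren = head.indexOf('(');
            if (paren > 0) {
                head = head.substring(0, paren); // drop the "(2)" variant marker
            }
            words.add(head.toLowerCase(Locale.ROOT));
        }
        return words;
    }

    /** Returns transcript words that have no dictionary entry, in order of first appearance. */
    static List<String> missingWords(String transcript, Set<String> known) {
        Set<String> missing = new LinkedHashSet<>();
        for (String token : transcript.toLowerCase(Locale.ROOT).split("\\s+")) {
            // Trim punctuation at the edges but keep inner apostrophes and dashes.
            String word = token.replaceAll("^[^a-z0-9']+|[^a-z0-9']+$", "");
            if (!word.isEmpty() && !known.contains(word)) {
                missing.add(word);
            }
        }
        return new ArrayList<>(missing);
    }
}
```

Each word reported this way would need either a hand-written dictionary line or a g2p model to supply its pronunciation.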

sphinx-4 aligner skips simple words like `you`, `in` and words with dashes - why?
