Question

Some background: while trying to build a unit selection voice, I followed the steps here: https://github.com/CSTR-Edinburgh/CSTR-Edinburgh.github.io/blob/master/_posts/2016-8-21-Multisyn_unit_selection.md and used the voice definition here: https://raw.githubusercontent.com/CSTR-Edinburgh/merlin/master/egs/hybrid_synthesis/s1/voice_definition_files/unit_selection/cstr_us_awb_arctic_multisyn.scm. Unfortunately the wavs were too noisy, so I ended up labelling them by hand and skipping the automatic labelling process.

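The voice sounds OK now, but it still needs some work. One error that keeps popping up is Festival reporting "Missing diphone" for any pause-to-phone transition, e.g.: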

festival> (utt.relation.print (SayText "I can say anything I want.") 'Unit)
Missing diphone: #_ay
 diphone still missing, backing off: #_ay
 backed off: #_ay -> #_ax
 diphone still missing, backing off: #_ax
 backed off: #_ay -> #_#
 diphone still missing, backing off: #_#
 backed off: #_ay ->
Missing diphone: ey_eh
 Interword so inserting silence.
 diphone still missing, backing off: ey_#
 backed off: ey_eh -> ax_#
 diphone still missing, backing off: ax_#
 backed off: ey_eh -> #_#
 diphone still missing, backing off: #_#
 backed off: ey_eh ->
Missing diphone: #_eh
 diphone still missing, backing off: #_eh
 backed off: #_eh -> #_ax
 diphone still missing, backing off: #_ax
 backed off: #_eh -> #_#
 diphone still missing, backing off: #_#
 backed off: #_eh ->
Missing diphone: t_#
 diphone still missing, backing off: t_#
 backed off: t_# -> #_#
 diphone still missing, backing off: #_#
 backed off: t_# ->

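I tried replacing pau and h# (left over from the automatic process) in the labels with sil and sp (to correspond with the silences used in festival/lib/radio_phones.scm), and I also tried replacing them with #, but neither changed anything. The source wavs/labs definitely contain the transitions above (e.g. several utterances start with "I can"), but Festival never seems to use them.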

How can I get Festival to use the pause-to-phone transitions in the source data?

Thanks!

Answer 1

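When I ran the Multisyn unit selection scripts, the build_utts step failed and skipped utterances because the hand-made labels did not exactly match what Festival predicted. For example, if the speaker said "extreme" as eh k s ... but Festival predicted ih k s ..., the build_utts script would fail with an error like this: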

align missmatch at ih (0.000000) eh (2.810566)
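
A mismatch like this can be located ahead of time by diffing the two phone sequences. The snippet below is only a minimal sketch, not how build_utts itself aligns anything: `phone_mismatches` is a hypothetical helper, and it assumes you have already extracted Festival's predicted phones and your hand labels as two Python lists.

```python
from difflib import SequenceMatcher


def phone_mismatches(predicted, labelled):
    """Return (tag, predicted_slice, labelled_slice) for each differing region.

    `predicted` is the phone list Festival produces for the text and
    `labelled` is the phone list from the hand-made label file. Both are
    assumed inputs; this does not call Festival.
    """
    matcher = SequenceMatcher(a=predicted, b=labelled, autojunk=False)
    return [
        (tag, predicted[i1:i2], labelled[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Only the leading phones quoted in the error discussion above.
predicted = ["ih", "k", "s"]
labelled = ["eh", "k", "s"]
for tag, pred, lab in phone_mismatches(predicted, labelled):
    print(f"{tag}: Festival predicted {pred}, label file has {lab}")
```

Each reported region shows which label would need to change (or which pronunciation would need a lexicon fix) before the two sequences line up.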

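I ran the build_utts script manually for each utterance and adjusted the labels accordingly. If, like me, you are foolish enough to try labelling by hand, here are some tips that helped me: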

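Consider removing any phone closures such as t_cl or d_cl, because they can break the matching step.
Make sure there is a pause (i.e. #) at the start and end of every utterance. The build_utts script will not complain if one is missing, but when you run the voice in Festival you will get errors like this: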
```
        -=-=-=-=-=- EST Error -=-=-=-=-=-
        {FND} Feature end not defined

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
```
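
Both tips can be checked automatically before building the voice. The following is a rough sketch under stated assumptions: the label files are taken to be xlabel-style text (an optional header ending in a line that holds only `#`, then one `end_time colour label` line per segment), and `read_labels` and `check_labels` are hypothetical helpers, not part of Festival or the Multisyn scripts. The example data is made up.

```python
def read_labels(text):
    """Parse xlabel-style text into a list of phone labels.

    Assumed layout: optional header terminated by a line holding only "#",
    then one "end_time colour label" line per segment.
    """
    lines = text.splitlines()
    # Drop the header if a lone "#" terminator line is present.
    for index, line in enumerate(lines):
        if line.strip() == "#":
            lines = lines[index + 1:]
            break
    labels = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            labels.append(fields[2])
    return labels


def check_labels(labels, pause="#", closure_suffix="_cl"):
    """Return a list of human-readable problems found in one utterance."""
    problems = []
    if not labels or labels[0] != pause:
        problems.append("utterance does not start with a pause")
    if not labels or labels[-1] != pause:
        problems.append("utterance does not end with a pause")
    closures = [label for label in labels if label.endswith(closure_suffix)]
    if closures:
        problems.append("closure labels present: " + ", ".join(closures))
    return problems


# Made-up label file: starts with a pause, has a closure, lacks a final pause.
example = """separator ;
nfields 1
#
0.2500 26 #
0.4100 26 ay
0.4600 26 t_cl
0.5200 26 t
"""
print(check_labels(read_labels(example)))
```

Running this over every label file gives a list of utterances to fix by hand before build_utts is run.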

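Thanks to @NikolayShmyrev for pointing me in the right direction. He also suggested using Ossian, which is written in Python, instead of Festival, whose code is rather difficult to work with.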

Festival unit selection voice missing diphones: #hash
