Auto-Tune

Question

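Using the built-in W3C SpeechSynthesis API, is it possible to create a modulated voice like that of GLaDOS with JavaScript?

The code I currently have (below) split()s the phrase to be synthesized into single words, each of which is uttered at a randomly determined pitch, to try to create the required effect.
The break between the spoken words is too long, and the modulation is too drawn out.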

if ('speechSynthesis' in window) {
    var speechSynthesis = window.speechSynthesis;
    speechSynthesis.onvoiceschanged = function () {
        console.log(speechSynthesis)
        var phrase = "Hello, I am GLaDOS";
        var parts = phrase.split(" ");
        for(var i in parts){
            var word = parts[i];
            var text = new SpeechSynthesisUtterance(word);
            text.voice = speechSynthesis.getVoices[2]; //English Female Voice
            text.rate=1.2;
            text.pitch = Math.random()*.5+1.50;
            speechSynthesis.speak(text);
        }
    }
}

Note: GLaDOS is "a fictional artificially intelligent computer system from the video game series Portal."

Answer 1

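This answer is predominantly opinion, offered in good faith to be helpful.

Unfortunately, I believe the effect of GLaDOS's voice has to be achieved through what Hollywood calls "post-production", i.e. it has to be an after-effect; the speech needs to be processed after it has been uttered.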

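Adjusting the pitch and/or rate will only ever affect the output of the utterance as a whole, and the result appears to be rendered as soon as SpeechSynthesis.speak() is called.
Your idea of splitting the utterance into words and outputting each word with a random pitch and rate is clever, and is probably as close as we can get without post-processing.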
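The word-by-word approach can be made easier to experiment with by separating the planning (which word gets which pitch) from the speaking. The following is only a minimal sketch under assumptions: buildWordPlan and speakPlan are hypothetical helper names, and the default pitch range of 1.5 to 2 and rate of 1.2 are taken from the question's code. It does not remove the pauses between words.

```javascript
// Hypothetical helper: split a phrase into words and give each one a pitch
// in [minPitch, maxPitch]. `random` is injectable so a plan can be reproduced;
// it defaults to Math.random.
function buildWordPlan(
  phrase,
  { minPitch = 1.5, maxPitch = 2, rate = 1.2, random = Math.random } = {}
) {
  return phrase
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) => ({
      text: word,
      rate,
      pitch: minPitch + random() * (maxPitch - minPitch),
    }));
}

// Hypothetical helper (browser only): queue one utterance per planned word.
// `synth` is expected to be window.speechSynthesis; `voice` is optional.
function speakPlan(plan, synth, voice) {
  for (const item of plan) {
    const utterance = new SpeechSynthesisUtterance(item.text);
    if (voice) {
      utterance.voice = voice;
    }
    utterance.rate = item.rate;
    utterance.pitch = item.pitch;
    synth.speak(utterance);
  }
}
```

In a browser this might be called as speakPlan(buildWordPlan("Hello, I am GLaDOS"), window.speechSynthesis), ideally from a click handler.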

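As an aside: I don't know whether we can apply post-processing to an utterance at all, and I suspect that if it is possible, it would require some form of browser extension/plugin.

That said, if the pitch and rate are worked carefully, several voices (whose availability, I strongly suspect, is subject to change in relatively unpredictable ways) can be coerced into sounding Auto-Tuned with just the right settings (which depend on the voice).

Having said that, I can't offer anything beyond the standard interface, but I did notice a couple of problems with the code sample you posted that should be addressed if you intend to experiment.

In particular, your code: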

text.voice = speechSynthesis.getVoices[2];

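will not select the third available voice. If you try changing the index value, you'll notice that the voice doesn't change, because window.speechSynthesis.getVoices is a reference to the function itself, not the list the function returns, so indexing it yields undefined.
You could do this: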

text.voice = speechSynthesis.getVoices()[2];

but that's a bit presumptuous.

SpeechSynthesis.getVoices()

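...returns a list of SpeechSynthesisVoice objects representing all the available voices on the current device.

And (in Chrome and Edge†) SpeechSynthesis.onvoiceschanged

...will run when the list of SpeechSynthesisVoice objects that would be returned by the SpeechSynthesis.getVoices() method has changed (when the voiceschanged event fires).

This may occur when speech synthesis is done on the server side and the voice list is determined asynchronously, or when client-side voices are installed/uninstalled while a speech synthesis application is running.

So the trigger can† be used to initiate the compilation of the voice list, which can then be accessed: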

var voices = speechSynthesis.getVoices(); // call it on the window.speechSynthesis instance
utterance.voice = voices[ 2 ];

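† Note that Firefox doesn't currently support it, and will simply return a list of voices when SpeechSynthesis.getVoices() is called. With Chrome, however, you have to wait for the event to fire before populating the list...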
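The two browser behaviours described above (list available immediately, or only after voiceschanged) can be handled by one small helper, and a voice can be chosen by name or language instead of a fixed index. This is just a sketch under assumptions: whenVoicesReady and pickVoice are hypothetical names, not part of the API, and "Google US English" is simply the voice name used elsewhere in this answer. If a browser returns an empty list and never fires the event, the callback is never called.

```javascript
// Hypothetical helper: call onReady(voices) once a non-empty voice list exists.
// Works whether the list is available immediately or only after "voiceschanged".
function whenVoicesReady(synth, onReady) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    onReady(voices);
    return;
  }
  const handler = () => {
    const loaded = synth.getVoices();
    if (loaded.length === 0) {
      return; // still empty, keep waiting for the next event
    }
    synth.removeEventListener("voiceschanged", handler);
    onReady(loaded);
  };
  synth.addEventListener("voiceschanged", handler);
}

// Hypothetical helper: prefer an exact name match, then a language match,
// then the first voice, instead of assuming a fixed index such as 2.
function pickVoice(voices, { name, lang } = {}) {
  return (
    voices.find((v) => name && v.name === name) ||
    voices.find((v) => lang && v.lang === lang) ||
    voices[0] ||
    null
  );
}

// Browser-only usage; skipped where speechSynthesis is unavailable.
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  whenVoicesReady(window.speechSynthesis, (voices) => {
    const voice = pickVoice(voices, { name: "Google US English", lang: "en-US" });
    console.log("Selected voice:", voice ? voice.name : "(none)");
  });
}
```

Selecting by name or language is an assumption about what is wanted here; it avoids relying on the order of the list, which can differ between browsers and devices.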

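"Welcome to the Aperture Science Computer-Aided Enrichment Center"

Although the code below won't deliver the magic of an Auto-Tuned GLaDOS voice, it provides an easier way to experiment with the possibilities, and demonstrates the correct way to access the available voices.

Update inserted (6 January 2019):

At the time the code below was written, Google's default voice was female, so the effect was better than what you will hear now. These APIs and their related resources and services are subject to change.

Also note that, in Chrome since v70 (other browsers are likely to follow), SpeechSynthesis.speak() will "immediately fire an error if the document hasn't received a user activation".

End of update.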
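One way to stay within the user-activation requirement is to call speak() only from an event handler such as a click. The following is a minimal sketch, assuming a hypothetical helper named speakOnActivation; the [name=play] selector matches the button in the form further down, and any other clickable element would do.

```javascript
// Hypothetical helper: only ever call speak() from inside a click handler,
// so the document has received a user activation by the time speech starts.
function speakOnActivation(target, synth, createUtterance) {
  target.addEventListener("click", () => {
    if (synth.speaking) {
      synth.cancel(); // stop anything still playing before starting again
    }
    synth.speak(createUtterance());
  });
}

// Browser-only usage; skipped where the DOM or speechSynthesis is unavailable.
if (
  typeof document !== "undefined" &&
  typeof window !== "undefined" &&
  "speechSynthesis" in window
) {
  const button = document.querySelector("[name=play]");
  if (button) {
    speakOnActivation(
      button,
      window.speechSynthesis,
      () => new SpeechSynthesisUtterance("Hello, I am GLaDOS")
    );
  }
}
```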

if ( window.hasOwnProperty( "speechSynthesis" ) ) {
  var opt, voices, utterance;
  const speechSynth = window.speechSynthesis,
        form = document.querySelector( "form" ),
        playSample = () => {
          if ( speechSynth.speaking ) {
            speechSynth.cancel();
            // doesn't work as expected with default voice on Chrome on Windows
          }
          utterance = new SpeechSynthesisUtterance( form.sample.value );
          utterance.voice = voices[ form.voice.selectedIndex ];
          utterance.volume = form.volume.valueAsNumber * 0.01;
          utterance.pitch = form.pitch.valueAsNumber * 0.01;
          utterance.rate = form.rate.valueAsNumber * 0.01;
          speechSynth.speak( utterance );
        },
        init = () => {
          if ( !voices ) { // fixes triple trigger weirdness
            voices = speechSynth.getVoices();
            voices.forEach( ( v ) => {
              opt = document.createElement( "option" );
              opt.textContent = v.name;
              if ( v.name === "Google US English" ) {
                opt.selected = true;
                form.rate.value = 65;
              }
              form.voice.appendChild( opt );
            } );
            form.addEventListener( "input", playSample, false );
            form.play.addEventListener( "click", playSample, false );
            // Chrome 70+ may reject this initial call until the page has received a user activation
            playSample();
          }
        };
  if ( speechSynth.onvoiceschanged !== undefined ) {
    // Only Chrome and Edge at time of posting
    speechSynth.onvoiceschanged = init;
  } else {
    init();
  }
}

div {
  height: 50vh;
  float: left;
  margin-right: 1rem;
}
label {
  display: block;
}
input {
  vertical-align: middle;
}
[name=sample] {
  width: 90vw;
  margin-bottom: 1rem;
}
[name=play] {
  margin-top: 1rem;
}

<form>
  <input name="sample" type="text" value="Hello. And again, welcome to the aperture science computer aided enrichment center. We hope your brief detention in the relaxation vault has been a pleasant one.">
  <div><select name="voice"></select></div>
  <label>Volume: <input name="volume" type="range" min="0" max="100" value="50"></label>
  <label>Pitch: <input name="pitch" type="range" min="0" max="200" value="100"></label>
  <label>Rate: <input name="rate" type="range" min="1" max="1000" value="100"></label>
  <input name="play" type="button" value="Play it again Sam">
</form>

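P.S. Although it functioned perfectly while I was creating/editing the snippet, after posting I found that the results were poor. I suspect this is a browser issue (experimental technology), since no errors were logged and the code hadn't changed.

On further review, I noticed that the acceptable ranges of rate and pitch vary by voice. For example, "Google US English" won't play at a rate higher than 2, whereas "Microsoft Anna - English (United States)" will play at a rate of 10.

Both the Google developer docs and MDN agree that the range for rate should be from 0.1 to 10, but MDN states:

A float representing the rate value. It can range between 0.1 (lowest) and 10 (highest), with 1 being the default pitch for the current platform or voice, which should correspond with a normal speaking rate. ...

It then goes on to state:

Some speech synthesis engines or voices may constrain the minimum and maximum rates further. ...

There appear to be many small niggles with this fairly new API, which will need case-by-case solutions until it is fully standardized.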
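One case-by-case workaround is to clamp the requested values before assigning them to an utterance. The sketch below rests on assumptions: clampSettings and the VOICE_MAX_RATE table are hypothetical, and the single entry in that table is only the limit observed above for "Google US English", not a documented value. The general ranges are the ones the API documents: rate 0.1 to 10, pitch 0 to 2, volume 0 to 1.

```javascript
// Documented ranges for SpeechSynthesisUtterance properties.
const SPEC_RANGES = { rate: [0.1, 10], pitch: [0, 2], volume: [0, 1] };

// Assumption: per-voice limits observed in this answer, not documented anywhere.
const VOICE_MAX_RATE = { "Google US English": 2 };

function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: return settings that respect both the documented
// ranges and any tighter per-voice rate limit listed in VOICE_MAX_RATE.
function clampSettings(voiceName, { rate = 1, pitch = 1, volume = 1 } = {}) {
  const [minRate, specMaxRate] = SPEC_RANGES.rate;
  const voiceMax = VOICE_MAX_RATE[voiceName];
  const maxRate =
    voiceMax === undefined ? specMaxRate : Math.min(specMaxRate, voiceMax);
  return {
    rate: clamp(rate, minRate, maxRate),
    pitch: clamp(pitch, SPEC_RANGES.pitch[0], SPEC_RANGES.pitch[1]),
    volume: clamp(volume, SPEC_RANGES.volume[0], SPEC_RANGES.volume[1]),
  };
}
```

For example, clampSettings("Google US English", { rate: 5 }) yields a rate of 2, while a voice without an entry in the table is limited only by the documented range.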

Can we create a modulated GLaDOS voice using JS SpeechSynthesis?
