Question

While trying to find a solution to "How to use Web Speech API at chromium?", the call

var voices = window.speechSynthesis.getVoices();

returns an empty array of voices.
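
In browsers that populate the voice list asynchronously, `getVoices()` can return an empty array until the `voiceschanged` event fires. The following is a minimal sketch of waiting for that event, assuming a speech engine is installed at all; if none is, the array stays empty. The helper name `loadVoices` and the `timeoutMs` cut-off are assumptions made for illustration and are not part of the Web Speech API.

```javascript
// Resolve with the available voices, waiting for "voiceschanged" if the list starts out empty.
// `loadVoices` and `timeoutMs` are hypothetical names; only getVoices() and the
// "voiceschanged" event come from the Web Speech API.
function loadVoices(synth = globalThis.speechSynthesis, timeoutMs = 3000) {
  return new Promise(resolve => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    const finish = () => {
      clearTimeout(timer);
      synth.removeEventListener("voiceschanged", finish);
      // May still be an empty array if no speech engine is available
      resolve(synth.getVoices());
    };
    const timer = setTimeout(finish, timeoutMs);
    synth.addEventListener("voiceschanged", finish);
  });
}

// Browser usage; skipped where speechSynthesis does not exist
if (typeof speechSynthesis !== "undefined") {
  loadVoices().then(voices => console.log(voices.map(voice => voice.name)));
}
```

If the resolved array is still empty after the timeout, the workarounds in the answer remain necessary.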

Not sure whether the lack of support in the Chromium browser is related to the issue described in "Not OK, Google: Chromium voice extension pulled after spying concerns"?

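Questions:

1) Are there any workarounds that meet the requirement of creating audio from text, or converting text to audio, in the Chromium browser?

2) How can we (the developer community) create an open source database of audio files covering both common and uncommon words, served with appropriate CORS headers?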

Answer 1

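There are several possible workarounds that provide the ability to create audio from text; two of them require requesting external resources, the other uses meSpeak.js by @masswerk.

The first uses the approach described at "Download the Audio Pronunciation of Words from Google". Its drawback is that there is no way to determine in advance which words actually exist as files at the resource without writing a shell script or performing a HEAD request to check whether a network error occurs. For example, the word "do" is not available at the resource used below.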
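
Before requesting the audio, the HEAD request mentioned above could be used to filter out missing words. This is only a sketch: the helper names `soundUrl` and `findAvailableWords` are hypothetical, the `fetchImpl` parameter exists so the function can be exercised without a network, and it is an assumption that the resource answers cross-origin HEAD requests at all. If it does not, every word is reported as unavailable.

```javascript
const BASE_URL = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";

// Build the URL of the .mp3 file for a single word
function soundUrl(word) {
  return `${BASE_URL}${encodeURIComponent(word)}.mp3`;
}

// Issue one HEAD request per word and report which files exist.
// A network error (including a CORS failure) is treated as "not available".
async function findAvailableWords(words, fetchImpl = fetch) {
  return Promise.all(words.map(async word => {
    try {
      const response = await fetchImpl(soundUrl(word), { method: "HEAD" });
      return { word, available: response.ok };
    } catch (err) {
      return { word, available: false };
    }
  }));
}

// Example usage:
// findAvailableWords(["what", "it", "do"]).then(results => console.log(results));
```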

window.addEventListener("load", () => {
  const textarea = document.querySelector("textarea");
  const audio = document.createElement("audio");
  // The requested files are MP3, so label the combined Blob accordingly
  const mimecodec = "audio/mpeg";
  audio.controls = true;
  document.body.appendChild(audio);
  audio.addEventListener("canplay", e => {
    audio.play();
  });

  // match() returns null when the textarea contains no words
  const words = textarea.value.trim().match(/\w+/g) || [];
  const url = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";
  const mediatype = ".mp3";

  Promise.all(
    words.map(word =>
      // Request each file through the YQL data.uri table, then fetch the URL it returns as a Blob
      fetch(`https://query.yahooapis.com/v1/public/yql?q=select * from data.uri where url="${url}${word}${mediatype}"&format=json&callback=`)
      .then(response => response.json())
      .then(({query: {results: {url}}}) =>
        fetch(url).then(response => response.blob())
      )
    )
  )
  .then(blobs => {
    // Concatenate the per-word Blobs in order and play the result
    // const a = document.createElement("a");
    audio.src = URL.createObjectURL(new Blob(blobs, {
      type: mimecodec
    }));
    // a.download = words.join("-") + ".mp3";
    // a.click()
  })
  .catch(err => console.log(err));
});

<textarea>what it does my ninja?</textarea>

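Resources at Wikimedia Commons Category:Public domain are not necessarily served from the same directory; see "How to retrieve Wiktionary word content?" and "wikionary API - meaning of words".

If the precise location of the resource is known, the audio can be requested directly, though the URL may include prefixes other than the word itself.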

fetch("https://upload.wikimedia.org/wikipedia/commons/c/c5/En-uk-hello-1.ogg")
.then(response => response.blob())
.then(blob => new Audio(URL.createObjectURL(blob)).play());

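Not entirely sure how to use the Wikipedia API (see "How to get Wikipedia content using Wikipedia's API?" and "Is there a clean wikipedia API just for retrieve content summary?") to get only the audio file. The JSON response would need to be parsed for text ending in .ogg, and then a second request would need to be made for the resource itself.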

fetch("https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello")
.then(response => response.text())
.then(data => {
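  // The directory part of a Commons URL is hexadecimal (e.g. /c/c5/), so allow a-f as well as digits
  const match = data.match(/\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\da-f/]+[\w-]+\.ogg/);
  // match() returns null when the page has no .ogg file
  if (match) new Audio("https:" + match[0]).play();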
})
// "//upload.wikimedia.org/wikipedia/commons/5/52/En-us-hello.ogg\"

logs

Fetch API cannot load https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello. No 'Access-Control-Allow-Origin' header is present on the requested resource

when the request is not made from the same origin. We would need to try YQL again, but it is not clear how to formulate the query to avoid errors.
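
One possible way around the CORS error without YQL is the MediaWiki Action API's `origin=*` parameter, which asks the API to send CORS headers for anonymous requests. The sketch below assumes that parameter is acceptable for this use and that the parsed page HTML is found at `parse.text["*"]` in the default JSON format; the helper names `buildParseUrl`, `extractOggUrl`, and `findPronunciationUrl` are hypothetical.

```javascript
const API = "https://en.wiktionary.org/w/api.php";

// Build the parse request; origin=* requests CORS headers for anonymous access
function buildParseUrl(page) {
  const params = new URLSearchParams({
    action: "parse",
    format: "json",
    prop: "text",
    page,
    origin: "*"
  });
  return `${API}?${params}`;
}

// Return the first Commons .ogg URL found in the HTML, or null if there is none
function extractOggUrl(html) {
  const match = html.match(
    /\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\da-f]\/[\da-f]{2}\/[^"'\s<>]+?\.ogg/
  );
  return match ? `https:${match[0]}` : null;
}

// First request: parsed page text; the caller makes the second request for the audio
async function findPronunciationUrl(page, fetchImpl = fetch) {
  const response = await fetchImpl(buildParseUrl(page));
  const data = await response.json();
  const html = data.parse && data.parse.text && data.parse.text["*"];
  return html ? extractOggUrl(html) : null;
}

// Browser usage:
// findPronunciationUrl("hello").then(url => url && new Audio(url).play());
```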

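The third approach uses a slightly modified version of meSpeak.js to generate the audio without making an external request. The modification was to create a proper callback for the .loadConfig() method.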

fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);

    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve)
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve)
      })
    ])
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7"
    });
  })
  .catch(err => console.log(err));

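One caveat of the above approach is that it takes about fourteen and a half seconds for the three files to load before the audio is played. It does, however, avoid external requests.

Either or both of the following would be a positive outcome: 1) creating a FOSS, developer-maintained database or directory of sounds for both common and uncommon words; 2) further developing meSpeak.js to reduce the load time of the three necessary files, and using Promise-based approaches to provide notifications of file loading progress and application readiness.

In this user's estimation, it would be a useful resource if developers themselves created and contributed to an online database of files that responds with an audio file for a specific word. Not entirely sure whether github is the appropriate venue to host audio files? The possible options will have to be considered if there is interest in such a project.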
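
The Promise-based progress notification suggested in point 2 could look roughly like the sketch below. The helper `trackProgress` and the shape of its `onProgress` argument are assumptions made for illustration; nothing here is part of meSpeak.js.

```javascript
// Wrap named promises so a callback fires each time one of them settles successfully.
// `trackProgress` is a hypothetical helper, not a meSpeak.js API.
function trackProgress(tasks, onProgress) {
  const total = tasks.length;
  let done = 0;
  return Promise.all(tasks.map(({ name, promise }) =>
    promise.then(value => {
      done += 1;
      onProgress({ name, done, total });
      return value;
    })
  ));
}

// Possible usage with the callback-style loaders from the answer
// (the file names are placeholders):
// trackProgress([
//   { name: "config", promise: new Promise(resolve => meSpeak.loadConfig("mespeak_config.json", resolve)) },
//   { name: "voice", promise: new Promise(resolve => meSpeak.loadVoice("en.json", resolve)) }
// ], ({ name, done, total }) => console.log(`${name} loaded (${done}/${total})`))
//   .then(() => meSpeak.speak("ready"));
```

The returned promise resolves once every file has loaded, which gives the application a single readiness signal alongside the per-file notifications.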

How to create audio from text, or convert text to audio, in the Chromium browser?