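Chrome extension: accessing the microphone for speech recognition

Time: 2019-06-18 16:06:12

Tags: javascript google-chrome-extension speech-recognition

Before reading this: it may be related to "How can a Chrome extension get a user's permission to use user's computer's microphone?". In case it helps, I have added an answer below, including code and the manifest.

I am writing a minimal Chrome extension (using Chrome 75.0.3770.90 on MacOS 10.14.5) to implement a "Listen" button for my accessibility project. I have written an HTML version with JavaScript in which the microphone works.

However, when I lift that code into the extension's background.js file, text-to-speech works but speech-to-text does not. The code runs, but the flashing microphone never appears in the tab.

The code that works is: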


    <!DOCTYPE html>
    <html>
        <body>
        <h2>All-in-one JavaScript Example</h2>
        <button onclick="myCode();">Listen</button>
        <script>
            window.SpeechRecognition = window.webkitSpeechRecognition
                || window.SpeechRecognition;

            function myCode() {
                var recognition = new SpeechRecognition();
                // Attach the handler before starting so no result is missed.
                recognition.onresult = function(event) {
                    if (event.results[0].isFinal) {
                        var response = event.results[0][0].transcript;
                        var synth = window.speechSynthesis;
                        synth.speak( new SpeechSynthesisUtterance(
                            "i don't understand, " + response
                        ));
                    }
                };
                recognition.start();
                // start() returns immediately, so this alert shows before any result.
                alert( "all-in-one: we're done!" );
            }
        </script>
        </body>
    </html>

A minimal reproducible example:


    {
        "name": "myName",
        "description": "Press to talk",
        "version": "0.97",
        "manifest_version": 2,
        "background": {
            "scripts": ["background.js"],
            "persistent": false
        },
        "permissions": ["contentSettings","desktopCapture","*://*/*","tabCapture","tabs","tts","ttsEngine"],
        "browser_action": {
            "default_icon": "images/myIcon512.png",
            "default_title": "Press Ctrl(Win)/Command(Mac)+Shift+ Down to speak"
        },
        "commands": {
            "myCommand": {
                "suggested_key": {
                    "default": "Ctrl+Shift+Down",
                    "mac": "Command+Shift+Down"
                },
                "description": "Start speaking"
            }
        },
        "icons": {
            "512": "images/myIcon512.png"
        }
    }

My background JavaScript is:

    window.SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition;

    function myCode() {
        var recognition = new SpeechRecognition();
        recognition.onresult = function(event) {
            if (event.results[0].isFinal) {
                var synth = window.speechSynthesis;
                synth.speak( new SpeechSynthesisUtterance(
                        "sorry, I don't understand."
                    )
                );
            }   
        }
        recognition.start();
        alert( "extension: we're done!" );
    }
    chrome.commands.onCommand.addListener(function(command) {
        if (command === 'myCommand')
            myCode();
    });

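I have also noticed that the extension code only runs once. In the HTML version I can keep clicking the "Listen" button, but the extension command only runs once (an alert placed at the start of the function appears only the first time).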

My browser's default setting is to ask, which it does (once) for the HTML version.
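
One way to narrow down why the background page stays silent is to log the recognition object's lifecycle events. The sketch below is only a diagnostic aid, not a fix. The names `startWithDiagnostics`, `RecognitionCtor`, and `log` are made up for this illustration; `onstart`, `onerror`, `onend`, and `onresult` are the standard Web Speech API handlers. An `error` value such as `not-allowed` would suggest a permission problem rather than a coding one.

```javascript
// Diagnostic sketch: log every lifecycle event of one recognition session.
// RecognitionCtor and log are passed in (hypothetical parameter names) so
// the helper can also be tried with a stub instead of a real microphone.
function startWithDiagnostics(RecognitionCtor, log) {
    var recognition = new RecognitionCtor();

    // Attach all handlers before start() so no early event is missed.
    recognition.onstart = function () { log('start'); };
    recognition.onerror = function (event) { log('error: ' + event.error); };
    recognition.onend = function () { log('end'); };
    recognition.onresult = function (event) {
        log('result: ' + event.results[0][0].transcript);
    };

    recognition.start();
    return recognition;
}
```

In background.js this could be called from `myCode()` as `startWithDiagnostics(window.SpeechRecognition, console.log.bind(console))`, with the output read in the background page's console.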

Thanks for reading! I have posted an answer with code below.

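1 answer:

Answer 0 (score: 0)

The problem I had is that the microphone appears to be a background task, whereas I am trying to interact with tab content. I don't think this is typical, and I have ended up with a generic "matches" value (*://*/*) in my manifest (shown in full):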

    {
        "name": "Enguage(TM) - Let's all talk to the Web",
        "short_name": "Enguage",
        "description": "A vocal Web interface",
        "version": "0.98",
        "manifest_version": 2,
        "content_security_policy": "script-src 'self'; object-src 'self'",
        "background": {
            "scripts": ["kbTx.js"],
            "persistent": false
        },
        "content_scripts": [
            {
                "matches": ["*://*/*"],
                "js": ["tabRx.js", "interp.js"]
            }
        ],
        "permissions": [
            "activeTab",
            "contentSettings",
            "desktopCapture",
            "tabCapture",
            "tabs",
            "tts"
        ],
        "browser_action": {
            "default_icon": "images/lbolt512.png",
            "default_title": "Press Ctrl(Win)/Command(Mac)+Shift+ Space and speak"
        },
        "commands": {
            "enguage": {
                "suggested_key": {
                    "default": "Ctrl+Shift+Space",
                    "mac": "Command+Shift+Space"
                },
                "description": "single utterance"
            }
        },
        "icons": {
            "16": "images/lbolt16.png",
            "48": "images/lbolt48.png",
            "128": "images/lbolt128.png",
            "512": "images/lbolt512.png"
        }
    }

I think Google might not like this! Anyway, I have put a keyboard listener into the background code (kbTx.js):


    chrome.commands.onCommand.addListener(function(command) {
        if (command === 'enguage') {
            // find the active tab...
            chrome.tabs.query({active: true, currentWindow: true}, function(tabs) {
                // send it a message...
                chrome.tabs.sendMessage(
                    tabs[0].id, // index not always 0?
                    null,       // message sent - none required?
                    null        // response callback - none expected!
                    //function(response) {console.log("done" /*response.farewell*/);}
                );
            });
        }
    });
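
The two questions left in the comments above ("index not always 0?" and "none required?") can be handled defensively. The following is a minimal sketch under some assumptions: `pickTabId` and `notifyActiveTab` are hypothetical helper names, the `{ type: 'listen' }` message shape is invented for illustration, and `chromeApi` stands in for the global `chrome` object so the logic can be tried with a stub. A query for the active tab of the current window should normally match at most one tab, so index 0 is the right one when the list is not empty.

```javascript
// Returns the id of the first tab in the query result, or null when the
// result is empty or the tab has no numeric id.
function pickTabId(tabs) {
    if (!tabs || tabs.length === 0) { return null; }
    var id = tabs[0].id;
    return typeof id === 'number' ? id : null;
}

// Sends `message` to the active tab of the current window, if there is one.
// chromeApi is injected (an assumption of this sketch); in kbTx.js it would
// simply be the global `chrome` object.
function notifyActiveTab(chromeApi, message) {
    chromeApi.tabs.query({ active: true, currentWindow: true }, function (tabs) {
        var tabId = pickTabId(tabs);
        if (tabId !== null) {
            chromeApi.tabs.sendMessage(tabId, message);
        }
    });
}
```

Inside the `onCommand` listener the call would be `notifyActiveTab(chrome, { type: 'listen' })`. A listener that ignores the content of the request, as the one in tabRx.js does, would not need to change.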

I have put in a content script to listen for this message (tabRx.js):


    window.SpeechRecognition = window.webkitSpeechRecognition ||
                                window.SpeechRecognition;

    chrome.runtime.onMessage.addListener(
        function(request, sender, sendResponse) {
            var recognition = new SpeechRecognition();
            // configure before starting, so the settings apply to this session
            recognition.continuous = false;
            recognition.onresult = function(event) {
                if (event.results[0].isFinal) {
                    // interp() is assumed to come from interp.js (see manifest)
                    window.speechSynthesis.speak(
                        new SpeechSynthesisUtterance(
                            interp( event.results[0][0].transcript )
                        )
                    );
                }
            };
            recognition.start();
        }
    );

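The message listener essentially contains the code from the allInOne.html example above. There may be other ways of doing this, but this one works and seems reasonably lightweight. Hope this helps.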
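
Because the listener body is the same recognise-then-speak step as in the HTML page, it can be pulled into a small helper that is easy to try without a microphone. This is only a sketch under assumptions: `finalTranscript` and `listenOnce` are hypothetical names, and the recognition constructor is passed in rather than read from `window`.

```javascript
// Returns the transcript of the first result once it is final, else null.
function finalTranscript(event) {
    var result = event.results[0];
    return result && result.isFinal ? result[0].transcript : null;
}

// Runs one recognition session and hands the final transcript to onText.
// RecognitionCtor is injected (an assumption of this sketch); in tabRx.js
// it would be window.SpeechRecognition.
function listenOnce(RecognitionCtor, onText) {
    var recognition = new RecognitionCtor();
    recognition.continuous = false;
    recognition.onresult = function (event) {
        var text = finalTranscript(event);
        if (text !== null) { onText(text); }
    };
    recognition.start(); // started last, once everything is configured
    return recognition;
}
```

In the content script, the message listener could then call `listenOnce(window.SpeechRecognition, ...)` with a callback that passes the text through `interp` and on to `window.speechSynthesis.speak`.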

If you think I can improve my code, please feel free to add a comment!