Question

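I'm trying to build an audio-based text-editing interface, and I'm currently using PocketSphinx for v1 of the project. (I realize there are other solutions for this task, and I may try them in v2.) I'm building on the demo app that ships with the PocketSphinx materials: https://github.com/cmusphinx/pocketsphinx-android-demo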

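I'm able to use keyword search mode and recognize my command, "create file", which creates a blank document. Once the new file is open, I'd like the recognizer to switch to grammar search mode and, using a small word list that I supply, listen for 5 seconds and update the title, then listen for 10 seconds and update the body of the document.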

So far I've been trying to modify the "PocketSphinxActivity" provided in the demo app, with limited success. Here is my current recognizer setup:

private void setupRecognizer(java.io.File assetsDir) throws IOException {
        // The recognizer can be configured to perform multiple searches
        // of different kind and switch between them

        recognizer = SpeechRecognizerSetup.defaultSetup()
                .setAcousticModel(new java.io.File(assetsDir, "en-us-ptm"))
                .setDictionary(new java.io.File(assetsDir, "cmudict-en-us.dict"))

                //.setRawLogDir(assetsDir) // To disable logging of raw audio comment out this call (takes a lot of space on the device)

                .getRecognizer();
        recognizer.addListener(this);

        /* In your application you might not need to add all those searches.
          They are added here for demonstration. You can leave just one.
         */

        //Create keyword-activation search.
        recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);

        //Create grammar-based search for selection between demos
        java.io.File menuGrammar = new java.io.File(assetsDir, "drivemenu.gram");
        recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);

        java.io.File menuCreate = new java.io.File(assetsDir, "presentation_commands.gram");
        recognizer.addGrammarSearch(CREATE_DOC, menuCreate);
    }

where presentation_commands.gram is:

#JSGF V1.0;

grammar presentation_commands;

<presentation_command> = welcome |
                          to |
                          my |
                          presentation |
                          everybody |
                          november |
                          demonstration |
                          file;

public <presentation_commands> = <presentation_command>+;

(I'll just use a few test words to populate my simple demo file.)
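To make explicit what I expect this grammar to accept, here is a minimal plain-Java sketch. It does not call PocketSphinx; GrammarSketch and accepts() are hypothetical names used only for illustration, and the word list is copied from the rule above. It models the public rule as "one or more words, each taken from the list".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class GrammarSketch {
    // Words copied from the <presentation_command> rule in the grammar.
    static final Set<String> WORDS = new HashSet<>(Arrays.asList(
            "welcome", "to", "my", "presentation", "everybody",
            "november", "demonstration", "file"));

    // Models <presentation_command>+ : one or more words, each from WORDS.
    static boolean accepts(String phrase) {
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return false; // the "+" operator requires at least one word
        }
        for (String word : trimmed.split("\\s+")) {
            if (!WORDS.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accepts("welcome to my presentation")); // true
        System.out.println(accepts("welcome to my house"));        // false
    }
}
```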

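I can say "create file" and my new file is created, but at that point I'd like the following to happen, and I can't get it to work: listen for 3 seconds and use the recognized text as the file name, populating my file Title EditText element; then listen for 10 seconds and use the recognized text as the text of my file Body EditText element.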
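The flow I'm after can be sketched as a small state machine. This is a plain-Java model only, with no Android or PocketSphinx calls; DictationFlowSketch, Stage, and the method names are hypothetical, and it uses a single enum where my activity uses two booleans. The 5000 and 10000 ms windows are the values I pass to startListening in my code; NO_TIMEOUT is just a placeholder in this sketch.

```java
class DictationFlowSketch {
    // Hypothetical stage names; they are not part of PocketSphinx.
    enum Stage { KEYWORD, TITLE, BODY }

    static final int NO_TIMEOUT = -1;        // placeholder: keep listening for the keyphrase
    static final int TITLE_WINDOW_MS = 5000; // listening window for the title
    static final int BODY_WINDOW_MS = 10000; // listening window for the body

    Stage stage = Stage.KEYWORD;
    String title = "";
    String body = "";

    // The "create file" command was recognized: start listening for the title.
    int onCreateFileCommand() {
        stage = Stage.TITLE;
        return TITLE_WINDOW_MS;
    }

    // A complete utterance arrived. Store it for the current stage and
    // return the listening window to use for the next stage.
    int onFinalText(String text) {
        switch (stage) {
            case TITLE:
                title = text;
                stage = Stage.BODY;
                return BODY_WINDOW_MS;
            case BODY:
                body = text;
                stage = Stage.KEYWORD;
                return NO_TIMEOUT;
            default:
                return NO_TIMEOUT;
        }
    }

    public static void main(String[] args) {
        DictationFlowSketch flow = new DictationFlowSketch();
        flow.onCreateFileCommand();
        flow.onFinalText("my presentation");
        flow.onFinalText("welcome everybody to my demonstration");
        System.out.println(flow.title + " | " + flow.body + " | " + flow.stage);
    }
}
```

In the real activity, each returned window would decide which search to start next and with what timeout.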

Here is what I'm currently trying:

@Override
    public void onPartialResult(Hypothesis hypothesis) {
        if (hypothesis == null)
            return;

        String text = hypothesis.getHypstr();
        //Toast.makeText(this, text, Toast.LENGTH_SHORT).show();

        if (text.equals(KEYPHRASE)) {
            //Toast.makeText(this, "Hit Keyphrase in if", Toast.LENGTH_SHORT).show();
            recognizer.stop();
            recognizer.startListening(MENU_SEARCH);
        } else if (text.equals(CREATE_FILE)) {
            recognizer.stop();
            createFile();
            populateFileName();
            Toast.makeText(this, "Created File", Toast.LENGTH_SHORT).show();
        } 
//        } else if (file_name_bool){
//            fillTitle(text);
//        } else if (file_body_bool) {
//            fillBody(text);
//        }
        else {
            ((TextView) findViewById(R.id.update_text)).setText(text);
        }
    }

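I'm able to reliably hit the text.equals(CREATE_FILE) block; my createFile() function only goes through the Google Drive API and doesn't touch the recognizer. My goal is to kick off the chain of listening and populating from the populateFileName() function: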

public void populateFileName() {
        file_name_bool = true;
        recognizer.startListening(CREATE_DOC, 5000);
    }

    public void fillTitle(String text) {
        recognizer.stop();
        file_name_bool = false;
        file_body_bool = true;
        mFileTitleEditText.setText(text);
        recognizer.startListening(CREATE_DOC, 10000);
    }

    public void fillBody(String text) {
        recognizer.stop();
        file_body_bool = false;
        mDocContentEditText.setText(text);
        recognizer.startListening(KWS_SEARCH);
    }

My original thought was that once I switched into CREATE_DOC mode, the app would simply wait until it reached my onResult function:

@Override
    public void onResult(Hypothesis hypothesis) {
        Toast.makeText(this, "in onResult", Toast.LENGTH_SHORT).show();
        ((TextView) findViewById(R.id.update_text)).setText("");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();

            if (file_name_bool)
                fillTitle(text);
            else if (file_body_bool)
                fillBody(text);
            else
                makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
        }
    }

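However, I never seem to reach the onResult() function. Instead, the behavior I get is the else block of onPartialResult(), which just updates the update_text view with my speech. (One piece of good news: the text being shown is drawn from the right vocabulary in my .gram file, so that's a small win.) Unfortunately, I'm never able to actually update the file title or file body, and I'm not sure why.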

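When I uncomment the block in onPartialResult() that checks the booleans, I can update the file name and file body, but because the recognizer stops immediately, only one word is captured. So I'd like to find a way to avoid onPartialResult and use only onResult, so that the complete speech input gets added.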
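To illustrate the difference I'm describing, here is a minimal plain-Java sketch. It is a model only, with no PocketSphinx calls: PartialVsFinalSketch, onPartial, and onFinal are hypothetical stand-ins for the two callbacks, and it assumes partial hypotheses grow word by word, which is consistent with the single-word behavior I see. Partial text is only previewed; the field is filled in only when the final text for the utterance arrives.

```java
import java.util.ArrayList;
import java.util.List;

class PartialVsFinalSketch {
    String preview = "";                               // what the user sees while speaking
    final List<String> committed = new ArrayList<>();  // text actually written to a field

    // Stand-in for a partial-result callback: show the text, do not commit it.
    void onPartial(String hypothesis) {
        preview = hypothesis;
    }

    // Stand-in for a final-result callback: commit the whole utterance once.
    void onFinal(String hypothesis) {
        committed.add(hypothesis);
        preview = "";
    }

    public static void main(String[] args) {
        PartialVsFinalSketch sketch = new PartialVsFinalSketch();
        // Assumed sequence of growing partial hypotheses for one utterance.
        String[] partials = {"welcome", "welcome to", "welcome to my"};
        for (String partial : partials) {
            sketch.onPartial(partial);
        }
        sketch.onFinal("welcome to my presentation");
        System.out.println(sketch.committed);
    }
}
```

Committing on the first partial in this model would store only "welcome", which is the problem I hit when I check the booleans inside onPartialResult().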

Many thanks, and apologies if this is a basic question. I've also been going through this tutorial, but haven't been able to adapt it successfully.

Continuous recognition with PocketSphinx

0 answers: