Question

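I took the Sphinx4 HelloWorld example and made my own grammar file with sentences such as "What is a virus" or "What is application software". The content is simple JSGF, and I tagged each sentence separately: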

public <0> =     What is number twelve | 
                 What is twelve;
public <1> =     What is the title bar;
public <2> =     What is control | 
                 What is the control key;

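There is no n-gram model, because I don't fully understand it and I'm not sure it applies to such a simple example (I think it doesn't). Anyway, the code is just a copy-paste from HelloWorld.java, and recognition works very well; I'd say its accuracy is about 90%.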

Now I took that code, put it in a Runnable, and started it on a new thread, and suddenly recognition is terrible: about 10% (one in ten is correct).

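Obviously I capture sound directly in the application with my microphone (the laptop's built-in microphone). I have seen suggestions that the audio should be resampled to match the dictionary I use (the standard WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz), so my first question is: does the built-in microphone.startRecording() method do this? The reason I ask is that HelloWorld running on the main thread doesn't seem to need any resampling.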
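For context on the sample-rate question, here is a minimal sketch of how one could ask Java Sound directly whether the capture hardware offers a 16 kHz line, independent of Sphinx4. It says nothing about what microphone.startRecording() does internally. The class name MicFormatCheck is hypothetical, and the format values (16 kHz, 16-bit, mono, signed, big-endian) are an assumption based on the "16k" in the model name, not something read from the config file.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper, not part of Sphinx4.
public class MicFormatCheck {

    // Assumed target format: 16 kHz, 16-bit, mono, signed, big-endian PCM.
    static AudioFormat expectedFormat() {
        return new AudioFormat(16000f, 16, 1, true, true);
    }

    // Asks Java Sound whether any installed mixer can capture in this format.
    static boolean isCaptureSupported(AudioFormat format) {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        AudioFormat format = expectedFormat();
        System.out.println("Requested format: " + format);
        System.out.println("Capture supported: " + isCaptureSupported(format));
    }
}
```

If this reports false, the hardware would have to be opened at another rate and the audio converted before recognition; if it reports true, resampling is probably not the explanation for the difference between the two runs.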

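My second question is whether I am right to think that multithreading significantly degrades performance. If so, is there a way around it that doesn't require a massive rework of the code?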

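I'm asking because I am writing a simple Jeopardy-like game in Java, using SWT and Sphinx4 for speech recognition, with the main application running on the main thread and recognition running on another. I currently use the recognition approach from the ZipCity example, with a listener, but it is terrible even when it runs on the main thread, so I'm going to switch to the simpler recognition approach, which is why I did the HelloWorld test.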
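The arrangement described here (UI on the main thread, recognition on a second thread) can be sketched as follows. This is only an illustration of the threading pattern, not a fix for the accuracy problem: the recognizer is replaced by a hypothetical Supplier<String> stand-in for one recognizer.recognize() call, and the class name RecognitionWorker is made up for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical worker: loops over a recognizer stand-in and queues non-empty results.
public class RecognitionWorker implements Runnable {

    private final Supplier<String> recognizeOnce; // stand-in for recognizer.recognize()
    private final BlockingQueue<String> results = new LinkedBlockingQueue<>(16);
    private volatile boolean running = true;

    public RecognitionWorker(Supplier<String> recognizeOnce) {
        this.recognizeOnce = recognizeOnce;
    }

    @Override
    public void run() {
        while (running) {
            String text = recognizeOnce.get();
            if (text == null || text.isEmpty()) {
                continue; // nothing recognized this round
            }
            try {
                // Wait briefly if the consumer is behind, then re-check the running flag.
                results.offer(text, 100, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public void stop() {
        running = false;
    }

    public BlockingQueue<String> results() {
        return results;
    }

    // Runs the worker with a fake recognizer; returns the first result, or null on timeout.
    static String demo() {
        RecognitionWorker worker = new RecognitionWorker(() -> "what is twelve");
        Thread thread = new Thread(worker, "recognition");
        thread.setDaemon(true);
        thread.start();
        try {
            return worker.results().poll(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            worker.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("First result: " + demo());
    }
}
```

In an SWT application the consuming side would typically hand each result to the UI thread with Display.asyncExec rather than polling from it, so the recognition loop never touches widgets directly.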

Edit: I forgot to mention that in the poor-accuracy case I usually get an empty result text.

Here is the code, although it is exactly the same as the example:

Works well:

public class main_class {

public static void main(String[] args) {
    ConfigurationManager cm;

    cm = new ConfigurationManager(main_class.class.getResource("/jsapi_pr/res/sapi.config.xml"));

    Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
    recognizer.allocate();

    // start the microphone, or exit the program if this is not possible
    Microphone microphone = (Microphone) cm.lookup("microphone");
    if (!microphone.startRecording()) {
        System.out.println("Cannot start microphone.");
        recognizer.deallocate();
        System.exit(1);
    }

    // loop the recognition until the program exits.
    while (true) {

        System.out.println(recognizer.getState());

        Result result = recognizer.recognize();

        if (result != null) {
            String resultText = result.getBestFinalResultNoFiller();
            if(!resultText.isEmpty()) {
                System.out.println("You said: " + resultText + '\n');
            }
        } else {
            System.out.println("I can't hear what you said.\n");
        }
    }
}
}

Works badly:

public class main_class {

    public static void main(String[] args) {
        runnable_test test = new runnable_test();
        test.begin();
    }
}

public class runnable_test implements Runnable {

    @Override
    public void run() {

        ConfigurationManager cm;

        cm = new ConfigurationManager(main_class.class.getResource("/jsapi_pr/res/sapi.config.xml"));

        Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
        recognizer.allocate();

        // start the microphone, or exit the program if this is not possible
        Microphone microphone = (Microphone) cm.lookup("microphone");
        if (!microphone.startRecording()) {
            System.out.println("Cannot start microphone.");
            recognizer.deallocate();
            System.exit(1);
        }

        // loop the recognition until the program exits.
        while (true) {

            System.out.println(recognizer.getState());

            Result result = recognizer.recognize();

            if (result != null) {
                String resultText = result.getBestFinalResultNoFiller();
                if(!resultText.isEmpty()) {
                    System.out.println("You said: " + resultText + '\n');
                }
            } else {
                 System.out.println("I can't hear what you said.\n");
            }
        }
    }

    public void begin() {
        Thread thread = new Thread(this);
        thread.start();
    }
}

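I will post some results as soon as I can, but as I said, the first version works fine, while the second usually triggers resultText.isEmpty(), and even when it does "recognize" something, the result is usually wrong.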

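EDIT2: I turned up my microphone boost and volume and it works better, but I still find it hard to understand why this happens, because, as I said, the results without the microphone boost were still very good when running on the main thread.

The main application's accuracy is also better: from 2 out of 12 to 6 out of 12.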
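Since the boost and volume change made a difference, measuring the input level of the captured audio in both setups might help narrow things down. The following is a minimal sketch, assuming 16-bit signed PCM in a byte buffer; the class InputLevel and the method rmsDbfs are hypothetical names and are not part of Sphinx4.

```java
// Hypothetical helper: RMS level of 16-bit signed PCM, in dB relative to full scale.
public class InputLevel {

    // pcm: raw bytes; length: number of valid bytes; bigEndian: byte order of each sample.
    static double rmsDbfs(byte[] pcm, int length, boolean bigEndian) {
        long sumSquares = 0;
        int samples = 0;
        for (int i = 0; i + 1 < length; i += 2) {
            int hi = bigEndian ? pcm[i] : pcm[i + 1];
            int lo = bigEndian ? pcm[i + 1] : pcm[i];
            int sample = (short) ((hi << 8) | (lo & 0xFF)); // sign-extended 16-bit value
            sumSquares += (long) sample * sample;
            samples++;
        }
        if (samples == 0 || sumSquares == 0) {
            return Double.NEGATIVE_INFINITY; // empty buffer or digital silence
        }
        double rms = Math.sqrt((double) sumSquares / samples);
        return 20.0 * Math.log10(rms / 32768.0);
    }

    public static void main(String[] args) {
        // Two little-endian samples of value 16384 (half of full scale).
        byte[] halfScale = {0x00, 0x40, 0x00, 0x40};
        System.out.printf("Level: %.2f dBFS%n", rmsDbfs(halfScale, halfScale.length, false));
    }
}
```

Comparing this figure for audio captured in the main-thread version and in the Runnable version would show whether the two runs actually receive input at different levels.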

Sphinx4 performance problems when multithreading

0 answers: