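Pocketsphinx in Python: getting keyword search to return the keyword

Date: 2017-11-18 08:29:54

Tags: python pocketsphinx

I copied some code from a website that uses pocketsphinx to listen for a specific word in Python. It runs, but it never outputs the keyword as expected. Here is my code: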

import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *
import pyaudio

# modeldir = "../../../model"
# datadir = "../../../test/data"

modeldir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us"
dictdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//cmudict-en-us.dict"
lmdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us.lm.bin"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', modeldir)
config.set_string('-lm', lmdir )
config.set_string('-dict', dictdir)
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
         decoder.process_raw(buf, False, False)
    else:
         break
    if decoder.hyp() != None:
      #print(decoder.hyp().hypstr)
      if decoder.hyp().hypstr == 'forward':
        print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print ("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt()

When I use print(decoder.hyp().hypstr), it only outputs random words, whatever I say. If I say a single word or a sentence, it outputs:

the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the da
the head
the bed
the bedding
the heading of
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and well
the bedding and well
the bedding and well
the bedding and butler
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus ha
the bedding and blessed are
the bedding and blessed are
the bedding and what lingus on
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want or
the bedding and what lingus want to talk
the bedding and what lingus current top
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her

Please help me get this working. I am new to Python.

2 answers:

Answer 0 (score: 1)

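First, I just want to clarify: your Pocketsphinx is working.

From my experience with pocketsphinx, it is hardly the most accurate speech recognition tool you could use, but it may be your best bet for an offline solution. Pocketsphinx can only translate your words (audio) as well as the model dictates. These models still appear to be a work in progress, and many of them need improvement. There are a few things you can do to improve recognition accuracy, such as reducing noise and tuning the recognition, but that is outside the direct scope of this question.

From what I understand of your code, you are looking for a specific keyword (spoken aloud by the user) and recognizing it with pocketsphinx's backend. That keyword appears to be "forward". You can read further on how to do "hot word listening" properly.

You have the right idea, but the approach can be improved. Here is my "quick fix" version of your code: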

import os
import pyaudio
import pocketsphinx as ps

modeldir = "C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//"

# Create a decoder with certain model
config = ps.Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us'))
config.set_string('-lm', os.path.join(modeldir, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(modeldir, 'cmudict-en-us.dict'))
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = ps.Decoder(config)
decoder.start_utt()

while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() is not None:
        print(decoder.hyp().hypstr)
        if 'forward' in decoder.hyp().hypstr:
            print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print("Detected keyword, restarting search")
            decoder.end_utt()
            decoder.start_utt()

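Within any one pocketsphinx.Decoder() "session" (i.e. after calling .start_utt() without a subsequent call to .end_utt()), decoder.hyp().hypstr effectively keeps appending words to itself each time pocketsphinx decodes a "valid" recognition from the input audio stream.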
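The session behaviour described here can be sketched as a small helper that does not depend on the microphone. This is only an illustration, not the code from this answer: `watch_for_keyword` and its parameters are hypothetical names, and `decoder` is assumed to be any object exposing the `start_utt()`, `process_raw()`, `hyp()` and `end_utt()` methods used in the code above.

```python
def watch_for_keyword(decoder, chunks, keyword):
    """Yield the hypothesis string each time `keyword` shows up in it.

    `decoder` is assumed to behave like the pocketsphinx Decoder used above;
    `chunks` is any iterable of raw audio buffers (hypothetical parameter).
    """
    decoder.start_utt()
    for buf in chunks:
        if not buf:
            break
        decoder.process_raw(buf, False, False)
        hyp = decoder.hyp()
        if hyp is not None and keyword in hyp.hypstr:
            yield hyp.hypstr
            # Ending the utterance and starting a new one clears the
            # accumulated hypothesis, so the next match starts from scratch.
            decoder.end_utt()
            decoder.start_utt()
    decoder.end_utt()
```

With live audio, `chunks` could be something like `iter(lambda: stream.read(1024), b"")`, reusing the `stream` object opened above.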

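You used if decoder.hyp().hypstr == 'forward':. That forces the entire string to be exactly "forward" before the code enters the conditional block (which I assume is what you intended, right?). Because pocketsphinx is not very accurate by default, it usually takes several attempts at most words before it registers the correct one. For that reason, and because decoder.hyp().hypstr keeps appending to itself (as described above), I used the line if 'forward' in decoder.hyp().hypstr: instead. This looks for the keyword "forward" anywhere in the whole string, so misrecognized words before the keyword do not prevent it from being found.

I hope this helps!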
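To make the difference between the two checks concrete, here is a minimal sketch on plain strings, with no decoder involved. The sample hypothesis is made up, and `has_word` is a hypothetical helper that is not part of pocketsphinx. It also shows one caveat: `in` on a string is a substring test, so a word-level check may be preferable if longer words containing the keyword should be ignored.

```python
def has_word(hypstr, keyword):
    """Return True if `keyword` is one of the space-separated words in `hypstr`."""
    return keyword in hypstr.split()


hypstr = "the bedding and forward"   # made-up accumulated hypothesis

print(hypstr == "forward")           # False: the whole string is not exactly the keyword
print("forward" in hypstr)           # True: substring match anywhere in the string
print("forward" in "straightforward")           # True: substring match also fires here
print(has_word("straightforward", "forward"))   # False: word-level match does not
print(has_word(hypstr, "forward"))              # True
```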

Answer 1 (score: 0)

You need to remove this line:

  config.set_string('-lm', lmdir )

Keyword search and lm search are mutually exclusive.
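A keyword-only configuration along these lines could look like the sketch below. It assumes the same pocketsphinx Python API and model file names as the code in the question; `keyword_search_options` and `make_decoder` are hypothetical helper names, and the threshold value is left to the caller because it usually needs tuning.

```python
import os


def keyword_search_options(model_dir, keyphrase, threshold):
    """Build decoder options for keyword spotting only (no '-lm' entry).

    File names under `model_dir` are assumed to match the question's setup.
    """
    strings = {
        '-hmm': os.path.join(model_dir, 'en-us'),
        '-dict': os.path.join(model_dir, 'cmudict-en-us.dict'),
        '-keyphrase': keyphrase,
    }
    floats = {'-kws_threshold': threshold}
    return strings, floats


def make_decoder(strings, floats):
    """Apply the options to a pocketsphinx config and return a Decoder."""
    # Imported lazily so the option builder can be used without pocketsphinx.
    import pocketsphinx as ps  # assumes the same version as in the question
    config = ps.Decoder.default_config()
    for key, value in strings.items():
        config.set_string(key, value)
    for key, value in floats.items():
        config.set_float(key, value)
    return ps.Decoder(config)
```

Calling `make_decoder(*keyword_search_options(modeldir, 'forward', 1e+20))` would then replace the config lines in the question, with the rest of the loop left unchanged.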