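I downloaded CMU SphinxBase (sphinxbase-5prealpha.tar.gz) and PocketSphinx (pocketsphinx-5prealpha.tar.gz), installed all the required packages (sudo apt-get libtool bison python-dev autotools swig), and went through all the steps in the tutorial (http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx).
On my RPi I ran > pocketsphinx_continuous -inmic yes. I have a USB Logitech webcam, and its microphone works well with the Google API V2.
I spoke every English word I know to pocketsphinx_continuous, and it gave me messages like the ones below. I was hoping it would recognize at least something so that I could start improving it, but with zero recognition I am not sure how to proceed.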
READY....
Listening...
INFO: cmn_prior.c(131): cmn_prior_update: from < 40.00 3.00 -1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to < 34.68 -4.34 8.66 -9.45 -0.21 -2.80 2.86 1.73 6.98 5.36 4.14 0.69 1.67 >
INFO: ngram_search_fwdtree.c(1553): 961 words recognized (7/fr)
INFO: ngram_search_fwdtree.c(1555): 497161 senones evaluated (3551/fr)
INFO: ngram_search_fwdtree.c(1559): 1453632 channels searched (10383/fr), 98192 1st, 13846 last
INFO: ngram_search_fwdtree.c(1562): 2097 words for which last channels evaluated (14/fr)
INFO: ngram_search_fwdtree.c(1564): 40961 candidate words for entering last phone (292/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 11.18 CPU 7.986 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 24.17 wall 17.265 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 6 words
INFO: ngram_search_fwdflat.c(948): 696 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950): 8170 senones evaluated (58/fr)
INFO: ngram_search_fwdflat.c(952): 4239 channels searched (30/fr)
INFO: ngram_search_fwdflat.c(954): 940 words searched (6/fr)
INFO: ngram_search_fwdflat.c(957): 276 word transitions (1/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.86 CPU 0.614 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 1.77 wall 1.265 xRT
INFO: ngram_search.c(1253): lattice start node <s>.0 end node </s>.47
INFO: ngram_search.c(1279): Eliminated 2 nodes before end node
INFO: ngram_search.c(1384): Lattice has 243 nodes, 194 links
INFO: ps_lattice.c(1380): Bestpath score: -1185
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:47:138) = -75028
INFO: ps_lattice.c(1441): Joint P(O,S) = -97858 P(S|O) = -22830
INFO: ngram_search.c(875): bestpath 0.01 CPU 0.007 xRT
INFO: ngram_search.c(878): bestpath 0.02 wall 0.015 xRT
READY....
Listening...
Input overrun, read calls are too rare (non-fatal)
INFO: ngram_search.c(467): Resized score stack to 200000 entries
INFO: ngram_search_fwdtree.c(952): cand_sf[] increased to 64 entries
INFO: ngram_search.c(459): Resized backpointer table to 10000 entries
INFO: ngram_search.c(467): Resized score stack to 400000 entries
Input overrun, read calls are too rare (non-fatal)
INFO: ngram_search.c(459): Resized backpointer table to 20000 entries
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Answer 0 (score: 1)
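Large-vocabulary speech recognition is not feasible on a Raspberry Pi; it is too slow. You can see this in your log: the line "fwdtree 24.17 wall 17.265 xRT" shows that the first search pass alone runs about 17 times slower than real time.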
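To make the slowdown easier to read off, here is a minimal Python sketch that pulls the timing figures out of the log lines quoted in the question. It assumes that xRT means processing time divided by audio duration, which is my understanding of the decoder's output; the function names are made up for illustration and are not part of PocketSphinx.

```python
import re

# Timing lines copied verbatim from the log in the question.
LOG = """\
INFO: ngram_search_fwdtree.c(1567): fwdtree 11.18 CPU 7.986 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 24.17 wall 17.265 xRT
INFO: ngram_search_fwdflat.c(960): fwdflat 0.86 CPU 0.614 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 1.77 wall 1.265 xRT
"""

# Matches e.g. "fwdtree 24.17 wall 17.265 xRT".
TIMING = re.compile(r"(fwdtree|fwdflat|bestpath) ([\d.]+) (CPU|wall) ([\d.]+) xRT")


def parse_timings(log_text):
    """Return (pass_name, clock, seconds, xrt) tuples found in the log text."""
    rows = []
    for match in TIMING.finditer(log_text):
        pass_name, seconds, clock, xrt = match.groups()
        rows.append((pass_name, clock, float(seconds), float(xrt)))
    return rows


def audio_seconds(seconds, xrt):
    """Estimate the utterance length, assuming xRT = processing time / audio time."""
    return seconds / xrt


if __name__ == "__main__":
    for pass_name, clock, seconds, xrt in parse_timings(LOG):
        length = audio_seconds(seconds, xrt)
        print(f"{pass_name:8s} {clock:4s} {seconds:6.2f}s for ~{length:.2f}s of audio ({xrt:.1f}x real time)")
```

Under that assumption, the fwdtree wall-clock line corresponds to roughly 1.4 seconds of audio taking about 24 seconds to decode. That would also explain the "Input overrun" warnings: the decoder cannot read from the microphone fast enough.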
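If you still want to use recognition with the device, you can either stream the audio to a server and decode it there, or configure a small grammar so that the decoder only has to search a handful of phrases.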
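For the small-grammar route, something along these lines might serve as a starting point. It is only a sketch: the phrases, the grammar name, and the file name commands.gram are hypothetical, and every word used must exist in the pronunciation dictionary you run with. The script writes a JSGF grammar and prints a pocketsphinx_continuous command that loads it with the -jsgf option next to the -inmic yes option from the question.

```python
import shlex


def build_jsgf(name, phrases):
    """Return a JSGF grammar whose single public rule accepts any one of the phrases."""
    alternatives = " | ".join(phrases)
    return "#JSGF V1.0;\ngrammar {0};\npublic <command> = {1};\n".format(name, alternatives)


def build_command(grammar_path):
    """Build the pocketsphinx_continuous argument list for a grammar-based run."""
    return ["pocketsphinx_continuous", "-inmic", "yes", "-jsgf", grammar_path]


if __name__ == "__main__":
    # Hypothetical command phrases; replace them with the ones you need.
    phrases = ["turn on the light", "turn off the light", "stop"]
    grammar_path = "commands.gram"  # hypothetical file name

    with open(grammar_path, "w") as handle:
        handle.write(build_jsgf("commands", phrases))

    # Print the command instead of running it, so it can be checked first.
    print(" ".join(shlex.quote(arg) for arg in build_command(grammar_path)))
```

With a grammar this small the search space shrinks from the full language model to a few phrases, which is what makes decoding on the Pi plausible. The trade-off is that anything outside the grammar will not be recognized.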