
Asked: 2015-02-22 19:18:14

Tags: python speech-recognition speech-to-text cmusphinx pocketsphinx

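I'm trying to write a simple speech recognizer using CMU PocketSphinx, but it always crashes when it reaches the decode_raw() function. I'm using 32-bit Python 2.7 with PyPocketSphinx (installed with pip) on 64-bit Windows 7.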


from pocketsphinx import Decoder
import sphinxbase




decoder.decode_raw(fh)  # crashes here
print decoder.get_hyp()
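
One thing that may be worth trying is to read the audio in Python and hand byte buffers to the decoder, rather than passing the file object to decode_raw(). Passing a file handle across the Python/C boundary can be fragile on Windows when the extension and the interpreter link different C runtimes; whether that is the cause of this crash is not confirmed. The sketch below assumes the installed bindings expose start_utt(), process_raw(data, no_search, full_utt), end_utt() and hyp(), as the pocketsphinx 5prealpha SWIG bindings do; older bindings may use different names or signatures. The names decode_in_chunks and chunk_size are illustrative and not part of the library.

```python
def decode_in_chunks(decoder, stream, chunk_size=1024):
    """Feed raw audio to `decoder` in chunks and return its hypothesis.

    Assumption: `decoder` provides start_utt(), process_raw(), end_utt()
    and hyp(). Adjust the calls if the installed bindings differ.
    """
    decoder.start_utt()
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            # End of stream: nothing more to feed.
            break
        # no_search=False, full_utt=False: search while data arrives.
        decoder.process_raw(buf, False, False)
    decoder.end_utt()
    return decoder.hyp()
```

Usage would be along the lines of decode_in_chunks(decoder, fh), with fh opened in binary mode ('rb') and positioned at the start of the raw samples.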


Edit: Log generated before the crash:

INFO: cmd_ln.c(696): Parsing command line:

Current configuration:
[NAME]          [DEFLT]         [VALUE]
-agc            none            none
-agcthresh      2.0             2.000000e+000
-allphone_ci    no              no
-alpha          0.97            9.700000e-001
-ascale         20.0            2.000000e+001
-aw             1               1
-backtrace      no              no
-beam           1e-48           1.000000e-048
-bestpath       yes             yes
-bestpathlw     9.5             9.500000e+000
-bghist         no              no
-ceplen         13              13
-cmn            current         current
-cmninit        8.0             8.0
-compallsen     no              no
-debug                          0
-dictcase       no              no
-dither         no              no
-doublebw       no              no
-ds             1               1
-feat           1s_c_d_dd       1s_c_d_dd
-fillprob       1e-8            1.000000e-008
-frate          100             100
-fsgusealtpron  yes             yes
-fsgusefiller   yes             yes
-fwdflat        yes             yes
-fwdflatbeam    1e-64           1.000000e-064
-fwdflatefwid   4               4
-fwdflatlw      8.5             8.500000e+000
-fwdflatsfwin   25              25
-fwdflatwbeam   7e-29           7.000000e-029
-fwdtree        yes             yes
-input_endian   little          little
-kdmaxbbi       -1              -1
-kdmaxdepth     0               0
-kws_plp        1e-1            1.000000e-001
-kws_threshold  1               1.000000e+000
-latsize        5000            5000
-ldadim         0               0
-lextreedump    0               0
-lifter         0               0
-lmname         default         default
-logbase        1.0001          1.000100e+000
-logspec        no              no
-lowerf         133.33334       1.333333e+002
-lpbeam         1e-40           1.000000e-040
-lponlybeam     7e-29           7.000000e-029
-lw             6.5             6.500000e+000
-maxhmmpf       10000           10000
-maxnewoov      20              20
-maxwpf         -1              -1
-min_endfr      0               0
-mixwfloor      0.0000001       1.000000e-007
-mmap           yes             yes
-ncep           13              13
-nfft           512             512
-nfilt          40              40
-nwpen          1.0             1.000000e+000
-pbeam          1e-48           1.000000e-048
-pip            1.0             1.000000e+000
-pl_beam        1e-10           1.000000e-010
-pl_pbeam       1e-5            1.000000e-005
-pl_window      0               0
-remove_dc      no              no
-remove_noise   yes             yes
-remove_silence yes             yes
-round_filters  yes             yes
-samprate       16000           1.600000e+004
-seed           -1              -1
-silprob        0.005           5.000000e-003
-smoothspec     no              no
-tmatfloor      0.0001          1.000000e-004
-topn           4               4
-topn_beam      0               0
-transform      legacy          legacy
-unit_area      yes             yes
-upperf         6855.4976       6.855498e+003
-usewdphones    no              no
-uw             1.0             1.000000e+000
-vad_postspeech 50              50
-vad_prespeech  10              10
-vad_threshold  2.0             2.000000e+000
-varfloor       0.0001          1.000000e-004
-varnorm        no              no
-verbose        no              no
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29           7.000000e-029
-wip            0.65            6.500000e-001
-wlen           0.025625        2.562500e-002

INFO: cmd_ln.c(696): Parsing command line:
        -lowerf 130 \
        -upperf 6800 \
        -nfilt 25 \
        -transform dct \
        -lifter 22 \
        -feat 1s_c_d_dd \
        -svspec 0-12/13-25/26-38 \
        -agc none \
        -cmn current \
        -varnorm no \
        -model ptm \
        -cmninit 40,3,-1

Current configuration:
[NAME]          [DEFLT]         [VALUE]
-agc            none            none
-agcthresh      2.0             2.000000e+000
-alpha          0.97            9.700000e-001
-ceplen         13              13
-cmn            current         current
-cmninit        8.0             40,3,-1
-dither         no              no
-doublebw       no              no
-feat           1s_c_d_dd       1s_c_d_dd
-frate          100             100
-input_endian   little          little
-ldadim         0               0
-lifter         0               22
-logspec        no              no
-lowerf         133.33334       1.300000e+002
-ncep           13              13
-nfft           512             512
-nfilt          40              25
-remove_dc      no              no
-remove_noise   yes             yes
-remove_silence yes             yes
-round_filters  yes             yes
-samprate       16000           1.600000e+004
-seed           -1              -1
-smoothspec     no              no
-svspec                         0-12/13-25/26-38
-transform      legacy          dct
-unit_area      yes             yes
-upperf         6855.4976       6.800000e+003
-vad_postspeech 50              50
-vad_prespeech  10              10
-vad_threshold  2.0             2.000000e+000
-varnorm        no              no
-verbose        no              no
-warp_type      inverse_linear  inverse_linear
-wlen           0.025625        2.562500e-002

INFO: acmod.c(252): Parsed model-specific feature parameters from pocketsphinx-5
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: pocketsphinx-5prealpha-win32/model/
INFO: mdef.c(530): Found byte-order mark BMDF, assuming this is a binary mdef fi
INFO: bin_mdef.c(336): Reading binary model definition: pocketsphinx-5prealpha-w
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: pocketsphinx-5pr
INFO: acmod.c(124): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: pocketsphinx-5prealp
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: pocketsphinx-5prealp
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: acmod.c(126): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: pocketsphinx-5prealp
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: pocketsphinx-5prealp
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(467): Loading senones from dump file pocketsphinx-5prealpha-win
INFO: ptm_mgau.c(554): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(586): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(826): Maximum top-N: 4
INFO: dict.c(320): Allocating 137526 * 20 bytes (2686 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: pocketsphinx-5prealpha-win32/model/e
INFO: dict.c(213): Allocated 1007 KiB for strings, 1662 KiB for phones
INFO: dict.c(336): 133425 words read
INFO: dict.c(342): Reading filler dictionary: pocketsphinx-5prealpha-win32/model
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial trip
INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word trip
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=19794, 2=1377200, 3=3178194
INFO: ngram_model_dmp.c(242):    19794 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288):  1377200 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314):  3178194 = LM.trigrams read
INFO: ngram_model_dmp.c(339):    57155 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359):    10935 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379):    34843 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407):     2690 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463):    19794 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 56 single-phone
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 56 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 44782
INFO: ngram_search_fwdtree.c(339): after: 573 root, 44654 non-root channels, 47 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25

Edit 2: When I try to compile PyPocketSphinx, this is what gets written to the log file:

setup.py build:

running build
running build_ext
running build_py
copying sphinxbase\swig\python\sphinxbase.py -> build\lib.win32-2.7\sphinxbase
copying pocketsphinx\swig\python\pocketsphinx.py -> build\lib.win32-2.7\pocketsp


running install
running build_ext
running build
running build_py
copying sphinxbase\swig\python\sphinxbase.py -> build\lib.win32-2.7\sphinxbase
copying pocketsphinx\swig\python\pocketsphinx.py -> build\lib.win32-2.7\pocketsp
running install_lib
copying build\lib.win32-2.7\pocketsphinx\_pocketsphinx.pyd -> C:\Python27\Lib\si
copying build\lib.win32-2.7\sphinxbase\_sphinxbase.pyd -> C:\Python27\Lib\site-p
running install_egg_info
running egg_info
writing PyPocketSphinx.egg-info\PKG-INFO
writing top-level names to PyPocketSphinx.egg-info\top_level.txt
writing dependency_links to PyPocketSphinx.egg-info\dependency_links.txt
reading manifest file 'PyPocketSphinx.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'build'
no previously-included directories found matching 'dist'
warning: no previously-included files found matching 'sphinxbase\swig\sphinxbase
warning: no previously-included files found matching 'pocketsphinx\swig\pocketsp
writing manifest file 'PyPocketSphinx.egg-info\SOURCES.txt'
removing 'C:\Python27\Lib\site-packages\PyPocketSphinx-12608.5-py2.7.egg-info' (and everything under it)
Copying PyPocketSphinx.egg-info to C:\Python27\Lib\site-packages\PyPocketSphinx-
running install_scripts


warning: no previously-included files found matching 'sphinxbase\swig\sphinxbase_wrap.c'
warning: no previously-included files found matching 'pocketsphinx\swig\pocketsphinx_wrap.c'

0 answers:
