Question

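I have a .wav sound file and I want to convert its contents into a text file. While searching I found this link

which uses this command line for the conversion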

pocketsphinx_continuous -infile file.wav 2> pocketsphinx.log > file.txt

After running it on my .wav file, it generates 2 files. One is pocketsphinx.log, which contains

    INFO: cmd_ln.c(691): Parsing command line:
pocketsphinx_continuous \
    -infile file.wav 

Current configuration:
[NAME]      [DEFLT]     [VALUE]
-adcdev             
-agc        none        none
-agcthresh  2.0     2.000000e+000
-alpha      0.97        9.700000e-001
-argfile            
-ascale     20.0        2.000000e+001
-aw     1       1
-backtrace  no      no
-beam       1e-48       1.000000e-048
-bestpath   yes     yes
-bestpathlw 9.5     9.500000e+000
-bghist     no      no
-ceplen     13      13
-cmn        current     current
-cmninit    8.0     8.0
-compallsen no      no
-debug              0
-dict               
-dictcase   no      no
-dither     no      no
-doublebw   no      no
-ds     1       1
-fdict              
-feat       1s_c_d_dd   1s_c_d_dd
-featparams         
-fillprob   1e-8        1.000000e-008
-frate      100     100
-fsg                
-fsgusealtpron  yes     yes
-fsgusefiller   yes     yes
-fwdflat    yes     yes
-fwdflatbeam    1e-64       1.000000e-064
-fwdflatefwid   4       4
-fwdflatlw  8.5     8.500000e+000
-fwdflatsfwin   25      25
-fwdflatwbeam   7e-29       7.000000e-029
-fwdtree    yes     yes
-hmm                
-infile             file.wav
-input_endian   little      little
-jsgf               
-kdmaxbbi   -1      -1
-kdmaxdepth 0       0
-kdtree             
-latsize    5000        5000
-lda                
-ldadim     0       0
-lextreedump    0       0
-lifter     0       0
-lm             
-lmctl              
-lmname     default     default
-logbase    1.0001      1.000100e+000
-logfn              
-logspec    no      no
-lowerf     133.33334   1.333333e+002
-lpbeam     1e-40       1.000000e-040
-lponlybeam 7e-29       7.000000e-029
-lw     6.5     6.500000e+000
-maxhmmpf   -1      -1
-maxnewoov  20      20
-maxwpf     -1      -1
-mdef               
-mean               
-mfclogdir          
-min_endfr  0       0
-mixw               
-mixwfloor  0.0000001   1.000000e-007
-mllr               
-mmap       yes     yes
-ncep       13      13
-nfft       512     512
-nfilt      40      40
-nwpen      1.0     1.000000e+000
-pbeam      1e-48       1.000000e-048
-pip        1.0     1.000000e+000
-pl_beam    1e-10       1.000000e-010
-pl_pbeam   1e-5        1.000000e-005
-pl_window  0       0
-rawlogdir          
-remove_dc  no      no
-round_filters  yes     yes
-samprate   16000       1.600000e+004
-seed       -1      -1
-sendump            
-senlogdir          
-senmgau            
-silprob    0.005       5.000000e-003
-smoothspec no      no
-svspec             
-time       no      no
-tmat               
-tmatfloor  0.0001      1.000000e-004
-topn       4       4
-topn_beam  0       0
-toprule            
-transform  legacy      legacy
-unit_area  yes     yes
-upperf     6855.4976   6.855498e+003
-usewdphones    no      no
-uw     1.0     1.000000e+000
-var                
-varfloor   0.0001      1.000000e-004
-varnorm    no      no
-verbose    no      no
-warp_params            
-warp_type  inverse_linear  inverse_linear
-wbeam      7e-29       7.000000e-029
-wip        0.65        6.500000e-001
-wlen       0.025625    2.562500e-002

INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
ERROR: "acmod.c", line 85: Acoustic model definition is not specified neither with -mdef option nor with -hmm

When I use this command

pocketsphinx_continuous -infile myfile.wav

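as mentioned in the same question here, it prints the same output to the console, but I can't find any file containing the recognized words!
The other file, file.txt, contains nothing.
So what am I doing wrong that keeps the file from being converted to text, or what am I missing?
Thanks in advance.
Update
I used this command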

pocketsphinx_continuous.exe -infile a.wav -hmm "modeldir\model\en-us\en-us" -lm "modedir\model\en-us\en-us.lm.bin" -dict "modeldir\model\en-us\cmudict-en-us.dict"

but I got this error

FATAL: "continuous.c", line 158: Failed to open file 'a.wav' for reading: No such file or directory

My OS is Windows 10, and I am using the latest version of pocketsphinx.

Answer 1

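You need to install the pocketsphinx models.

If you compile from source on Linux, make sure you run make install.

If you install from packages on Ubuntu, make sure the model package is installed, for example pocketsphinx-en-us.

If you are on Windows, you must specify the actual paths to the model on the command line, as described in the tutorial on pocketsphinx.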
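To make the "specify the actual paths" step concrete, here is a minimal Python sketch that builds the same command line as in the question's update and checks that every path exists before running it. The function names (build_command, missing_paths, transcribe) are made up for this illustration, the directory layout under the model directory is simply copied from the question's command, and it assumes pocketsphinx_continuous is on your PATH. Treat it as a sketch under those assumptions, not as the official way to invoke the tool.

```python
import os
import subprocess


def build_command(wav_path, model_dir, exe="pocketsphinx_continuous"):
    """Build the argv list for decoding wav_path with the en-us model.

    Assumes the layout used in the question: <model_dir>/model/en-us/...
    """
    en_us = os.path.join(model_dir, "model", "en-us")
    return [
        exe,
        "-infile", wav_path,
        "-hmm", os.path.join(en_us, "en-us"),                # acoustic model directory
        "-lm", os.path.join(en_us, "en-us.lm.bin"),          # language model
        "-dict", os.path.join(en_us, "cmudict-en-us.dict"),  # pronunciation dictionary
    ]


def missing_paths(cmd):
    """Return the path arguments in cmd that do not exist on disk."""
    path_flags = {"-infile", "-hmm", "-lm", "-dict"}
    return [value for flag, value in zip(cmd, cmd[1:])
            if flag in path_flags and not os.path.exists(value)]


def transcribe(wav_path, model_dir, out_path="file.txt", log_path="pocketsphinx.log"):
    """Run the decoder, sending stdout to out_path and stderr to log_path."""
    cmd = build_command(wav_path, model_dir)
    missing = missing_paths(cmd)
    if missing:
        # Fail early with a readable message instead of a decoder error
        raise FileNotFoundError("not found: " + ", ".join(missing))
    # Same redirection as `2> pocketsphinx.log > file.txt` in the question
    with open(out_path, "w") as out, open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=out, stderr=log).returncode
```

Calling transcribe("a.wav", "modeldir") from the directory that contains a.wav would then write the recognized text to file.txt. If the check reports a.wav as missing, the file is not in the current working directory, so pass its full path instead.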

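That is what the error in the log is reporting: "Acoustic model definition is not specified neither with -mdef option nor with -hmm".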
