我在dl4mt(开源神经机器翻译工具)中编写代码 我在theano扫描中遇到了一些奇怪的问题。 以下代码用于根据一些索引从3D张量中提取子张量。
" indicesSub"第1098行的形状(n_sample,window) " SL"在1099年是(n_sample,) " CC _"第1100行的形状(n_timestep,n_sample,dimctx), cc_result的结果应该是形状(nsample,window,dimctx)
所以在第1116行中,cc_已经被调整为nimshuffled to shape(n_sample,n_timestep,dimctx),这三个输入都是序列。
在内循环(_sub_step)中,索引用作序列参数,另外两个是非序列。训练时会出现错误,我最后会提出这个错误。但是,当我使用一些技巧来训练伪模型并使用它进行测试时,错误并没有显示......这真的很奇怪,因为训练和测试共享相同的代码。
错误也很奇怪。它表示输入形状分别为(17,256),(256,),()。但根据我的代码,输入形状应该是(17,256),(21,),(),其中21是窗口大小。
我想知道是否有可能改变扫描输入的东西?也许是无输入?我不确定。如果您有任何想法,请帮助我。谢谢
1098 indicesSub = indices_mask_[src_positions] # n_samples, window
1099 sl = sntlens.reshape([sntlens.shape[0], ]) # (n_sample, )
1100 ccshuffle = cc_.dimshuffle(1, 0, 2) # n_sample, n_timestep, dimctx
1101
1102 def _step_index(indices, cc_sub, sntlen_in):
1103 def _sub_step(indice_step, cc_step, len_step):
1104 # indice_step is a scalar
1105 # cc_step is (ntimestep * dimctx)
1106 # sntlen_in is a scalar
1107 r = tensor.switch(tensor.lt(indice_step, 0), 0, 1)
1108 l = tensor.switch(tensor.ge(indice_step, len_step), 0, 1)
1109 rt = ifelse(tensor.lt(r * l, 1), tensor.zeros([cc_step.shape[1], ]), cc_step[tensor.cast(indice_step, 'int64')])
1110 return rt
1111 ret, updt = theano.scan(_sub_step,
1112 sequences=indices,
1113 non_sequences=[cc_sub, sntlen_in])
1114 return ret
1115
1116 cc_result, upd = theano.scan(_step_index,
1117 sequences=[indicesSub, ccshuffle, sl])
1118
错误信息:
24 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
25 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
26 r = p(n, [x[0] for x in i], o)
27 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
28 self, node)
29 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
30 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
31 reraise(exc_type, exc_value, exc_trace)
32 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
33 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 951, in rval
34 r = p(n, [x[0] for x in i], o)
35 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/scan_module/scan_op.py", line 940, in <lambda>
36 self, node)
37 File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4316)
38 File "/search/speech/chengshanbo/tools/anaconda2/lib/python2.7/site-packages/Theano-0.8.0-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
39 reraise(exc_type, exc_value, exc_trace)
40 File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/search/odin/chengshanbo/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-redhat-6.6-Santiago-x86_64-2.7. 11-64/scan_perform/mod.cpp:4193)
41 IndexError: index out of bounds
42 Apply node that caused the error: GpuIncSubtensor{Inc;int64}(GpuElemwise{add,no_inplace}.0, if{inplace,gpu}.0, ScalarFromTensor.0)
43 Toposort index: 5
44 Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, vector), Scalar(int64)]
45 Inputs shapes: [(17, 256), (256,), ()]
46 Inputs strides: [(256, 1), (1,), ()]
47 Inputs values: ['not shown', 'not shown', 17]
48 Outputs clients: [['output']]
49