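I am using the TIMIT corpus, with MFCC features as the input and phoneme labels (phoneme values converted to integers) as the target output. I am trying to build a sparse tensor from the phoneme labels so that it can be passed to TensorFlow's CTC functions.
The error I get is: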
InvalidArgumentError (see above for traceback): label SparseTensor is not valid: indices[1] = [1,0] is out of bounds: need 0 <= index < [1,9]
[[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](transpose_1, _recv_targets/indices_0, _recv_targets/values_0, _recv_sequence_length_0)]]
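I have not been able to find a solution. Is there a dimension mismatch or some other mistake in my code?
The code for the program is here: https://github.com/shardulparab97/Speech-Recog/blob/master/model2.py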
def sparse_tuple_from(sequences, dtype=np.int32):
    # Create a sparse representation of x.
    # Args:
    #     sequences: a list of lists of type dtype where each element is a sequence
    # Returns:
    #     A tuple with (indices, values, shape)
    #
    indices = []
    values = []
    # print("SEQUENCES IN FUNCTION:", sequences)
    for n, seq in enumerate(sequences):
        indices.extend(zip([n] * len(matrix(seq)), range(len(matrix(seq)))))
        values.extend([seq])
    indices = np.asarray(indices, dtype=np.int32)
    values = np.asarray(values, dtype=dtype)
    shape = np.asarray([len(sequences), np.asarray(indices).max(0)[1] + 1], dtype=np.int32)
    return indices, values, shape
Answer 0: (score: 0)
I ran into the same problem; changing maxtimesteps fixed it. So you should increase the number of time steps to a larger value.
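To narrow down this kind of error before the graph runs, the label tuple can be checked with plain NumPy. The sketch below is only an illustration, not the asker's code: it assumes each element of `sequences` is a flat list of integer phoneme ids, and `check_ctc_labels` and `seq_lengths` are hypothetical names introduced here. It tests three things: that every index lies inside the declared shape (the condition quoted in the error message, where row index 1 is not below a first dimension of 1), that the number of label rows matches the number of input sequences, and that no label sequence is longer than its number of time steps, which is the situation a larger maxtimesteps would relieve.

```python
import numpy as np


def sparse_tuple_from(sequences, dtype=np.int32):
    """Build (indices, values, shape) from a list of flat label sequences."""
    indices = []
    values = []
    for n, seq in enumerate(sequences):
        # One (batch_row, position) pair per label in this sequence.
        indices.extend(zip([n] * len(seq), range(len(seq))))
        values.extend(seq)
    indices = np.asarray(indices, dtype=np.int64).reshape(-1, 2)
    values = np.asarray(values, dtype=dtype)
    max_len = max((len(seq) for seq in sequences), default=0)
    shape = np.asarray([len(sequences), max_len], dtype=np.int64)
    return indices, values, shape


def check_ctc_labels(indices, shape, seq_lengths):
    """Raise ValueError if the labels cannot match the given inputs.

    seq_lengths (hypothetical name) holds the number of time steps of
    each input sequence in the batch.
    """
    indices = np.asarray(indices).reshape(-1, 2)
    shape = np.asarray(shape)
    seq_lengths = np.asarray(seq_lengths)

    # Every index must satisfy 0 <= index < shape in both dimensions.
    out_of_bounds = np.any((indices < 0) | (indices >= shape), axis=1)
    if out_of_bounds.any():
        bad = indices[out_of_bounds][0].tolist()
        raise ValueError(
            "index %s is out of bounds for shape %s" % (bad, shape.tolist())
        )

    # The labels need one row per input sequence.
    if shape[0] != len(seq_lengths):
        raise ValueError(
            "labels have %d rows but there are %d input sequences"
            % (shape[0], len(seq_lengths))
        )

    # A label sequence cannot be longer than its input has time steps.
    label_lengths = np.bincount(indices[:, 0], minlength=int(shape[0]))
    too_long = label_lengths > seq_lengths
    if too_long.any():
        row = int(np.argmax(too_long))
        raise ValueError(
            "row %d has %d labels but only %d time steps"
            % (row, label_lengths[row], seq_lengths[row])
        )
```

Calling `check_ctc_labels(indices, shape, seq_lengths)` just before feeding the batch would show whether the labels and inputs disagree on the batch size or whether a label sequence is too long for its input.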