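I am trying to train this RNN, but I keep hitting a strange error that I cannot decipher.
The input to my network is digitally sampled audio files. Because the audio files can have different lengths, the vectors of sampled audio also have different lengths.
The output, or target, of the network is a 14-dimensional vector containing certain information about the audio file. I already know the targets, having computed them by hand, but I need to get the neural network to reproduce them.
I am currently using TensorFlow as the framework.
My network is set up as follows: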
def last_relevant(output):
    max_length = int(output.get_shape()[1])
    relevant = tf.reduce_sum(tf.mul(output, tf.expand_dims(tf.one_hot(length, max_length), -1)), 1)
    return relevant
def length(sequence):  ## Zero padding to fit the max length... Question whether that is a good idea.
    used = tf.sign(tf.reduce_max(tf.abs(sequence), reduction_indices=2))
    length = tf.reduce_sum(used, reduction_indices=1)
    length = tf.cast(length, tf.int32)
    return length
def cost(output, target):
    # Compute cross entropy for each frame.
    cross_entropy = target * tf.log(output)
    cross_entropy = -tf.reduce_sum(cross_entropy, reduction_indices=2)
    mask = tf.sign(tf.reduce_max(tf.abs(target), reduction_indices=2))
    cross_entropy *= mask
    # Average over actual sequence lengths.
    cross_entropy = tf.reduce_sum(cross_entropy, reduction_indices=1)
    cross_entropy /= tf.reduce_sum(mask, reduction_indices=1)
    return tf.reduce_mean(cross_entropy)
#----------------------------------------------------------------------#
#----------------------------Main--------------------------------------#
### Tensorflow neural network setup
batch_size = None
sequence_length_max = max_length
input_dimension=1
data = tf.placeholder(tf.float32,[batch_size,sequence_length_max,input_dimension])
target = tf.placeholder(tf.float32,[None,14])
num_hidden = 24 ## Hidden layer
cell = tf.nn.rnn_cell.LSTMCell(num_hidden,state_is_tuple=True) ## Long short term memory
output, state = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32,sequence_length = length(data)) ## Creates the Rnn skeleton
last = last_relevant(output)  # tf.gather(val, int(val.get_shape()[0]) - 1) ## Appending as last
weight = tf.Variable(tf.truncated_normal([num_hidden, int(target.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[target.get_shape()[1]]))
prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)
cross_entropy = cost(output,target)# How far am I from correct value?
optimizer = tf.train.AdamOptimizer() ## TensorflowOptimizer
minimize = optimizer.minimize(cross_entropy)
mistakes = tf.not_equal(tf.argmax(target, 1), tf.argmax(prediction, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))
## Training ##
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
batch_size = 1000
no_of_batches = int(len(train_data)/batch_size)
epoch = 5000
for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp, out = train_data[ptr:ptr+batch_size], train_output[ptr:ptr+batch_size]
        ptr+=batch_size
        sess.run(minimize,{data: inp, target: out})
    print "Epoch - ",str(i)
    incorrect = sess.run(error,{data: test_data, target: test_output})
    print('Epoch {:2d} error {:3.1f}%'.format(i + 1, 100 * incorrect))
sess.close()
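The error seems to come from how the function last_relevant is used; it is supposed to take the output and feed it back.
This is the error message:
TypeError: Expected binary or unicode string, got <function length at 0x7f846594dde8>
Is there any way to tell what could be going wrong here?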
Answer 0 (score: 1)
I tried to build your code locally. There is a fundamental mistake in the code: you call tf.one_hot, but what you pass to it is not what it expects:
tf.one_hot(indices, depth, on_value=None, off_value=None, axis=None, dtype=None, name=None)
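However, you are passing a function ("length" is a function in your code; I suggest naming functions in a meaningful way that avoids commonly used names) instead of the indices expected as the first argument.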
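To make the distinction concrete, here is a minimal plain-Python sketch (no TensorFlow involved) of the difference between a bare function name and the value that function returns. The body of `length` below is a hypothetical stand-in, not the function from the question; it only illustrates why the error message mentions `<function length at ...>`.

```python
def length(sequence):
    """Hypothetical stand-in: count the non-zero entries of a padded list."""
    return sum(1 for value in sequence if value != 0)


padded = [3, 1, 4, 0, 0]

# A bare name refers to the function object itself, not to a number.
not_called = length

# Calling the function produces the value a consumer such as one_hot needs.
called = length(padded)

print(type(not_called).__name__)  # the type of a function object
print(called)                     # an integer count
```

Inside last_relevant in the question, the name `length` is used the first way, so the function object itself is what reaches tf.one_hot.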
As a rough guide, you can pass the indices as the first argument (in place of my placeholder empty list) and that will fix it:
relevant = tf.reduce_sum(
    tf.mul(output, tf.expand_dims(tf.one_hot([], max_length), -1)), 1)
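For illustration only, the same one-hot selection can be sketched in NumPy, which makes it easy to check what the indices should be. This is not the TensorFlow code from the question: `sequence_lengths` and the two-argument `last_relevant` are hypothetical names, and the sketch assumes every sequence has at least one non-zero frame. It also uses `lengths - 1` as the index, on the assumption that the goal is the last real time step (index `lengths` would point one step past it).

```python
import numpy as np


def sequence_lengths(batch):
    """Count the non-zero frames of each sequence in a zero-padded batch.

    batch has shape [batch, max_length, features].
    """
    used = np.sign(np.max(np.abs(batch), axis=2))  # 1.0 where a frame is non-zero
    return used.sum(axis=1).astype(np.int32)


def last_relevant(output, lengths):
    """Pick the output at the last non-padded step using a one-hot mask."""
    max_length = output.shape[1]
    # Row k of the identity matrix is the one-hot vector for index k.
    one_hot = np.eye(max_length, dtype=output.dtype)[lengths - 1]
    # Broadcast the mask over the feature axis, then sum out the time axis.
    return (output * one_hot[:, :, None]).sum(axis=1)


# Two padded sequences of real lengths 3 and 2, one feature each.
batch = np.zeros((2, 4, 1), dtype=np.float32)
batch[0, :3, 0] = [0.5, -0.2, 0.1]
batch[1, :2, 0] = [0.3, 0.4]

# Stand-in for the RNN output: [batch, max_length, num_hidden].
output = np.arange(2 * 4 * 3, dtype=np.float32).reshape(2, 4, 3)

lengths = sequence_lengths(batch)
last = last_relevant(output, lengths)
print(lengths)
print(last)
```

Note that counting non-zero frames treats any genuinely silent (all-zero) sample as padding, so with real audio it may be safer to record the true lengths when the data is loaded.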