Question

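I am currently using TensorFlow and trying to implement a neural network for regression, that is, mapping some input to an output. In my case the input is audio files that have been sampled and split into frames, and the output each frame must be mapped to is the set of MFCC features for that frame.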

The input is currently stored as follows.

#One audio set
[array([[frame],[frame],...,[frame]],dtype=float32)]

The output is currently stored as follows.

[array([[Feature1,  Feature2,  Feature3,
         Feature4,  Feature5,  Feature6,
         Feature7,  Feature8,  Feature9,
         Feature10,   Feature11,   Feature12,
         Feature13],....,[...]])]

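The model I am currently trying to feed this data into is a simple linear model. Since neither the input nor the output is a one-dimensional dataset, I somehow have to make the model handle vector-sized inputs and outputs.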

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")


# Construct a linear model
pred = tf.add(tf.mul(X, W), b)

Attempted solution

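One solution is to flatten the input and the output and to exploit the fact that every frame has the same length and every feature vector has the same length. The weight W then becomes a matrix of size [frame_length, feature_length], and the bias is changed to a vector of length feature_length.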
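The shapes this implies can be sketched in NumPy. This is only an illustration of the intended matrix sizes, not the TensorFlow model itself; `num_frames` and `frame_length` are made-up values, since the real frame length comes from the preprocessing and is not stated here.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
num_frames = 4        # frames in one audio file (assumed)
frame_length = 400    # samples per frame (assumed)
feature_length = 13   # MFCC features per frame, as in the output layout above

rng = np.random.RandomState(0)
frames = rng.randn(num_frames, frame_length).astype(np.float32)

# One weight per (sample, feature) pair and one bias per feature.
W = rng.randn(frame_length, feature_length).astype(np.float32)
b = np.zeros(feature_length, dtype=np.float32)

# Each row of `frames` is mapped to one 13-element feature vector.
pred = frames @ W + b
print(pred.shape)  # (4, 13)
```

With this layout the bias broadcasts across the rows, so one bias vector of length feature_length serves every frame.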

This is my attempt at doing so.

############################### Training setup ##################################
# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50

# tf Graph Input
X = tf.placeholder(tf.float32, [None])
Y = tf.placeholder(tf.float32, [None])

X_flatten = tf.reshape(X,[1,-1])
Y_flatten = tf.reshape(Y,[1,-1])

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
W = tf.get_variable(name="W", shape=[train_set_data[0].shape[0],train_set_output[0].shape[0]])

b = tf.Variable(rng.randn(), name="bias")
b = tf.get_variable(name="b",shape=[1,train_set_output[0].shape[0]])

# Construct a linear model
pred = tf.add(tf.matmul(X, W), b)

# Mean squared error
cost = tf.nn.softmax(pred)

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_set_data, train_set_output):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_set_data, Y:train_set_output})
            print "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b)

    print "Optimization Finished!"
    training_cost = sess.run(cost, feed_dict={X: train_set_data, Y: train_set_output})
    print "Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n'

    #Graphic display
    plt.plot(train_set_data, train_set_output, 'ro', label='Original data')
    plt.plot(train_set_data, sess.run(W) * train_set_data + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
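
The comment in the listing above labels the cost as a mean squared error, while the line under it calls tf.nn.softmax. For reference, a minimal NumPy sketch of the quantity the comment names; the function name and the numbers are made up for illustration and are not taken from the script.

```python
import numpy as np

def mean_squared_error(pred, target):
    """Average of the squared element-wise differences."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return np.mean((pred - target) ** 2)

# Made-up numbers, only to show the computation.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 0.0], [3.0, 2.0]])
print(mean_squared_error(pred, target))  # 2.0
```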

The problem is that I get an error message that I am not sure I understand.

Traceback (most recent call last):
  File "tensorflow_datapreprocess_mfcc_extraction_rnn.py", line 177, in <module>
    pred = tf.add(tf.matmul(X, W), b)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1036, in matmul
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 911, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 655, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2156, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1612, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/common_shapes.py", line 81, in matmul_shape
    a_shape = op.inputs[0].get_shape().with_rank(2)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 625, in with_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (?,) must have rank 2
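
For reference, the shape in the last line, (?,), is the rank-1 shape of a placeholder declared with [None], and the check that fails is a rank-2 requirement on the first argument of tf.matmul. The listing builds X_flatten with tf.reshape(X, [1, -1]) but passes X, not X_flatten, to tf.matmul, which may be what the shape check is objecting to. The difference between the two ranks can be sketched in NumPy; the frame length of 400 is an assumption.

```python
import numpy as np

# Hypothetical frame of 400 samples; the real length is not given above.
frame = np.zeros(400, dtype=np.float32)
print(frame.shape, frame.ndim)   # (400,) 1

# Reshaping to a single row gives the rank-2 layout a matrix product expects.
row = frame.reshape(1, -1)
print(row.shape, row.ndim)       # (1, 400) 2
```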

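Could someone explain why I am getting this error, or suggest a different solution to the problem? I do not like this solution, because it changes the original input/output structure instead of using the structure produced by the earlier preprocessing.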

How should I feed this input to my regression network in TensorFlow?
