Problem with my RBF network in TensorFlow?

Time: 2017-08-24 05:49:46

Tags: python tensorflow

I'm working on an RBF network with TensorFlow, but at line 112 I get this error: ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)'

My code is below. I followed this tutorial to create my own activation function for my RBF network. Also, if there is anything else that needs fixing, please point it out, because I'm very new to TensorFlow and any feedback I can get would be helpful.

import tensorflow as tf
import numpy as np
import math
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.python.framework import ops
ops.reset_default_graph()

RANDOM_SEED = 42
tf.set_random_seed(RANDOM_SEED)

boston = datasets.load_boston()

data = boston["data"]
target = boston["target"]

N_INSTANCES = data.shape[0]
N_INPUT = data.shape[1] - 1
N_CLASSES = 3
TEST_SIZE = 0.1
TRAIN_SIZE = int(N_INSTANCES * (1 - TEST_SIZE))
batch_size = 40
training_epochs = 400
learning_rate = 0.001
display_step = 20
hidden_size = 200

target_ = np.zeros((N_INSTANCES, N_CLASSES))

data_train, data_test, target_train, target_test = train_test_split(data, target_, test_size=0.1, random_state=100)

x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, N_CLASSES], dtype=tf.float32)

# creates activation function
def gaussian_function(input_layer):
    initial = math.exp(-2*math.pow(input_layer, 2))
    return initial

np_gaussian_function = np.vectorize(gaussian_function)

def d_gaussian_function(input_layer):
    initial = -4 * input_layer * math.exp(-2*math.pow(input_layer, 2))
    return initial

np_d_gaussian_function = np.vectorize(d_gaussian_function)

np_d_gaussian_function_32 = lambda input_layer: np_d_gaussian_function(input_layer).astype(np.float32)

def tf_d_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "d_gaussian_function", [input_layer]) as name:
        y = tf.py_func(np_d_gaussian_function_32, [input_layer],[tf.float32], name=name, stateful=False)
    return y[0]

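# Gradient-registration helper: py_func wraps tf.py_func and, through
# gradient_override_map, attaches the derivative defined above, so the
# numpy-based activation remains differentiable during backpropagation.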
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    rnd_name = 'PyFunGrad' + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def gaussian_function_grad(op, grad):
    input_variable = op.inputs[0]
    n_gr = tf_d_gaussian_function(input_variable)
    return grad * n_gr

np_gaussian_function_32 = lambda input_layer: np_gaussian_function(input_layer).astype(np.float32)

def tf_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "gaussian_function", [input_layer]) as name:
        y = py_func(np_gaussian_function_32, [input_layer], [tf.float32], name=name, grad=gaussian_function_grad)
    return y[0]
# end of defining activation function

def rbf_network(input_layer, weights):
    layer1 = tf.matmul(tf_gaussian_function(input_layer), weights['h1'])
    layer2 = tf.matmul(tf_gaussian_function(layer1), weights['h2'])
    output = tf.matmul(tf_gaussian_function(layer2), weights['output'])
    return output

weights = {
    'h1': tf.Variable(tf.random_normal([N_INPUT, hidden_size], stddev=0.1)),
    'h2': tf.Variable(tf.random_normal([hidden_size, hidden_size], stddev=0.1)),
    'output': tf.Variable(tf.random_normal([hidden_size, N_CLASSES], stddev=0.1))
}

pred = rbf_network(x_data, weights)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)

# Training loop
for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]

        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch

        if epoch % display_step == 0:
            print("Epoch: %03d/%03d cost: %.9f" % (epoch, training_epochs, avg_cost))
            train_accuracy = sess.run(accuracy, feed_dict={x_data: batch_xs, y_target: batch_ys})
            print("Training accuracy: %.3f" % train_accuracy)

test_acc = sess.run(accuracy, feed_dict={x_data: data_test, y_target: target_test})
print("Test accuracy: %.3f" % (test_acc))

sess.close()

3 Answers:

Answer 0 (score: 3)

As mentioned before, you should have N_INPUT = data.shape[1].

Indeed, data.shape[0] is the number of instances in your dataset, while data.shape[1] tells us how many features the network should consider.

By definition, the number of features is the size of the input layer, regardless of how much data you feed to the network (through feed_dict).
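To illustrate (a minimal sketch of the fix, not part of the original code):

N_INPUT = data.shape[1]  # 13 features in the Boston dataset
x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)  # now accepts the (40, 13) batches

With that change, the placeholder's second dimension matches the 13 columns of the batches being fed.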

Plus, the Boston dataset is a regression problem, while softmax_cross_entropy is a cost function for classification problems. You can try tf.square to evaluate the Euclidean distance between what you predict and what you want:

cost = tf.reduce_mean(tf.square(pred - y_target))

You will see that your network is learning, even though the accuracy is not very high.

Edit:

Your code is actually learning, but you are using the wrong tool to measure it.

Mainly, your error lies in the fact that you are dealing with a regression problem, not a classification problem.

In a classification problem, you can evaluate the accuracy of the ongoing learning process using
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

It consists of checking whether the predicted class is the same as the expected one, for the inputs in x_test.

In a regression problem, doing so is meaningless, since you are looking for a real number, i.e. an infinity of possibilities from a classification point of view.

In a regression problem, you estimate the error (mean or whatever) between the predicted and expected values. We can use what I suggested below:

cost = tf.reduce_mean(tf.square(pred - y_target))

I have modified your code so that it reads:

import matplotlib.pyplot as plt  # needed for the error plot below

pred = rbf_network(x_data, weights)

cost = tf.reduce_mean(tf.square(pred - y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

#correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
#accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)

plt.figure("Error evolution")
plt.xlabel("N_epoch")
plt.ylabel("Error evolution")
tol = 5e-4
epoch, err = 0, 1
# Training loop
while epoch <= training_epochs and err >= tol:
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]

        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch
    plt.plot(epoch, avg_cost, marker='o', linestyle="none", c='k')
    plt.pause(0.05)
    err = avg_cost
    if epoch % 10 == 0:
        print("Epoch: {}/{} err = {}".format(epoch, training_epochs, avg_cost))

    epoch += 1

print ("End of learning process")
print ("Final epoch = {}/{} ".format(epoch, training_epochs))
print ("Final error = {}".format(err) )
sess.close()

Output:

Epoch: 0/400 err = 0.107879924503
Epoch: 10/400 err = 0.00520248359747
Epoch: 20/400 err = 0.000651647908274
End of learning process

Final epoch = 26/400 
Final error = 0.000474644409471

We plot the evolution of the training error over the different epochs:

[Figure: "Error evolution", avg_cost plotted against N_epoch]

Answer 1 (score: 2)

I'm also new to TensorFlow, and this is my first answer on Stack Overflow. I tried your code and got the same error.

As you can see in the error message ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)', there is a shape mismatch in the first placeholder:

x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)

So I'm not sure why N_INPUT has a - 1 in this line:

N_INPUT = data.shape[1] - 1

I tried removing it and running the code, although the network does not seem to be learning.

Answer 2 (score: 1)

While this implementation will do the job, I don't think it is the most optimal RBF implementation. You are using a fixed size of 200 centroids (hidden units) in your RBF. This causes the centroids not to be optimally placed and the width of the Gaussian basis functions not to be optimally sized. Typically, the centroids should be learned in an unsupervised pre-training stage, using K-Means or any other kind of clustering algorithm.

So your first training stage would involve finding the centroids/centers of the RBFs, and the second stage would be the actual classification/regression using the RBF network.
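To illustrate that two-stage idea, here is a minimal sketch (my own, not the answerer's code), assuming scikit-learn's KMeans for the unsupervised stage and a common heuristic for the Gaussian width; the names centers, sigma and rbf_layer are mine:

import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

# Stage 1: place the centroids with unsupervised clustering.
kmeans = KMeans(n_clusters=hidden_size, random_state=RANDOM_SEED).fit(data_train)
centers = kmeans.cluster_centers_.astype(np.float32)  # shape (hidden_size, N_INPUT)

# Heuristic width: scale the Gaussians by the maximum distance
# between centers, a common rule of thumb for RBF networks.
d_max = max(np.linalg.norm(c1 - c2) for c1 in centers for c2 in centers)
sigma = d_max / np.sqrt(2.0 * hidden_size)

def rbf_layer(x):
    # Squared Euclidean distance from every input to every center.
    diff = tf.expand_dims(x, 1) - tf.constant(centers)  # (batch, hidden_size, N_INPUT)
    dist_sq = tf.reduce_sum(tf.square(diff), axis=2)    # (batch, hidden_size)
    return tf.exp(-dist_sq / (2.0 * sigma ** 2))

# Stage 2: train only the output weights on top of the fixed RBF features.
pred = tf.matmul(rbf_layer(x_data), weights['output'])
cost = tf.reduce_mean(tf.square(pred - y_target))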