tensorflow, mini-batch, tf.placeholder - reading the state of a node at a given iteration

Time: 2018-01-13 20:17:29

Tags: python tensorflow mini-batch

I want to print the value of the MSE at each epoch/batch combination. The code below reports the Tensor object that represents mse, not its value at each iteration:

print("Epoch", epoch, "Batch_Index", batch_index, "MSE:", mse)

Sample output:

Epoch 0 Batch_Index 0 MSE: Tensor("mse_2:0", shape=(), dtype=float32)

I understand this happens because mse refers to tf.placeholder nodes, which hold no data by themselves. However, once I run the following:

sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

the data should be available, so the values of all nodes that depend on that data should be accessible as well. Yet asking for MSE to be evaluated inside the print statement leads to an error:

print("Epoch", epoch, "Batch_Index", batch_index, "MSE:", mse.eval())

Output 2:

InvalidArgumentError: You must feed a value for placeholder tensor 'X_2' with dtype float and shape [?,9] ...

This tells me that mse.eval() does not see the data supplied in the sess.run() call.
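
For context (a minimal sketch, my addition rather than code from the post): each Session.run()/Tensor.eval() call is an independent execution of the graph, and placeholder feeds are not retained between calls, so the value has to be fetched with the same feed_dict:

# Fetching mse alongside the training op reuses the same feed in one call.
_, mse_val = sess.run([training_op, mse], feed_dict={X: X_batch, y: y_batch})
print("Epoch", epoch, "Batch_Index", batch_index, "MSE:", mse_val)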

Why do we see this behavior, and how should the code be changed so that it reports the MSE at each specified iteration?

import numpy as np
import tensorflow as tf
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data] # ADD COLUMN OF 1s for BIAS!

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")

y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")

learning_rate = 0.01  # must be defined before the optimizer that uses it
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()

n_epochs = 100
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices] # not shown
    y_batch = housing.target.reshape(-1, 1)[indices] # not shown
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
            if (epoch % 50 == 0 and batch_index % 100 == 0):
                print("Epoch", epoch, "Batch_Index", batch_index, "MSE:", mse)
    best_theta = theta.eval()

best_theta

2 Answers:

Answer 0 (score: 0):

First of all, I think this kind of debugging and printing is much easier with eager execution enabled in TensorFlow.
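
For illustration, a minimal sketch (assuming a TensorFlow 1.x build with eager support; in the earliest 1.x versions the call lived under tf.contrib.eager):

import tensorflow as tf
tf.enable_eager_execution()  # must be called once, at program startup

a = tf.constant([1.0, 2.0])
print(a)  # prints the tensor's actual values, not just its name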

Without eager execution, "print" in TensorFlow will never print the dynamic value of a tensor; it only prints the tensor's name, which is rarely what you want. Instead, use tf.Print to inspect a tensor's runtime value (by writing tensor = tf.Print(tensor, [tensor]); note that the tf.Print node does not execute unless its output is used somewhere).
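
Applied to the graph in the question, a minimal sketch (my assumption, not code from the original answer):

# Route mse through tf.Print; its value is printed whenever a fetched op
# depends on the returned tensor.
mse = tf.Print(mse, [mse], message="MSE: ")
training_op = optimizer.minimize(mse)  # keeps the print node on an executed path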

Answer 1 (score: 0):

I got it to work by changing the print statement to the following:

print("Epoch", epoch, "Batch_Index", batch_index, "MSE:", mse.eval(feed_dict={X: scaled_housing_data_plus_bias, y: housing_target}))

Additionally, by feeding the full dataset (rather than a batch), I can check how well the current batch-trained model generalizes to the whole sample. As training matures, it should be easy to extend this to evaluation on test and holdout samples, as sketched below.
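
A hedged sketch of that extension (the split and the X_test/y_test names are my assumptions, not part of the original answer):

from sklearn.model_selection import train_test_split

# Hold out 20% of the data; evaluate the same mse node on the held-out set.
X_train, X_test, y_train, y_test = train_test_split(
    scaled_housing_data_plus_bias, housing.target.reshape(-1, 1),
    test_size=0.2, random_state=42)

test_mse = mse.eval(feed_dict={X: X_test, y: y_test})
print("Epoch", epoch, "Test MSE:", test_mse)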

My one concern is that this on-the-fly evaluation (even on just a batch) may hurt training performance. I will do further testing.
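
One way to bound that cost (a sketch, my suggestion rather than the answerer's): run the full-dataset evaluation only once per epoch, after the last batch:

if batch_index == n_batches - 1:  # evaluate once per epoch
    full_mse = mse.eval(feed_dict={X: scaled_housing_data_plus_bias,
                                   y: housing.target.reshape(-1, 1)})
    print("Epoch", epoch, "full-data MSE:", full_mse)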