Question

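I am very new to TensorFlow and am trying to understand the concept of placeholders. Suppose I have a feature set of shape 100x4, that is, 100 rows with 4 different features each. The targets have shape 100x1. I want to use the two matrices as a training set. This is what I did: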

X = tf.placeholder(tf.float64, shape=X_train.shape)
Y = tf.placeholder(tf.float64, shape=y_train.shape)

W = tf.Variable(tf.random_normal([4, 1]), name="weight",dtype=tf.float32)
b = tf.Variable(rng.randn(), name="bias",dtype=tf.float32)

pred = tf.add(tf.multiply(X, W), b)

cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

init = tf.global_variables_initializer()

with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(X_train, y_train):
            sess.run(optimizer, feed_dict={X: x, Y: y})
            ... # some plotting and printing of results

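This results in "ValueError: Cannot feed value of shape (...,) for Tensor 'Placeholder:0', which has shape '(..., ...)'". More specifically, the dimensions of "sub" in the cost function are not equal.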

Can someone explain how to proceed and why? Thanks in advance.

Answer 1

Understanding the concept of placeholders

A placeholder reserves a spot for actual data that will be supplied later:

x = tf.placeholder(tf.float32, shape=X_train.shape)
logits = nn(x)  # making some operations with x in order to calculate logits

s = tf.Session()
logits = s.run(logits, feed_dict={x: X_train})

Because logits was built from a placeholder, we have to feed actual data in place of the placeholder in order to compute it.

"ValueError: Cannot feed value of shape (...,) for Tensor 'Placeholder:0', which has shape '(..., ...)'"

In a feed such as feed_dict={x: X_train}, your placeholder x has rank 2 while the value being fed has rank 1. It is best to double-check the shape of your data.
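
The rank mismatch can be reproduced without TensorFlow. The sketch below uses NumPy with randomly generated stand-ins for X_train and y_train (the real data is not shown in the question) to show that iterating over the arrays with zip, as the question's loop does, yields rank-1 rows. Restoring the leading axis makes each row rank 2 again, although a placeholder declared with shape (100, 4) would still reject a (1, 4) value; declaring it with shape [None, 4] leaves the number of rows open.

```python
import numpy as np

# Hypothetical stand-ins for the question's data: 100 samples, 4 features.
X_train = np.random.rand(100, 4)
y_train = np.random.rand(100, 1)

# Iterating with zip yields one row at a time, so each x has rank 1.
x, y = next(zip(X_train, y_train))
print(X_train.shape, x.shape)    # (100, 4) (4,)

# Restoring the leading sample axis gives rank-2 values again.
x_row = x.reshape(1, -1)
y_row = y.reshape(1, -1)
print(x_row.shape, y_row.shape)  # (1, 4) (1, 1)
```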

Answer 2

If you want to train on your data in batches, you should use placeholders.

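Why?
You do this when you have a large dataset, for example when you want to train a classifier for an image classification problem but cannot load all the training images into memory. Instead, you train your model with mini-batch gradient descent. With this technique only one batch of images is loaded at a time, and backpropagation is performed on that batch alone. This can take more epochs to converge to a minimum, but each training step is faster.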

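How?
You first define one placeholder for the training examples X and one for their labels Y, with shapes (batch_size, 4) and (batch_size, 1) respectively in your case.
Then, when you want to train the model, you supply the data to the placeholders through the feed dictionary: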

with tf.Session() as sess:
    sess.run(train_op, feed_dict={X:x_batch, Y:y_batch}) # train_op is the operation that minimizes your cost function

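Here x_batch and y_batch should be random batches drawn from the X_train and Y_train arrays, containing batch_size examples rather than all 100 (so that their dimensions match the placeholders' dimensions).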
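
One way to draw such batches is sketched below in NumPy. The helper name sample_batch, the batch size of 10, and the randomly generated arrays are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def sample_batch(X, Y, batch_size, rng):
    """Return batch_size random rows of X together with the matching rows of Y."""
    idx = rng.choice(X.shape[0], size=batch_size, replace=False)
    return X[idx], Y[idx]

rng = np.random.default_rng(0)
X_train = rng.random((100, 4))   # hypothetical features
Y_train = rng.random((100, 1))   # hypothetical targets

x_batch, y_batch = sample_batch(X_train, Y_train, batch_size=10, rng=rng)
print(x_batch.shape, y_batch.shape)  # (10, 4) (10, 1)
```

Indexing both arrays with the same idx keeps every example paired with its own label, and the resulting shapes match placeholders declared as (batch_size, 4) and (batch_size, 1).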

Why shouldn't you do this in your case?
Since you have a small dataset that is already loaded into memory, you can use regular gradient descent.

How?
Simply use variables (tf.Variable()) instead of placeholders.

X = tf.Variable(X_train)
Y = tf.Variable(Y_train)

This creates two variable-type tensors which, when initialized, take the shape and values of X_train and Y_train respectively.

Don't forget to initialize them in the session:

with tf.Session() as sess:
     sess.run(tf.global_variables_initializer()) # initialize variables
     sess.run(train_op) # no need for a feed_dict
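
For reference, here is a NumPy sketch of what regular (full-batch) gradient descent computes for a linear model with the cost from the question. It assumes the prediction is the matrix product X·W + b (tf.matmul in TensorFlow, rather than the elementwise tf.multiply), so that the prediction has the same (100, 1) shape as the targets. The data, learning rate, and epoch count are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100
X_train = rng.random((n_samples, 4))               # hypothetical features
true_W = np.array([[1.0], [-2.0], [0.5], [3.0]])   # made-up weights
Y_train = X_train @ true_W + 0.25                  # made-up targets

W = np.zeros((4, 1))
b = 0.0
learning_rate = 0.1

def cost(W, b):
    # Same cost as in the question: sum of squared errors / (2 * n_samples).
    pred = X_train @ W + b
    return np.sum((pred - Y_train) ** 2) / (2 * n_samples)

initial_cost = cost(W, b)
for epoch in range(500):
    pred = X_train @ W + b                   # (100, 4) @ (4, 1) -> (100, 1)
    error = pred - Y_train                   # shapes match, so no "sub" mismatch
    grad_W = X_train.T @ error / n_samples   # gradient of the cost w.r.t. W
    grad_b = np.sum(error) / n_samples       # gradient of the cost w.r.t. b
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b
final_cost = cost(W, b)

print(pred.shape, final_cost < initial_cost)  # (100, 1) True
```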

TensorFlow error about placeholder shape
