Question

I have a neural network with an MSE loss function, implemented as follows:

# input x_ph is of size Nx1 and output should also be of size Nx1
def train_neural_network_batch(x_ph, predict=False):
    prediction = neural_network_model(x_ph)

    # MSE loss function
    cost = tf.reduce_mean(tf.square(prediction - y_ph))

    optimizer = tf.train.AdamOptimizer(learn_rate).minimize(cost)

    # mini-batch optimization here

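I'm still quite new to neural networks and Python, but I understand that on each iteration a sample of training points is fed into the network and the loss function is evaluated at the points in that sample. However, I'd like to be able to modify the loss function so that it weights certain data more heavily. Pseudocode for what I mean: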

# manually compute the MSE of the data without the first sampled element
cost = 0.0
for ii in range(1,len(y_ph)):
    cost += tf.square(prediction[ii] - y_ph[ii])

cost = cost/(len(y_ph)-1.0)

# weight the first sampled data point more heavily according to some parameter W
cost += W*(prediction[0] - y_ph[0])

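I may have more points to weight, but for now I just want to know how to implement something like this in TensorFlow. I know that len(y_ph) is invalid because y_ph is only a placeholder, and I can't simply do y_ph[i] or prediction[i].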

Answer 1

You can do this in several ways:

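1) If some data instances only need 2 or 3 times the weight of normal instances, you can simply copy those instances into the dataset multiple times. They then contribute more to the loss, which achieves what you want. This is the simplest approach.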

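2) If your weighting is more complex, say a floating-point weight per instance, you can define a placeholder for the weights, multiply it by the per-example loss, and feed the weights in the session via feed_dict together with the x batch and y batch. Just make sure instance_weight has the same number of rows as the batch (batch_size).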

For example:

import tensorflow as tf
import numpy as np

with tf.variable_scope("test", reuse=tf.AUTO_REUSE):
  x = tf.placeholder(tf.float32, [None,1])
  y = tf.placeholder(tf.float32, [None,1])
  instance_weight = tf.placeholder(tf.float32, [None,1])
  w1 = tf.get_variable("w1", shape=[1, 1])
  prediction = tf.matmul(x, w1)

  cost = tf.square(prediction - y)
  loss = tf.reduce_mean(instance_weight * cost)
  opt = tf.train.AdamOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
  x1 = [[1.],[2.],[3.]]
  y1 = [[2.],[4.],[3.]]
  instance_weight1 = [[10.0], [10.0], [0.1]]
  sess.run(tf.global_variables_initializer())
  print (x1)
  print (y1)
  print (instance_weight1)
  for i in range(1000):
    _, loss1, prediction1 = sess.run([opt, loss, prediction], feed_dict={instance_weight : instance_weight1, x : x1, y : y1 }) 
    if (i % 100) == 0:
      print(loss1)
      print(prediction1)

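Note instance_weight1: you can change it to see the difference (here batch_size is 3).

The first two data points, (1, 2) and (2, 4), follow the rule y = 2 * x.

The third data point, (3, 3), follows the rule y = x.

With the weights [10, 10, 0.1], however, the prediction fits the rule followed by the first two points and almost ignores the third. The output is: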

[[1.9823183]
 [3.9646366]
 [5.9469547]]
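As a sanity check on those numbers, the one-parameter model above has a closed-form minimizer for the weighted loss. The following is a minimal NumPy sketch, not part of the original answer; the names c, w_star and w_plain are introduced here for illustration only.

```python
import numpy as np

# Same toy data and instance weights as in the TensorFlow example above
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 3.0])
c = np.array([10.0, 10.0, 0.1])

# Setting d/dw of mean(c * (w*x - y)**2) to zero gives
# w = sum(c*x*y) / sum(c*x*x)
w_star = np.sum(c * x * y) / np.sum(c * x * x)

# For comparison: the unweighted least-squares slope
w_plain = np.sum(x * y) / np.sum(x * x)

prediction = w_star * x
print(w_star, w_plain)
print(prediction)
```

The weighted slope comes out at about 1.9823, which agrees with the predictions printed by the training loop, while the unweighted slope is about 1.357. So the optimizer has essentially converged to the weighted optimum, and the low-weight third point barely moves the fit.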

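PS: In a TensorFlow graph, it is strongly recommended to avoid loops and use matrix operations instead, so the computation can run in parallel.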
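To illustrate the point about avoiding loops, here is a rough NumPy sketch of the loss from the question's pseudocode written both ways. It assumes the first point's term is meant to be a squared error (the pseudocode omits the square), and the function names loop_loss and vectorized_loss are made up for this example.

```python
import numpy as np

def loop_loss(prediction, target, W):
    # Direct transcription of the pseudocode: mean squared error over
    # points 1..N-1, plus W times the squared error of point 0
    n = len(target)
    cost = 0.0
    for i in range(1, n):
        cost += (prediction[i, 0] - target[i, 0]) ** 2
    cost /= (n - 1.0)
    cost += W * (prediction[0, 0] - target[0, 0]) ** 2
    return cost

def vectorized_loss(prediction, target, W):
    # Same loss as one weighted sum: weight W for point 0,
    # weight 1/(N-1) for every other point
    n = target.shape[0]
    weights = np.full((n, 1), 1.0 / (n - 1))
    weights[0, 0] = W
    return float(np.sum(weights * (prediction - target) ** 2))

prediction = np.array([[1.5], [3.0], [4.5]])
target = np.array([[2.0], [4.0], [3.0]])
print(loop_loss(prediction, target, 5.0))
print(vectorized_loss(prediction, target, 5.0))
```

Both functions return the same value. In the TensorFlow graph, the same weight vector could be fed through the instance_weight placeholder, with tf.reduce_sum in place of tf.reduce_mean, so no indexing into placeholders is needed.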
