I currently have a trained DNN that predicts a one-hot-encoded classification of the state a game is in. Basically, say there are three states: 0, 1, or 2.
Now, I would normally use categorical_cross_entropy as the loss function, but I realized that not all classifications are equal for my states.
I know we can declare custom loss functions in Keras, but I'm stuck on how to formulate one. Does anyone have a suggestion on how to translate the pseudocode below? I don't know how to express it with vector operations.
Bonus question: I think what I'm after is essentially a reward function. Is that the same thing as a loss function? Thanks!
For example:
def custom_expectancy(y_expected, y_pred):
    # Get 0, 1 or 2
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)
    # Some pseudo code....
    # Now, if predicted == 1
    #    loss += 0
    # elif predicted == expected
    #    loss -= 3
    # elif predicted != expected
    #    loss += 1
    #
    # return loss
Reference:
Custom loss in Keras with softmax to one-hot
Code update
import tensorflow as tf

def custom_expectancy(y_expected, y_pred):
    # Get 0, 1 or 2
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)
    results = tf.unstack(expected_norm)
    # Some pseudo code....
    # Now, if predicted == 1
    #    loss += 0
    # elif predicted == expected
    #    loss += 3
    # elif predicted != expected
    #    loss -= 1
    for idx in range(0, len(expected_norm)):
        predicted = predicted_norm[idx]
        expected = expected_norm[idx]
        if predicted == 1:  # do nothing
            results[idx] = 0.0
        elif predicted == expected:  # reward
            results[idx] = 3.0
        else:  # wrong, so we lost
            results[idx] = -1.0
    return tf.stack(results)
I think this is what I'm after, but I haven't quite figured out how to build the correct tensor (it should be batch-sized) to return.
Answer 0 (score: 1)
The best way to build a conditional custom loss is to use tf.keras.backend.switch without involving loops.
In your case, you should combine two switch conditional expressions to obtain the desired result.
The desired loss function can be reproduced in this way:

def custom_expectancy(y_expected, y_pred):
    zeros = tf.cast(tf.reduce_sum(y_pred * 0, axis=-1), tf.float32)  ### important to produce a gradient
    y_expected = tf.cast(tf.reshape(y_expected, (-1,)), tf.float32)
    class_pred = tf.argmax(y_pred, axis=-1)
    class_pred = tf.cast(class_pred, tf.float32)
    cond1 = (class_pred != y_expected) & (class_pred != 1)  # wrong prediction of state 0 or 2
    cond2 = (class_pred == y_expected) & (class_pred != 1)  # correct prediction of state 0 or 2
    res1 = tf.keras.backend.switch(cond1, zeros - 1, zeros)
    res2 = tf.keras.backend.switch(cond2, zeros + 3, zeros)
    return res1 + res2
where cond1 is when the model wrongly predicts state 0 or 2, and cond2 is when the model correctly predicts state 0 or 2. The standard state is zero, which is returned when cond1 and cond2 are not activated.
You can notice that y_expected can be passed as a simple tensor/array of integer-encoded states (no need to one-hot them).
The loss function works in this way:

true = tf.constant([[1], [2], [1], [0]])  ## no need to one-hot
pred = tf.constant([[0,1,0],[0,0,1],[0,0,1],[0,1,0]])
custom_expectancy(true, pred)

which returns:

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0., 3., -1., 0.], dtype=float32)>

This seems to fit our needs. To use the loss in a model:
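A minimal sketch of that step, assuming a hypothetical toy model (the input shape, layer sizes, and the sign flip are illustrative assumptions, not part of the original answer):

import tensorflow as tf

# Hypothetical toy model: 8 input features, 3 output states.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])

# Keras minimizes the loss, and custom_expectancy follows the question's
# reward-style signs (+3 for a correct 0/2 prediction), so negate it if you
# want the optimizer to seek correct predictions.
model.compile(optimizer='adam',
              loss=lambda y_true, y_pred: -custom_expectancy(y_true, y_pred))

# Targets stay integer-encoded, e.g.:
# model.fit(x_train, tf.constant([[1], [2], [0]]), epochs=10)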
Here is the running notebook.
Answer 1 (score: 0)
Here there is a nice post explaining the concepts of the loss function and the cost function. Several answers illustrate how different authors in the machine learning field think of them.
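For instance, one common convention from that discussion (authors differ, so treat this as just one view) measures the loss L per example and defines the cost J as its average over the training set:

J(\theta) = \frac{1}{N} \sum_{i=1}^{N} L\big(y_i, \hat{y}_i\big)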
As for the loss function, you may find the following implementation useful. It implements a weighted cross-entropy loss, which lets you weight each class proportionally during training. It could be adapted to satisfy the constraints specified above; a rough sketch follows.
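A minimal sketch of that idea (this is my adaptation, not the linked code; the function name and the weight values are illustrative assumptions):

import tensorflow as tf

def weighted_categorical_crossentropy(weights):
    # weights: one scalar per class, e.g. [1.0, 0.5, 1.0] (illustrative values).
    w = tf.constant(weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        # Clip predictions away from 0 so log() stays finite, then scale
        # each class's cross-entropy contribution by its class weight.
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)
        return -tf.reduce_sum(w * y_true * tf.math.log(y_pred), axis=-1)
    return loss

# Usage, e.g.:
# model.compile(optimizer='adam',
#               loss=weighted_categorical_crossentropy([1.0, 0.5, 1.0]))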
Answer 2 (score: 0)
Here's a way to do what you want. If your ground truth y_true is dense (shaped N3), you can use tf.reduce_all(y_true == [0.0, 0.0, 1.0], axis=-1, keepdims=True) and tf.reduce_all(y_true == [1.0, 0.0, 0.0], axis=-1, keepdims=True) to drive the if/elif/else (a sketch of that dense variant follows the code below). You can further optimize this with tf.gather.
def sparse_loss(y_true, y_pred):
    """Calculate loss for game. Follows keras loss signature.

    Args:
      y_true: Sparse tensor of shape N1, where the correct prediction
        is encoded as 0, 1, or 2.
      y_pred: Tensor of shape N3. For each row, the three columns
        represent the predicted probability of each state.
        For example, [0.1, 0.3, 0.6] means, "There's a 10% chance the
        right state is 0, a 30% chance the right state is 1,
        and a 60% chance the right state is 2."
    """
    # This is the unvectorized implementation on individual rows, which is
    # more intuitive. But TF requires vectorization.
    # if y_true == 0:
    #     # Value matrix is shape 3. Broadcasting will occur.
    #     return -tf.reduce_sum(y_pred * [3.0, 0.0, -1.0])
    # elif y_true == 2:
    #     return -tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0])
    # else:
    #     # According to the rules, this is never the correct state to
    #     # predict, so it should never show up.
    #     assert False, f'Impossible state reached. y_true: {y_true}, y_pred: {y_pred}.'

    # We vectorize by calculating the reward for all predictions for two cases:
    # if y_true is zero or if y_true is two. To eliminate this inefficiency, we
    # could use tf.gather to build an N3-shaped matrix to multiply against.
    reward_for_true_zero = tf.reduce_sum(y_pred * [3.0, 0.0, -1.0], axis=-1, keepdims=True)  # N1
    reward_for_true_two = tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0], axis=-1, keepdims=True)  # N1
    reward = tf.where(y_true == 0.0, reward_for_true_zero, reward_for_true_two)  # N1
    return -tf.reduce_sum(reward)
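For completeness, here is a sketch of the dense variant mentioned before the code (my adaptation, assuming a one-hot y_true of shape N3; dense_loss is a hypothetical name, not part of the original answer):

import tensorflow as tf

def dense_loss(y_true, y_pred):
    """Dense-variant sketch: y_true is one-hot, shape N3."""
    # Boolean mask over rows: which rows' ground truth is state 0, i.e. [1, 0, 0].
    true_is_zero = tf.reduce_all(y_true == [1.0, 0.0, 0.0], axis=-1, keepdims=True)  # N1
    reward_for_true_zero = tf.reduce_sum(y_pred * [3.0, 0.0, -1.0], axis=-1, keepdims=True)  # N1
    reward_for_true_two = tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0], axis=-1, keepdims=True)  # N1
    reward = tf.where(true_is_zero, reward_for_true_zero, reward_for_true_two)  # N1
    return -tf.reduce_sum(reward)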