我正在尝试编写基于误报率和误报率的自定义损失函数。我编写了一个伪代码,因此您也可以检查前两个定义。我添加了其余的内容,因此您可以看到它是如何实现的。但是,仍然在某个地方梯度变为零。现在渐变变为零的步骤是什么,或者该如何检查?请我想知道如何解决这个问题:)。 我尝试为您提供更多信息,以便您也可以玩,但如果您错过了任何事情,请告诉我!
在每个步骤中,梯度保持为True。但是,仍然在模型训练期间,损失不会更新,因此NN不会训练。
y = Variable(torch.tensor((0, 0, 0, 1, 1,1), dtype=torch.float), requires_grad = True)
y_pred = Variable(torch.tensor((0.333, 0.2, 0.01, 0.99, 0.49, 0.51), dtype=torch.float), requires_grad = True)
x = Variable(torch.tensor((0, 0, 0, 1, 1,1), dtype=torch.float), requires_grad = True)
x_pred = Variable(torch.tensor((0.55, 0.25, 0.01, 0.99, 0.65, 0.51), dtype=torch.float), requires_grad = True)
def binary_y_pred(y_pred):
y_pred.register_hook(lambda grad: print(grad))
y_pred = y_pred+torch.tensor(0.5, requires_grad=True, dtype=torch.float)
y_pred = y_pred.pow(5) # this is my way working around using torch.where()
y_pred = y_pred.pow(10)
y_pred = y_pred.pow(15)
m = nn.Sigmoid()
y_pred = m(y_pred)
y_pred = y_pred-torch.tensor(0.5, requires_grad=True, dtype=torch.float)
y_pred = y_pred*2
y_pred.register_hook(lambda grad: print(grad))
return y_pred
def confusion_matrix(y_pred, y):
TP = torch.sum(y*y_pred)
TN = torch.sum((1-y)*(1-y_pred))
FP = torch.sum((1-y)*y_pred)
FN = torch.sum(y*(1-y_pred))
k_eps = torch.tensor(1e-12, requires_grad=True, dtype=torch.float)
FN_rate = FN/(TP + FN + k_eps)
FP_rate = FP/(TN + FP + k_eps)
return FN_rate, FP_rate
def dif_rate(FN_rate_y, FN_rate_x):
dif = (FN_rate_y - FN_rate_x).pow(2)
return dif
def custom_loss_function(y_pred, y, x_pred, x):
y_pred = binary_y_pred(y_pred)
FN_rate_y, FP_rate_y = confusion_matrix(y_pred, y)
x_pred= binary_y_pred(x_pred)
FN_rate_x, FP_rate_x = confusion_matrix(x_pred, x)
FN_dif = dif_rate(FN_rate_y, FN_rate_x)
FP_dif = dif_rate(FP_rate_y, FP_rate_x)
cost = FN_dif+FP_dif
return cost
# I added the rest so you can see how it is implemented, but this peace does not fully run well! If you want this part to run as well, I can add more code.
class FeedforwardNeuralNetModel(nn.Module):
def __init__(self, input_dim, hidden_dim, output_dim):
super(FeedforwardNeuralNetModel, self).__init__()
self.fc1 = nn.Linear(input_dim, hidden_dim)
self.relu1 = nn.ReLU()
self.fc2 = nn.Linear(hidden_dim, output_dim)
self.sigmoid = nn.Sigmoid()
def forward(self, x):
out = self.fc1(x)
out = self.relu1(out)
out = self.fc2(out)
out = self.sigmoid(out)
return out
model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001, betas=[0.9, 0.99], amsgrad=True)
criterion = torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')
for epoch in range(num_epochs):
train_err = 0
for i, (samples, truths) in enumerate(train_loader):
samples = Variable(samples)
truths = Variable(truths)
optimizer.zero_grad() # Reset gradients
outputs = model(samples) # Do the forward pass
loss2 = criterion(outputs, truths) # Calculate loss
samples_y = Variable(samples_y)
samples_x = Variable(samples_x)
y_pred = model(samples_y)
y = Variable(y, requires_grad=True)
x_pred = model(samples_x)
x= Variable(x, requires_grad=True)
cost = custom_loss_function(y_pred, y, x_pred, x)
loss = loss2*0+cost #checking only if cost works.
loss.backward()
optimizer.step()
train_err += loss.item()
train_loss.append(train_err)
我希望模型在训练期间会更新。没有错误消息。
答案 0 :(得分:1)
使用您的定义:TP+FN=y
和TN+FP=1-y
。然后您将得到FN_rate=1-y_pred
和FP_rate=y_pred
。这样,您的成本为FN_rate+FP_rate=1
,其梯度为0。
您可以手动检查此问题,也可以使用符号数学库(例如SymPy)进行检查:
from sympy import symbols
y, y_pred = symbols("y y_pred")
TP = y * y_pred
TN = (1-y)*(1-y_pred)
FP = (1-y)*y_pred
FN = y*(1-y_pred)
# let's ignore the eps for now
FN_rate = FN/(TP + FN)
FP_rate = FP/(TN + FP)
cost = FN_rate + FP_rate
from sympy import simplify
print(simplify(cost))
# output: 1