I am implementing a DNN, but the problem is that the loss is high and does not decrease over the iterations. The labels have shape torch.Size([1124823]) and the features torch.Size([1124823, 13]); X_train is torch.Size([719886, 13]), X_test torch.Size([224965, 13]), X_val torch.Size([179972, 13]), y_train torch.Size([719886]), y_test torch.Size([224965]), and y_val torch.Size([179972]).
My DataLoader setup is as follows:
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
import torch.utils.data as data_utils
from torch.autograd import Variable

X_train, X_test, y_train, y_test = train_test_split(feat, labels, test_size=0.2, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=1)
train = data_utils.TensorDataset(X_train, y_train)
train_loader = data_utils.DataLoader(train, batch_size=1000, shuffle=True)
test = data_utils.TensorDataset(X_test, y_test)
test_loader = data_utils.DataLoader(test, batch_size=1000, shuffle=False)
input_size = 13
hidden1_size = 13
hidden2_size = 64
hidden3_size = 128
hidden4_size = 256
hidden5_size = 1024
output_size = 3989
class DNN(nn.Module):
    def __init__(self, input_size, hidden1_size, hidden2_size, hidden3_size,
                 hidden4_size, hidden5_size, output_size):
        super(DNN, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden1_size)
        self.drp1 = nn.Dropout(p=0.2, inplace=False)
        self.relu1 = nn.ReLU()
        self.tan1 = nn.Tanh()
        self.fc2 = nn.Linear(hidden1_size, hidden2_size)
        self.drp2 = nn.Dropout(p=0.2, inplace=False)
        self.relu2 = nn.ReLU()
        self.tan2 = nn.Tanh()
        self.fc3 = nn.Linear(hidden2_size, hidden3_size)
        self.drp3 = nn.Dropout(p=0.2, inplace=False)
        self.relu3 = nn.ReLU()
        self.tan3 = nn.Tanh()
        self.fc4 = nn.Linear(hidden3_size, hidden4_size)
        self.drp4 = nn.Dropout(p=0.2, inplace=False)
        self.relu4 = nn.ReLU()
        self.tan4 = nn.Tanh()
        self.fc5 = nn.Linear(hidden4_size, hidden5_size)
        self.drp5 = nn.Dropout(p=0.2, inplace=False)
        self.relu5 = nn.ReLU()
        self.tan5 = nn.Tanh()
        self.fc6 = nn.Linear(hidden5_size, output_size)
        self.tan6 = nn.Tanh()

    def forward(self, x):
        out = self.fc1(x)
        out = self.drp1(out)
        out = self.relu1(out)
        out = self.tan1(out)
        out = self.fc2(out)
        out = self.drp2(out)
        out = self.relu2(out)
        out = self.tan2(out)
        out = self.fc3(out)
        out = self.drp3(out)
        out = self.relu3(out)
        out = self.tan3(out)
        out = self.fc4(out)
        out = self.drp4(out)
        out = self.relu4(out)
        out = self.tan4(out)
        out = self.fc5(out)
        out = self.drp5(out)
        out = self.relu5(out)
        out = self.tan5(out)
        out = self.fc6(out)
        out = self.tan6(out)
        return out
batch_size = 10
n_iterations = 50
no_eps = n_iterations / (13 / batch_size)
no_epochs = int(no_eps)
model = DNN(input_size, hidden1_size, hidden2_size, hidden3_size, hidden4_size, hidden5_size, output_size)
criterion = nn.CrossEntropyLoss()
learning_rate = 0.0001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
iter = 0
for epoch in range(no_epochs):
    for i, (X_train, y_train) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(Variable(X_train))
        loss = criterion(outputs, Variable(y_train))
        print('Iter %d --> loss %f' % (i, loss.item()))
        loss.backward()
        optimizer.step()

    correct = 0
    total = 0
    print('test')
    for X_test, y_test in test_loader:
        outputs = model(Variable(X_test))
        pred = outputs.argmax(dim=1, keepdim=True)
        total += y_test.size(0)
        correct += (pred.squeeze() == y_test).sum()  # pred.eq(y_test.view_as(pred)).sum().item()
    accuracy = 100 * correct / total
    print('Iteration: {}. Accuracy: {}'.format(epoch, accuracy))
Answer 0 (score: 0)
Your loss is nn.CrossEntropyLoss(). This criterion expects raw logits as input, but you are feeding it the output of nn.Tanh().
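For illustration, here is a minimal, self-contained sketch of how the output should reach the loss. The names logits_head, features, and targets are stand-ins (logits_head plays the role of fc6 in the question): a bare nn.Linear produces logits with no activation before nn.CrossEntropyLoss(), which applies log-softmax internally.
import torch
import torch.nn as nn

# Stand-in sizes matching fc6 in the question (1024 -> 3989 classes).
logits_head = nn.Linear(1024, 3989)
criterion = nn.CrossEntropyLoss()

features = torch.randn(4, 1024)         # pretend output of the previous hidden layer
targets = torch.randint(0, 3989, (4,))  # integer class labels
loss = criterion(logits_head(features), targets)  # raw logits, no Tanh in between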
I don't know how you load your data (it is not shown), but your code indicates you are using a DataLoader. In that case you don't need Variable: you can pass the tensors directly, as in model(X_train) and criterion(outputs, y_train).
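As a sketch, the inner training step then simplifies to the following (assuming the model, criterion, optimizer, and train_loader from the question; plain tensors suffice in PyTorch >= 0.4):
# Training step without Variable wrappers.
for X_batch, y_batch in train_loader:
    optimizer.zero_grad()
    outputs = model(X_batch)            # tensor in, tensor out
    loss = criterion(outputs, y_batch)  # targets passed as a plain tensor
    loss.backward()
    optimizer.step()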
Since you have a batch_size variable, you are training in batches, right? In that case, this line:
print('Iter %d --> loss %f' % (i, loss.item()))
only computes the loss of a single batch. It is more meaningful to track it per epoch: accumulate the loss over all batches and print the average, for example:
acc_loss = 0.0
for x, y in data_loader:
    outputs = model(x)
    loss = criterion(outputs, y)
    acc_loss += loss.item()          # accumulate per-batch loss
print("Loss:", acc_loss / len(data_loader))  # average over the epoch