I am new to PyTorch, and I am working on a simple project that generates text with PyTorch. I am taking the concepts from this Keras tutorial and converting them to PyTorch: https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/ I have 10 timesteps and 990 samples. Each of the 990 samples holds 10 values, which are the (scaled) integer indices of the characters in the sequence. The output for each sample is the same sequence shifted by one letter: for example, if my sample is "Hello Worl", the corresponding output is "ello World". Since I want to feed in one letter at a time, my input size (number of features) is 1, so my final input shape is (990, 10, 1). I then convert the output tensor to one-hot vectors, so its final shape is (9900, 42), where 42 is the number of elements in a one-hot vector. When I run the network I get an output of shape (9900, 42), so that is the output for all of my timesteps, each with a corresponding 42-element vector. But when I compute the loss I get this error:

multi-target not supported

Can anyone help me understand what I am doing wrong? Thanks. My code is below.
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable

#Assumes 'file' (the raw text) and 'unique' (its distinct characters) were defined earlier
#The original file contains 163780 characters
#The trimmed file used here contains 1000 characters
#There are 42 unique characters
char2int = {char:value for (value,char) in enumerate(unique)}
int2char = {value:char for (value,char) in enumerate(unique)}
learning_rate = 0.01
num_epochs = 5
input_size = 1 #The number of input neurons (features) to our RNN
units = 100
num_layers = 2
num_classes = len(unique) #The number of output neurons
timesteps = 10
datax = []
datay = []
for index in range(0, len(file) - timesteps, 1):
    prev_letters = file[index:index + timesteps]
    output = file[index + 1: index + timesteps + 1]
    #Convert the 10 previous characters to their integers and put them in a list. Append that list to the dataset
    datax.append([char2int[c] for c in prev_letters])
    datay.append([char2int[c] for c in output])
print('There are {} Sequences in the dataset'.format(len(datax)))
#There are 990 Sequences in the dataset
x = np.array(datax)
x = x / float(len(unique))
x = torch.FloatTensor(x)
x = x.view(x.size(0), timesteps, input_size)
print(x.shape) #torch.Size([990, 10, 1])
y = torch.LongTensor(datay)
print(y.shape) #torch.Size([990, 10])
y_one_hot = torch.zeros(y.shape[0] * y.shape[1], num_classes)
index = y.long()
index = index.view(-1,1) #The expected shape for the scatter function
y_one_hot.scatter_(1,index,1) #scatter_(dim, index, value): with dim=1, write the value 1 at the column given by index in each row
y_one_hot = y_one_hot.view(-1, num_classes) # Make the tensor of shape(rows, cols)
y_one_hot = y_one_hot.long()
print(y_one_hot.shape)
#torch.Size([9900, 42])
inputs = Variable(x)
labels = Variable(y_one_hot)
class TextGenerator(nn.Module):
    def __init__(self, input_size, units, num_layers, num_classes, timesteps):
        super(TextGenerator, self).__init__()
        self.units = units
        self.num_layers = num_layers
        self.timesteps = timesteps
        self.input_size = input_size
        # When batch_first=True, inputs are of shape (batch_size/samples, sequence_length, input_dimension)
        self.lstm = nn.LSTM(input_size = input_size, hidden_size = units, num_layers = num_layers, batch_first = True)
        #The output layer
        self.fc = nn.Linear(units, num_classes)

    def forward(self, x):
        #Initialize the hidden state
        h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        #Initialize the cell state
        c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        out, _ = self.lstm(x, (h0, c0))
        #Reshape the output from (samples, timesteps, output_features) to a shape appropriate for the FC layer
        out = out.contiguous().view(-1, self.units)
        out = self.fc(out)
        return out
net = TextGenerator(input_size,units,num_layers,num_classes,timesteps)
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
out = net(inputs)
print(out.shape) #torch.Size([9900, 42])
loss_fn(out, labels) #This line raises the error
Answer 0 (score: 2)
In PyTorch, when using CrossEntropyLoss, you need to give the target labels as integers in [0..n_classes-1], not as one-hot vectors. Because your labels are (9900, 42) one-hot rows, PyTorch thinks you are trying to predict multiple targets per sample, which is why it raises "multi-target not supported".
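A minimal sketch of the fix, reusing the names from your question (y, net, inputs, loss_fn, optimizer): flatten y into a 1-D tensor of class indices and drop the one-hot/scatter_ step entirely.

labels = Variable(y.view(-1))   # shape (9900,), integer class indices in [0..41]
out = net(inputs)               # shape (9900, 42), raw scores (logits)
optimizer.zero_grad()
loss = loss_fn(out, labels)     # one class index per row of 42 scores
loss.backward()
optimizer.step()

CrossEntropyLoss applies log-softmax to the scores internally, so the network should keep returning the raw output of the fc layer, and the y_one_hot tensor is no longer needed.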