I am new to PyTorch, and I am working on a simple project that generates text with PyTorch. I am taking the concepts from this Keras tutorial and converting them to PyTorch: https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/ I have 10 timesteps and 990 samples. Each of the 990 samples holds 10 values, which are the (scaled) integer indices of the characters in the sequence. The output for each sample is the same sequence shifted by one letter: for example, if my sample is "Hello Worl", the corresponding output is "ello World". Since I want to feed in one letter at a time, my input size (number of features) is 1, so my final input shape is (990, 10, 1). I then convert the output tensor to one-hot vectors, so its final shape is (9900, 42), where 42 is the number of elements in a one-hot vector. When I run the network I get an output of shape (9900, 42), so that is the output for all of my timesteps, each with a corresponding 42-element vector. But when I compute the loss I get this error:

multi-target not supported

Can anyone help me understand what I am doing wrong? Thanks. My code is below.
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable

#Assumes 'file' (the raw text) and 'unique' (its distinct characters) were defined earlier
#The original file contains 163780 characters
#The trimmed file used here contains 1000 characters
#There are 42 unique characters
char2int = {char:value for (value,char) in enumerate(unique)}
int2char = {value:char for (value,char) in enumerate(unique)}
learning_rate = 0.01
num_epochs = 5
input_size = 1 #The number of input neurons (features) to our RNN
units = 100
num_layers = 2
num_classes = len(unique) #The number of output neurons
timesteps = 10
datax = []
datay = []
for index in range(0, len(file) - timesteps, 1):
    prev_letters = file[index:index + timesteps]
    output = file[index + 1: index + timesteps + 1]
    #Convert the 10 previous characters to their integers and put them in a list. Append that list to the dataset
    datax.append([char2int[c] for c in prev_letters])
    datay.append([char2int[c] for c in output])
print('There are {} Sequences in the dataset'.format(len(datax)))
#There are 990 Sequences in the dataset
x = np.array(datax)
x = x / float(len(unique))
x = torch.FloatTensor(x)
x = x.view(x.size(0), timesteps, input_size)
print(x.shape) #torch.Size([990, 10, 1])
y = torch.LongTensor(datay)
print(y.shape) #torch.Size([990, 10])
y_one_hot = torch.zeros(y.shape[0] * y.shape[1], num_classes)
index = y.long()
index = index.view(-1,1) #The expected shape for the scatter function
y_one_hot.scatter_(1,index,1) #scatter_(dim, index, value): with dim=1, write the value 1 at the column given by index in each row
y_one_hot = y_one_hot.view(-1, num_classes) # Make the tensor of shape(rows, cols)
y_one_hot = y_one_hot.long()
print(y_one_hot.shape)
#torch.Size([9900, 42])
inputs = Variable(x)
labels = Variable(y_one_hot)
class TextGenerator(nn.Module):
    def __init__(self, input_size, units, num_layers, num_classes, timesteps):
        super(TextGenerator, self).__init__()
        self.units = units
        self.num_layers = num_layers
        self.timesteps = timesteps
        self.input_size = input_size
        # When batch_first=True, inputs are of shape (batch_size/samples, sequence_length, input_dimension)
        self.lstm = nn.LSTM(input_size = input_size, hidden_size = units, num_layers = num_layers, batch_first = True)
        #The output layer
        self.fc = nn.Linear(units, num_classes)

    def forward(self, x):
        #Initialize the hidden state
        h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        #Initialize the cell state
        c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        out, _ = self.lstm(x, (h0, c0))
        #Reshape the output from (samples, timesteps, output_features) to a shape appropriate for the FC layer
        out = out.contiguous().view(-1, self.units)
        out = self.fc(out)
        return out
net = TextGenerator(input_size,units,num_layers,num_classes,timesteps)
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
out = net(inputs)
print(out.shape) #torch.Size([9900, 42])
loss_fn(out, labels) #This line raises the error
Answer 0 (score: 2)
In PyTorch, when using CrossEntropyLoss, you need to give the target labels as integers in [0..n_classes-1], not as one-hot vectors. Because your labels are (9900, 42) one-hot rows, PyTorch thinks you are trying to predict multiple targets per sample, which is why it raises "multi-target not supported".
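A minimal sketch of the fix, reusing the names from your question (y, net, inputs, loss_fn, optimizer): flatten y into a 1-D tensor of class indices and drop the one-hot/scatter_ step entirely.

labels = Variable(y.view(-1))   # shape (9900,), integer class indices in [0..41]
out = net(inputs)               # shape (9900, 42), raw scores (logits)
optimizer.zero_grad()
loss = loss_fn(out, labels)     # one class index per row of 42 scores
loss.backward()
optimizer.step()

CrossEntropyLoss applies log-softmax to the scores internally, so the network should keep returning the raw output of the fc layer, and the y_one_hot tensor is no longer needed.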