How to train a pytorch model with numpy data and a batch size?

Asked: 2017-09-12 07:49:51

Tags: python pytorch

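I am learning the basics of pytorch and thought of creating a simple 4-layer neural network with dropout to train on the IRIS dataset for classification. After referring to many tutorials, I wrote this code.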

import pandas as pd
from sklearn.datasets import load_iris
import torch
from torch.autograd import Variable

epochs=300
batch_size=20
lr=0.01

#loading data as numpy array
data = load_iris()
X=data.data
y=pd.get_dummies(data.target).values

#convert to tensor
X= Variable(torch.from_numpy(X), requires_grad=False)
y=Variable(torch.from_numpy(y), requires_grad=False)
print(X.size(),y.size())

#neural net model
model = torch.nn.Sequential(
    torch.nn.Linear(4, 10),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(10, 5),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(5, 3),
    torch.nn.Softmax()
)

print(model)

# Loss and Optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=lr)  
loss_func = torch.nn.CrossEntropyLoss()  

for i in range(epochs):
    # Forward pass
    y_pred = model(X)

    # Compute and print loss.
    loss = loss_func(y_pred, y)
    print(i, loss.data[0])

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable weights
    # of the model)
    optimizer.zero_grad()

    # Backward pass
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its parameters
    optimizer.step()

I currently have two problems.

  1. I want to set the batch size to 20. How do I do that?
  2. The step y_pred = model(X) raises this error:

     TypeError: addmm_ received an invalid combination of arguments - got (int, int, torch.DoubleTensor, torch.FloatTensor), but expected one of:
     * (torch.DoubleTensor mat1, torch.DoubleTensor mat2)
     * (torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
     * (float beta, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
     * (float alpha, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
     * (float beta, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
     * (float alpha, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
     * (float beta, float alpha, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
          didn't match because some of the arguments have invalid types: (int, int, torch.DoubleTensor, !torch.FloatTensor!)
     * (float beta, float alpha, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
          didn't match because some of the arguments have invalid types: (int, int, !torch.DoubleTensor!, !torch.FloatTensor!)
    

2 Answers:

Answer 0 (score: 4)

  

I want to set the batch size to 20. How do I do that?

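For data processing and loading, PyTorch provides two classes. The first is Dataset, which represents a dataset. Specifically, Dataset provides an interface for fetching a single sample from the whole dataset by its index.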

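But Dataset alone is not enough; for large datasets we need batching. So PyTorch provides a second class, DataLoader, which generates batches from a Dataset according to a batch size and other parameters.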
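To make the division of labor concrete, here is a minimal sketch of a custom Dataset wrapped in a DataLoader. The class name ArrayDataset and the random stand-in data (shaped like IRIS: 150 samples, 4 features, 3 classes) are assumptions for illustration and are not part of the original question.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical dataset wrapping a NumPy feature array and label array."""

    def __init__(self, features, labels):
        # Convert once up front: float32 features, int64 class labels.
        self.features = torch.from_numpy(features).float()
        self.labels = torch.from_numpy(labels).long()

    def __len__(self):
        return len(self.features)

    def __getitem__(self, index):
        # Dataset's job: return ONE sample for a given index.
        return self.features[index], self.labels[index]


# Stand-in data with the same shape as IRIS (assumption, not the real dataset).
rng = np.random.default_rng(0)
features = rng.normal(size=(150, 4))
labels = rng.integers(0, 3, size=150)

dataset = ArrayDataset(features, labels)
sample_x, sample_y = dataset[0]

# DataLoader's job: group samples into batches of the requested size.
loader = DataLoader(dataset, batch_size=20, shuffle=True)
batch_shapes = [tuple(xb.shape) for xb, _ in loader]
```

With 150 samples and a batch size of 20, the loader yields seven full batches and one final batch of 10; passing drop_last=True to DataLoader would discard that smaller batch.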

For your specific case, I think you should try TensorDataset. Then use a DataLoader to set the batch size to 20. Just look through the PyTorch official examples to see how this is done.
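As a rough sketch of that suggestion, the loop below wraps two tensors in a TensorDataset and iterates over mini-batches of 20. It makes several assumptions beyond the answer: random stand-in data replaces load_iris() so the snippet has no sklearn dependency, the labels are integer class indices rather than one-hot vectors, and the final Softmax layer is omitted because torch.nn.CrossEntropyLoss applies log-softmax internally. The epoch count is shortened for illustration.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Stand-in for the IRIS arrays (assumption): 150 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = rng.integers(0, 3, size=150)

# float32 inputs match the model's parameters; int64 labels suit CrossEntropyLoss.
X_t = torch.from_numpy(X).float()
y_t = torch.from_numpy(y).long()

dataset = TensorDataset(X_t, y_t)
loader = DataLoader(dataset, batch_size=20, shuffle=True)

model = torch.nn.Sequential(
    torch.nn.Linear(4, 10),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(10, 5),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(5, 3),  # raw logits; the loss applies log-softmax itself
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_func = torch.nn.CrossEntropyLoss()

epochs = 5  # shortened for illustration
epoch_losses = []
model.train()
for epoch in range(epochs):
    running = 0.0
    for xb, yb in loader:  # each iteration sees at most 20 samples
        optimizer.zero_grad()
        loss = loss_func(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * xb.size(0)
    epoch_losses.append(running / len(dataset))
```

With the real dataset, data.data and data.target from load_iris() could take the place of X and y here.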

  

The step y_pred = model(X) shows this error

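The error message is quite informative. Your input X to the model is of type DoubleTensor, but your model's parameters are of type FloatTensor. PyTorch will not perform this operation (a matrix multiply inside the Linear layer) between tensors of different types. What you should do is replace

X= Variable(torch.from_numpy(X), requires_grad=False)

with

X= Variable(torch.from_numpy(X).float(), requires_grad=False)

Now X is of type FloatTensor, and the error message should go away.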
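The mismatch can be checked directly by inspecting dtypes. The sketch below uses a zero-filled stand-in array (an assumption, in place of the IRIS features) and the current tensor API without Variable; the exception type caught is what recent PyTorch versions raise, whereas the 2017 version in the question raised TypeError.

```python
import numpy as np
import torch

X_np = np.zeros((150, 4))          # NumPy floats default to float64
X_double = torch.from_numpy(X_np)  # from_numpy keeps float64 (DoubleTensor)
X_float = X_double.float()         # explicit cast to float32 (FloatTensor)

layer = torch.nn.Linear(4, 10)
weight_dtype = layer.weight.dtype  # parameters default to float32

# Feeding float64 input to float32 weights fails.
try:
    layer(X_double)
    mismatch_raised = False
except (RuntimeError, TypeError):
    mismatch_raised = True

# After the cast, the forward pass works.
out = layer(X_float)
```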

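Also, as a gentle reminder, there is plenty of material on the internet that fully covers your question. You should make an effort to solve it yourself first.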

Answer 1 (score: 0)

Possibly the same problem: Pytorch: Convert FloatTensor into DoubleTensor

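In short: when converting from numpy, the values are stored in a DoubleTensor, while the model (and hence the optimizer working on its parameters) expects a FloatTensor. You have to change one of them.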
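"Changing one of them" can go in either direction. The sketch below shows both options on a small stand-in model and zero-filled input, both of which are assumptions for illustration: casting the input down to float32, or casting the model's parameters up to float64 with model.double().

```python
import numpy as np
import torch

X = torch.from_numpy(np.zeros((150, 4)))  # float64, as produced by from_numpy

model = torch.nn.Sequential(
    torch.nn.Linear(4, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 3),
)

# Option A: cast the input to float32 to match the default parameter dtype.
out_a = model(X.float())

# Option B: cast the model's parameters to float64 to match the input.
# Note that .double() converts the module in place and returns the same module.
model_double = model.double()
out_b = model_double(X)
```

Option A is the more common choice, since float32 uses half the memory of float64 and is the default dtype for PyTorch parameters.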