Torchvision 0.2.1 transforms.Normalize does not work as expected

Time: 2018-11-16 06:37:50

Tags: python deep-learning pytorch normalize torchvision

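I am trying out some new code using PyTorch. In this code, I load the dataset (CIFAR10) through torchvision's datasets. I defined two transforms, ToTensor() and Normalize(). After normalization I expected the data in the dataset to lie between 0 and 1, but the maximum value is still 255. I also added a print statement in transforms.py (Lib\site-packages\torchvision\transforms\transforms.py). That print does not appear when the code runs either. I do not know what is going on. Every page I visited on the internet shows almost the same usage as mine, for example: https://github.com/adventuresinML/adventures-in-ml-code/blob/master/pytorch_nn.py https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py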

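My code is below. It reads the dataset with and without Normalize and then prints some statistics. The printed minimum and maximum indicate whether the data has been normalized.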

import torchvision as tv
import numpy as np

dataDir = 'D:\\general\\ML_DL\\datasets\\CIFAR'

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor()])
trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)

The output looks like this:

[ 0.49139968  0.48215841  0.44653091]
0
255
(50000, 32, 32, 3)
[ 0.49139968  0.48215841  0.44653091]
0
255
(50000, 32, 32, 3)

Please help me understand this better. Most of the other transforms I tried, such as Grayscale and CenterCrop, end up with similar results.

2 answers:

Answer 0 (score: 2)

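In your code, you have laid out a plan for how the data should be processed. You have created a data pipeline through which the data will flow and in which several transforms will be applied.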

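However, you forgot to use torch.utils.data.DataLoader. The transforms are applied lazily, only when samples are fetched from the dataset, which is what the DataLoader does for you. The raw train_data array that you printed is never transformed. You can read more about it here.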
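The lazy behavior described above can be illustrated with a minimal pure-Python sketch. It assumes a simplified dataset class; LazyDataset and to_unit_range are hypothetical names made up for this illustration and are not torchvision APIs. The point is only that the stored data stays raw while a fetched sample comes back transformed.

```python
# Hypothetical stand-in for a torchvision-style dataset; not the library's code.
class LazyDataset:
    def __init__(self, data, transform=None):
        self.data = data            # raw samples, stored untouched
        self.transform = transform  # only remembered here, not applied yet

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        sample = self.data[index]
        if self.transform is not None:
            sample = self.transform(sample)  # applied when a sample is fetched
        return sample


def to_unit_range(pixels):
    # Scale byte values (0..255) to floats in [0, 1].
    return [p / 255 for p in pixels]


raw = [[0, 128, 255], [59, 43, 50]]
dataset = LazyDataset(raw, transform=to_unit_range)

print(max(max(row) for row in dataset.data))  # raw storage still holds 255
print(max(dataset[0]))                        # fetched sample is scaled: 1.0
```

Printing statistics of the stored array, as the question does with train_data, therefore shows the same numbers no matter which transforms were passed in.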

Now, when we add the DataLoader to your code, it looks like this:

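import torch
import torchvision as tv

# dataDir is the dataset folder defined in the question
trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), 
                  tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])

trainSet = tv.datasets.CIFAR10(root=dataDir, train=True,
                                    download=False, transform=trainTransform)

# On Windows, worker processes need an `if __name__ == '__main__':` guard; num_workers=0 avoids that
dataloader = torch.utils.data.DataLoader(trainSet, batch_size=32, shuffle=False, num_workers=4)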

and print the images as follows:

images, labels = next(iter(dataloader))
print(images)
print(images.max())
print(images.min())

we get tensors with the transforms applied.

A short excerpt of the output:

[[ 1.8649,  1.8198,  1.8348,  ...,  0.3924,  0.3774,  0.2572],
      [ 1.9701,  1.9550,  1.9851,  ...,  0.7230,  0.6929,  0.6629],
      [ 2.0001,  1.9550,  2.0001,  ...,  0.7831,  0.7530,  0.7079],
      ...,
      [-0.8096, -1.0049, -1.0350,  ..., -1.3355, -1.3655, -1.4256],
      [-0.7796, -0.8697, -0.9749,  ..., -1.2754, -1.4557, -1.5609],
      [-0.7645, -0.7946, -0.9298,  ..., -1.4106, -1.5308, -1.5909]]]])
tensor(2.1309)
tensor(-1.9895) 

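Second, transforms.Normalize(mean, std) computes input[channel] = (input[channel] - mean[channel]) / std[channel], so with the mean and standard deviation we supplied, the transformed values cannot stay within the range [0, 1]. If you want values between -1 and 1, you can use the following: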

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), 
                  tv.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
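As a rough check of the formula, the sketch below computes the widest range Normalize can produce, assuming its input already lies in [0, 1] (which is what ToTensor yields for byte images). normalized_range is a hypothetical helper written for this illustration, not part of torchvision.

```python
def normalized_range(mean, std):
    # Normalize computes (x - mean) / std per channel.
    # Assumption: x lies in [0, 1] and every std is positive.
    lows = [(0.0 - m) / s for m, s in zip(mean, std)]
    highs = [(1.0 - m) / s for m, s in zip(mean, std)]
    return min(lows), max(highs)


cifar_low, cifar_high = normalized_range((0.4914, 0.4822, 0.4466),
                                         (0.247, 0.243, 0.261))
half_low, half_high = normalized_range((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

print(round(cifar_low, 4), round(cifar_high, 4))  # -1.9895 2.1309
print(half_low, half_high)                        # -1.0 1.0
```

The bounds for the CIFAR statistics agree with the minimum and maximum printed for the batch above, and a mean and standard deviation of 0.5 map [0, 1] onto [-1, 1].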

Hope this helps! :)

Answer 1 (score: 0)

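It looks like when the data is read without Normalize and only converted to a tensor, ToTensor itself scales the values into the range 0 to 1. When we then apply Normalize, it applies the formula you mentioned to this data in the range 0 to 1. Below is the modified working code, together with some print statements that show when the '__call__' function in the Normalize class is invoked and how the values change. The first value after ToTensor is 0.2314. Normalizing with 0.5 gives (0.2314 - 0.5) / 0.5 = -0.5372, which is printed as -0.5373 because 0.2314 is itself a rounded value. The first and second printouts of the tensor values show this.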

Code

import torchvision as tv
import numpy as np
import torch.utils.data as data

dataDir         = 'D:\\general\\ML_DL\\datasets\\CIFAR'

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor()])

trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print ('Approach1 Step1 done')
dataloader      = data.DataLoader(trainSet, batch_size=1, shuffle=False, num_workers=0)
print ('Approach1 Step2 done')
images, labels  = next(iter(dataloader))
print ('Approach1 Step3 done')
print (images[0,0])
print (images.max())
print (images.min())
print (images.mean())

#trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print ('Approach2 Step1 done')
dataloader      = data.DataLoader(trainSet, batch_size=1, shuffle=False, num_workers=0)
print ('Approach2 Step2 done')
images, labels  = next(iter(dataloader))
print ('Approach2 Step3 done')
print (images[0,0])
print (images.max())
print (images.min())
print (images.mean())

The output of the above code is:

Approach1 Step1 done
Approach1 Step2 done
Approach1 Step3 done
tensor([[0.2314, 0.1686, 0.1961,  ..., 0.6196, 0.5961, 0.5804],
        [0.0627, 0.0000, 0.0706,  ..., 0.4824, 0.4667, 0.4784],
        [0.0980, 0.0627, 0.1922,  ..., 0.4627, 0.4706, 0.4275],
        ...,
        [0.8157, 0.7882, 0.7765,  ..., 0.6275, 0.2196, 0.2078],
        [0.7059, 0.6784, 0.7294,  ..., 0.7216, 0.3804, 0.3255],
        [0.6941, 0.6588, 0.7020,  ..., 0.8471, 0.5922, 0.4824]])
tensor(1.)
tensor(0.)
tensor(0.4057)
Approach2 Step1 done
Approach2 Step2 done
__call__ inside Normalization is called
Approach2 Step3 done
tensor([[-0.5373, -0.6627, -0.6078,  ...,  0.2392,  0.1922,  0.1608],
        [-0.8745, -1.0000, -0.8588,  ..., -0.0353, -0.0667, -0.0431],
        [-0.8039, -0.8745, -0.6157,  ..., -0.0745, -0.0588, -0.1451],
        ...,
        [ 0.6314,  0.5765,  0.5529,  ...,  0.2549, -0.5608, -0.5843],
        [ 0.4118,  0.3569,  0.4588,  ...,  0.4431, -0.2392, -0.3490],
        [ 0.3882,  0.3176,  0.4039,  ...,  0.6941,  0.1843, -0.0353]])
tensor(1.)
tensor(-1.)
tensor(-0.1886)
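The first row of both printouts can be reproduced by hand. The sketch below assumes the first three red-channel pixels are the byte values 59, 43 and 50, which is inferred from the printed 0.2314, 0.1686 and 0.1961 under division by 255. to_tensor_value and normalize are hypothetical helpers that only mimic the arithmetic, not the torchvision classes.

```python
def to_tensor_value(byte_value):
    # Scaling assumed for byte images: divide by 255.
    return byte_value / 255


def normalize(value, mean=0.5, std=0.5):
    # Same formula as in the answers: (x - mean) / std.
    return (value - mean) / std


rows = []
for byte_value in (59, 43, 50):  # assumed raw pixel values
    scaled = to_tensor_value(byte_value)
    rows.append((byte_value, round(scaled, 4), round(normalize(scaled), 4)))

for row in rows:
    print(row)
# (59, 0.2314, -0.5373)
# (43, 0.1686, -0.6627)
# (50, 0.1961, -0.6078)
```

The second and third columns match the first three entries of the Approach1 and Approach2 tensors respectively.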