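I would like to know whether I am using ToPILImage from torchvision correctly. I want to use it to see what the images look like after the initial image transformations have been applied to the dataset.
When I use it as in the code below, the image that comes up has strange colors, like this one. The original images are regular RGB images.
This is my code: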
import os
import torch
from PIL import Image, ImageFont, ImageDraw
import torch.utils.data as data
import torchvision
from torchvision import transforms
import matplotlib.pyplot as plt
# Image transformations
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
transform_img = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    normalize
])
train_data = torchvision.datasets.ImageFolder(
    root='./train_cl/',
    transform=transform_img
)
test_data = torchvision.datasets.ImageFolder(
    root='./test_named_cl/',
    transform=transform_img
)
train_data_loader = data.DataLoader(train_data,
                                    batch_size=4,
                                    shuffle=True,
                                    num_workers=4)  # num_workers=args.nThreads)
test_data_loader = data.DataLoader(test_data,
                                   batch_size=32,
                                   shuffle=False,
                                   num_workers=4)
# Open Image from dataset:
to_pil_image = transforms.ToPILImage()
my_img, _ = train_data[248]
results = to_pil_image(my_img)
results.show()
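Edit:
I had to use .data on the torch Variable to get the tensor. I also needed to rescale the numpy array before transposing it. I found a working solution here, but it doesn't always work. How can I do this better?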
import numpy as np
from torch.autograd import Variable

# Take the first batch from the loader
# (named `batch` so it does not shadow the `data` module imported above)
for i, batch in enumerate(train_data_loader, 0):
    img, labels = batch
    img = Variable(img)
    break
image = img.data.cpu().numpy()[0]
# This worked for rescaling:
image = (1/(2*2.25)) * image + 0.5
# Both of these didn't work:
# image /= (image.max()/255.0)
# image *= (255.0/image.max())
image = np.transpose(image, (1,2,0))
plt.imshow(image)
plt.show()
Answer 0 (score: 3)
I would use something like this:
# Open Image from dataset:
my_img, _ = train_data[248]
results = transforms.ToPILImage()(my_img)
results.show()
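Answer 1 (score: 2)
You can use a PIL image, but then you are not actually loading the data the way you normally would.
Try something like this instead: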
import numpy as np
import matplotlib.pyplot as plt
for img, labels in train_data_loader:
    # load a batch from train data
    break
# this converts it from GPU to CPU and selects first image
img = img.cpu().numpy()[0]
# convert image back to Height,Width,Channels
img = np.transpose(img, (1,2,0))
# show the image
plt.imshow(img)
plt.show()
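One thing worth adding: the strange colors most likely come from the Normalize step in the question's transform, since the tensor being displayed holds (x - mean) / std rather than values in [0, 1]. Below is a minimal sketch of undoing that before plotting. The helper name `unnormalize_chw` and the random stand-in image are hypothetical and only for illustration; the mean and std values are the ones from the question.

```python
import numpy as np

# Same statistics as the Normalize transform in the question.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def unnormalize_chw(image_chw):
    """Invert Normalize on a (C, H, W) array; return an (H, W, C) array in [0, 1]."""
    # Channels last, which is the layout matplotlib expects.
    image_hwc = np.transpose(image_chw, (1, 2, 0))
    # Undo (x - mean) / std; MEAN and STD broadcast over the last (channel) axis.
    image_hwc = image_hwc * STD + MEAN
    # Clip to guard against small floating-point drift outside [0, 1].
    return np.clip(image_hwc, 0.0, 1.0)


# Hypothetical stand-in for one image taken from the data loader.
rng = np.random.default_rng(0)
original = rng.random((3, 8, 8), dtype=np.float32)
normalized = (original - MEAN[:, None, None]) / STD[:, None, None]
restored = unnormalize_chw(normalized)
```

With a real batch this would be used as `plt.imshow(unnormalize_chw(img.cpu().numpy()[0]))`. Unlike the fixed `(1/(2*2.25)) * image + 0.5` rescaling in the question, this applies the exact per-channel inverse, so the colors should match the source image instead of only landing in a displayable range.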