Question

I am trying to train a model on the UCF11 dataset in PyTorch.

I am trying to apply random augmentations to the video samples.

self.transforms_list = [
    transforms.ColorJitter(brightness=0.225),
    transforms.ColorJitter(contrast=0.225),
    transforms.ColorJitter(saturation=0.1),
    transforms.ColorJitter(hue=0.1),
    transforms.Grayscale(num_output_channels=3),
    transforms.RandomHorizontalFlip(p=0.25),
    transforms.RandomRotation(30.0),
    transforms.GaussianBlur((5, 5), sigma=(0.1, 2.0)),
]

transformations = random.sample(self.transforms_list,random.randint(1,8))+[transforms.RandomCrop(224)]
transformations = transforms.Compose(transformations)

But when I apply these transformations to each frame of a video clip, I get this error:

RuntimeError: The expanded size of the tensor (320) must match the existing size (224) at non-singleton dimension 2.  
Target sizes: [3, 240, 320].  Tensor sizes: [3, 224, 224]

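I can't understand why this error occurs. I checked the source code, but I can't see what would cause it.

I wrote a custom transform class for RandomCrop, but I still get the same error.

Please help!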

Edit:

Since the input is a video, it has shape (T,C,H,W).

I was applying the transformations to each frame of the video like this:

vid[i] = transformations(vid[i])
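
A minimal sketch of why this in-place assignment can raise the error above, assuming vid is a single tensor of shape (T,3,240,320) as the "Target sizes" in the message suggest. The clip length of 4 and the helper crop_224 are hypothetical; crop_224 is only a stand-in for the crop step, not the torchvision RandomCrop implementation.

```python
import torch

# Hypothetical clip: 4 frames, each (C, H, W) = (3, 240, 320),
# matching the "Target sizes" reported in the error message.
vid = torch.zeros(4, 3, 240, 320)


def crop_224(frame):
    # Stand-in for the crop transform: returns a (3, 224, 224) center crop.
    top = (frame.shape[1] - 224) // 2
    left = (frame.shape[2] - 224) // 2
    return frame[:, top:top + 224, left:left + 224].clone()


try:
    # vid[0] is a fixed (3, 240, 320) slot inside the tensor, so a
    # (3, 224, 224) result cannot be written back into it.
    vid[0] = crop_224(vid[0])
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

Under that assumption, the crop itself succeeds; it is the write back into vid[i] that fails, because the slot keeps its original height and width.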

I created a new variable new_vid = [] and changed the code to

new_vid.append(transformations(vid[i]))

Finally, vid = torch.cat(new_vid,dim = 0)

and I got a video of shape (T,3,224,224).
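
A small sketch of how the collected frames combine, assuming each transformed frame in new_vid has shape (3,224,224); the clip length T = 4 is hypothetical. In that case torch.stack adds the new time dimension, while torch.cat along dim 0 merges time into the channel dimension. torch.cat yields (T,3,224,224) only if each frame already carries a leading dimension of size 1.

```python
import torch

T = 4  # hypothetical number of frames

# Each transformed frame is assumed to be (C, H, W) = (3, 224, 224).
new_vid = [torch.zeros(3, 224, 224) for _ in range(T)]

# stack inserts a new leading dimension: (T, 3, 224, 224)
stacked = torch.stack(new_vid, dim=0)

# cat joins along the existing dim 0: (3 * T, 224, 224)
concatenated = torch.cat(new_vid, dim=0)

# cat gives (T, 3, 224, 224) only when frames are (1, 3, 224, 224)
concatenated_4d = torch.cat([f.unsqueeze(0) for f in new_vid], dim=0)

print(stacked.shape, concatenated.shape, concatenated_4d.shape)
```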

RuntimeError: The expanded size of the tensor (320) must match the existing size (224) at non-singleton dimension 2

0 answers: