Index out of range error when training on a dataset

Time: 2020-03-10 23:26:42

Tags: python machine-learning computer-vision pytorch conv-neural-network

I am trying to train MaskRCNN to detect and segment apples using the dataset from this paper.

GitHub link to the code being used

I simply followed the instructions provided in the README.

This is the output on the console:

(venv) PS > python train_rcnn.py --data_path 'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\datasets\apples-minneapple' --model mrcnn --epochs 50 --output-dir 'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples'

mrcnn
Namespace(batch_size=2, data_path='D:\\Research Report\\tensorflow\\Mask_RCNN-TRIALS\\Mask_RCNN-master\\datasets\\apples-minneapple', dataset='AppleDataset', device='cuda', epochs=50, lr=0.02, lr_gamma=0.1, lr_step_size=8, lr_steps=[8, 11], model='mrcnn', momentum=0.9, output_dir='D:\\Research Report\\tensorflow\\Mask_RCNN-TRIALS\\Mask_RCNN-master\\samples\\apples', print_freq=20, resume='', weight_decay=0.0001, workers=4)
Loading data
Creating data loaders
Creating model
Start training
Epoch: [0]  [  0/335]  eta: 1:00:28  lr: 0.000080  loss: 2.4100 (2.4100)  loss_classifier: 0.8481 (0.8481)  loss_box_reg: 0.4164 (0.4164)  loss_objectness: 0.9299 (0.9299)  loss_rpn_box_reg: 0.2157 (0.2157)  time: 10.8327  data: 7.9925  max mem: 2733
Epoch: [0]  [ 20/335]  eta: 0:06:18  lr: 0.001276  loss: 1.4465 (1.4728)  loss_classifier: 0.5526 (0.5496)  loss_box_reg: 0.3586 (0.3572)  loss_objectness: 0.2666 (0.3418)  loss_rpn_box_reg: 0.2233 (0.2242)  time: 0.7204  data: 0.0132  max mem: 3247
Epoch: [0]  [ 40/335]  eta: 0:04:48  lr: 0.002473  loss: 0.9622 (1.2287)  loss_classifier: 0.2927 (0.4276)  loss_box_reg: 0.3188 (0.3314)  loss_objectness: 0.1422 (0.2491)  loss_rpn_box_reg: 0.2168 (0.2207)  time: 0.7408  data: 0.0210  max mem: 3282
Epoch: [0]  [ 60/335]  eta: 0:04:05  lr: 0.003669  loss: 0.7924 (1.0887)  loss_classifier: 0.2435 (0.3654)  loss_box_reg: 0.2361 (0.2983)  loss_objectness: 0.1289 (0.2105)  loss_rpn_box_reg: 0.1898 (0.2144)  time: 0.7244  data: 0.0127  max mem: 3432
Epoch: [0]  [ 80/335]  eta: 0:03:37  lr: 0.004865  loss: 0.7438 (1.0117)  loss_classifier: 0.2565 (0.3376)  loss_box_reg: 0.2193 (0.2799)  loss_objectness: 0.0776 (0.1835)  loss_rpn_box_reg: 0.1983 (0.2108)  time: 0.7217  data: 0.0127  max mem: 3432
Epoch: [0]  [100/335]  eta: 0:03:14  lr: 0.006062  loss: 0.7373 (0.9490)  loss_classifier: 0.2274 (0.3156)  loss_box_reg: 0.2193 (0.2654)  loss_objectness: 0.0757 (0.1643)  loss_rpn_box_reg: 0.1867 (0.2037)  time: 0.7291  data: 0.0132  max mem: 3432
Epoch: [0]  [120/335]  eta: 0:02:54  lr: 0.007258  loss: 0.8275 (0.9243)  loss_classifier: 0.2689 (0.3094)  loss_box_reg: 0.2315 (0.2602)  loss_objectness: 0.0867 (0.1539)  loss_rpn_box_reg: 0.1883 (0.2008)  time: 0.7270  data: 0.0134  max mem: 3432
Epoch: [0]  [140/335]  eta: 0:02:35  lr: 0.008455  loss: 0.7886 (0.9057)  loss_classifier: 0.2573 (0.3029)  loss_box_reg: 0.2246 (0.2539)  loss_objectness: 0.0724 (0.1455)  loss_rpn_box_reg: 0.2459 (0.2035)  time: 0.7170  data: 0.0124  max mem: 3432
Epoch: [0]  [160/335]  eta: 0:02:17  lr: 0.009651  loss: 0.7588 (0.8878)  loss_classifier: 0.2341 (0.2948)  loss_box_reg: 0.2226 (0.2486)  loss_objectness: 0.1032 (0.1427)  loss_rpn_box_reg: 0.2020 (0.2016)  time: 0.7139  data: 0.0118  max mem: 3432
Epoch: [0]  [180/335]  eta: 0:02:01  lr: 0.010847  loss: 0.7340 (0.8744)  loss_classifier: 0.2331 (0.2898)  loss_box_reg: 0.2120 (0.2441)  loss_objectness: 0.1086 (0.1392)  loss_rpn_box_reg: 0.1993 (0.2012)  time: 0.7800  data: 0.0584  max mem: 3432
Epoch: [0]  [200/335]  eta: 0:01:45  lr: 0.012044  loss: 0.8106 (0.8694)  loss_classifier: 0.2616 (0.2873)  loss_box_reg: 0.2208 (0.2411)  loss_objectness: 0.1117 (0.1397)  loss_rpn_box_reg: 0.1927 (0.2014)  time: 0.7344  data: 0.0143  max mem: 3432
Epoch: [0]  [220/335]  eta: 0:01:29  lr: 0.013240  loss: 0.8191 (0.8610)  loss_classifier: 0.2581 (0.2848)  loss_box_reg: 0.2140 (0.2382)  loss_objectness: 0.0860 (0.1362)  loss_rpn_box_reg: 0.2177 (0.2018)  time: 0.7213  data: 0.0126  max mem: 3432
Epoch: [0]  [240/335]  eta: 0:01:13  lr: 0.014437  loss: 0.7890 (0.8590)  loss_classifier: 0.2671 (0.2842)  loss_box_reg: 0.2094 (0.2357)  loss_objectness: 0.1175 (0.1360)  loss_rpn_box_reg: 0.2256 (0.2030)  time: 0.7576  data: 0.0564  max mem: 3432
Epoch: [0]  [260/335]  eta: 0:00:57  lr: 0.015633  loss: 0.8631 (0.8587)  loss_classifier: 0.2900 (0.2849)  loss_box_reg: 0.2089 (0.2337)  loss_objectness: 0.0925 (0.1350)  loss_rpn_box_reg: 0.2271 (0.2050)  time: 0.7371  data: 0.0220  max mem: 3432
Epoch: [0]  [280/335]  eta: 0:00:42  lr: 0.016830  loss: 0.8464 (0.8580)  loss_classifier: 0.2679 (0.2840)  loss_box_reg: 0.2156 (0.2321)  loss_objectness: 0.0940 (0.1346)  loss_rpn_box_reg: 0.2345 (0.2073)  time: 0.7379  data: 0.0143  max mem: 3432
Epoch: [0]  [300/335]  eta: 0:00:27  lr: 0.018026  loss: 0.7991 (0.8519)  loss_classifier: 0.2485 (0.2819)  loss_box_reg: 0.2125 (0.2305)  loss_objectness: 0.0819 (0.1315)  loss_rpn_box_reg: 0.2217 (0.2080)  time: 0.8549  data: 0.1419  max mem: 3450
Epoch: [0]  [320/335]  eta: 0:00:11  lr: 0.019222  loss: 0.6906 (0.8432)  loss_classifier: 0.2362 (0.2791)  loss_box_reg: 0.2036 (0.2285)  loss_objectness: 0.0662 (0.1285)  loss_rpn_box_reg: 0.1801 (0.2070)  time: 0.7257  data: 0.0238  max mem: 3450
Epoch: [0]  [334/335]  eta: 0:00:00  lr: 0.020000  loss: 0.7822 (0.8441)  loss_classifier: 0.2501 (0.2785)  loss_box_reg: 0.2224 (0.2285)  loss_objectness: 0.1135 (0.1296)  loss_rpn_box_reg: 0.1948 (0.2075)  time: 0.7249  data: 0.0139  max mem: 3450
Epoch: [0] Total time: 0:04:18 (0.7707 s / it)
Traceback (most recent call last):
  File "train_rcnn.py", line 143, in <module>
    main(args)
  File "train_rcnn.py", line 109, in main
    evaluate(model, data_loader_test, device=device)
  File "C:\Users\___\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\autograd\grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\engine.py", line 78, in evaluate
    coco = get_coco_api_from_dataset(data_loader.dataset)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\coco_utils.py", line 205, in get_coco_api_from_dataset
    return convert_to_coco_api(dataset)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\coco_utils.py", line 154, in convert_to_coco_api
    img, targets = ds[img_idx]
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\data\apple_dataset.py", line 22, in __getitem__
    mask_path = os.path.join(self.root_dir, "masks", self.masks[idx])
IndexError: list index out of range

This is the file that is run to train the network:

import datetime
import os
import time

import torch
import torch.utils.data
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

from data.apple_dataset import AppleDataset
from utility.engine import train_one_epoch, evaluate

import utility.utils as utils
import utility.transforms as T

######################################################
# Train either a Faster-RCNN or Mask-RCNN predictor
# using the MinneApple dataset
######################################################


def get_transform(train):
    transforms = []
    transforms.append(T.ToTensor())
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)


def get_maskrcnn_model_instance(num_classes):
    # load an instance segmentation model pre-trained on COCO
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

    # get number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # now get the number of input features for the mask classifier
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    hidden_layer = 256
    # and replace the mask predictor with a new one
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, hidden_layer, num_classes)
    return model


def get_frcnn_model_instance(num_classes):
    # load an object detection model pre-trained on COCO
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    # get number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model


def main(args):
    print(args)
    device = args.device

    # Data loading code
    print("Loading data")
    num_classes = 2
    dataset = AppleDataset(os.path.join(args.data_path, 'train'), get_transform(train=True))
    dataset_test = AppleDataset(os.path.join(args.data_path, 'test'), get_transform(train=False))

    print("Creating data loaders")
    data_loader = torch.utils.data.DataLoader(dataset, batch_size=args.batch_size, shuffle=True,
                                              num_workers=args.workers, collate_fn=utils.collate_fn)

    data_loader_test = torch.utils.data.DataLoader(dataset_test, batch_size=1,
                                                   shuffle=False, num_workers=args.workers,
                                                   collate_fn=utils.collate_fn)

    print("Creating model")
    # Create the correct model type
    if args.model == 'maskrcnn':
        model = get_maskrcnn_model_instance(num_classes)
    else:
        model = get_frcnn_model_instance(num_classes)

    # Move model to the right device
    model.to(device)

    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay)

    #  lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=args.lr_step_size, gamma=args.lr_gamma)
    lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=args.lr_steps, gamma=args.lr_gamma)

    if args.resume:
        checkpoint = torch.load(args.resume, map_location='cpu')
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        lr_scheduler.load_state_dict(checkpoint['lr_scheduler'])

    print("Start training")
    start_time = time.time()
    for epoch in range(args.epochs):
        train_one_epoch(model, optimizer, data_loader, device, epoch, args.print_freq)
        lr_scheduler.step()

        if args.output_dir:
            torch.save(model.state_dict(), os.path.join(args.output_dir, 'model_{}.pth'.format(epoch)))

        # evaluate after every epoch
        evaluate(model, data_loader_test, device=device)

    total_time = time.time() - start_time
    total_time_str = str(datetime.timedelta(seconds=int(total_time)))
    print('Training time {}'.format(total_time_str))


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description='PyTorch Detection Training')
    parser.add_argument('--data_path', default='~~~~', help='dataset')
    parser.add_argument('--dataset', default='AppleDataset', help='dataset')
    parser.add_argument('--model', default='maskrcnn', help='model')
    parser.add_argument('--device', default='cuda', help='device')
    parser.add_argument('-b', '--batch-size', default=2, type=int)
    parser.add_argument('--epochs', default=13, type=int, metavar='N', help='number of total epochs to run')
    parser.add_argument('-j', '--workers', default=4, type=int, metavar='N', help='number of data loading workers (default: 16)')
    parser.add_argument('--lr', default=0.02, type=float, help='initial learning rate')
    parser.add_argument('--momentum', default=0.9, type=float, metavar='M', help='momentum')
    parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float, metavar='W', help='weight decay (default: 1e-4)', dest='weight_decay')
    parser.add_argument('--lr-step-size', default=8, type=int, help='decrease lr every step-size epochs')
    parser.add_argument('--lr-steps', default=[8, 11], nargs='+', type=int, help='decrease lr every step-size epochs')
    parser.add_argument('--lr-gamma', default=0.1, type=float, help='decrease lr by a factor of lr-gamma')
    parser.add_argument('--print-freq', default=20, type=int, help='print frequency')
    parser.add_argument('--output-dir', default='.', help='path where to save')
    parser.add_argument('--resume', default='', help='resume from checkpoint')

    args = parser.parse_args()
    print(args.model)
    assert(args.model in ['mrcnn', 'frcnn'])

    if args.output_dir:
        utils.mkdir(args.output_dir)

    main(args)

apple_dataset.py is as follows:

import os
import numpy as np
import torch
from PIL import Image

#####################################
# Class that takes the input instance masks
# and extracts bounding boxes on the fly
#####################################
class AppleDataset(object):
    def __init__(self, root_dir, transforms):
        self.root_dir = root_dir
        self.transforms = transforms

        # Load all image and mask files, sorting them to ensure they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root_dir, "images"))))
        self.masks = list(sorted(os.listdir(os.path.join(root_dir, "masks"))))

    def __getitem__(self, idx):
        # Load images and masks
        img_path = os.path.join(self.root_dir, "images", self.imgs[idx])
        mask_path = os.path.join(self.root_dir, "masks", self.masks[idx])

        img = Image.open(img_path).convert("RGB")
        mask = Image.open(mask_path)     # Each color of mask corresponds to a different instance with 0 being the background

        # Convert the PIL image to np array
        mask = np.array(mask)
        obj_ids = np.unique(mask)

        # Remove background id
        obj_ids = obj_ids[1:]

        # Split the color-encoded masks into a set of binary masks
        masks = mask == obj_ids[:, None, None]

        # Get bbox coordinates for each mask
        num_objs = len(obj_ids)
        boxes = []
        h, w = mask.shape
        for ii in range(num_objs):
            pos = np.where(masks[ii])
            xmin = np.min(pos[1])
            xmax = np.max(pos[1])
            ymin = np.min(pos[0])
            ymax = np.max(pos[0])

            if xmin == xmax or ymin == ymax:
                continue

            xmin = np.clip(xmin, a_min=0, a_max=w)
            xmax = np.clip(xmax, a_min=0, a_max=w)
            ymin = np.clip(ymin, a_min=0, a_max=h)
            ymax = np.clip(ymax, a_min=0, a_max=h)
            boxes.append([xmin, ymin, xmax, ymax])

        # Convert everything into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)

        # There is only one class (apples)
        labels = torch.ones((num_objs,), dtype=torch.int64)
        masks = torch.as_tensor(masks, dtype=torch.uint8)

        image_id = torch.tensor([idx])
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # All instances are not crowd
        iscrowd = torch.zeros((num_objs,), dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["masks"] = masks
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd

        if self.transforms is not None:
            img, target = self.transforms(img, target)

        return img, target

    def __len__(self):
        return len(self.imgs)

    def get_img_name(self, idx):
        return self.imgs[idx]

How do I fix the list index out of range problem? Or rather, what is the underlying issue here that needs to be fixed?

EDIT 1: OK, so what is happening here is that I have two folders, 'train' and 'test'. The train folder contains both images and masks, while the test folder contains only images. apple_dataset.py is written so that it looks for a masks folder in both the train and test folders. I think I need to change the code so that it only looks for masks in the train folder and not in the test set.
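For reference, below is a minimal sketch (not the repository's actual code) of how the dataset class could be adapted so that a split without a masks folder still loads. The has_masks flag and the minimal target dict are assumptions made for illustration only.

import os

import torch
from PIL import Image


class AppleDatasetSketch(object):
    # Sketch: only look up mask files when the split actually provides them
    def __init__(self, root_dir, transforms):
        self.root_dir = root_dir
        self.transforms = transforms
        self.imgs = list(sorted(os.listdir(os.path.join(root_dir, "images"))))

        # Assumption: a split without a "masks" directory (e.g. test) has no ground truth
        mask_dir = os.path.join(root_dir, "masks")
        self.has_masks = os.path.isdir(mask_dir)
        self.masks = list(sorted(os.listdir(mask_dir))) if self.has_masks else []

    def __getitem__(self, idx):
        img_path = os.path.join(self.root_dir, "images", self.imgs[idx])
        img = Image.open(img_path).convert("RGB")

        if not self.has_masks:
            # No masks available: return the image with a minimal target
            target = {"image_id": torch.tensor([idx])}
            if self.transforms is not None:
                img, target = self.transforms(img, target)
            return img, target

        # Otherwise build boxes, labels and masks exactly as in the original
        # __getitem__ shown above
        ...

    def __len__(self):
        return len(self.imgs)

Note that evaluate() in engine.py builds a COCO ground-truth index from the dataset's targets (that is exactly the call that crashed), so with a mask-less test split the evaluate() call would also need to be skipped or replaced with plain inference.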

2 Answers:

Answer 0 (score: 0)

The error tells you that you are trying to access an index in the list self.masks that does not exist. The problem is in this line: mask_path = os.path.join(self.root_dir, "masks", self.masks[idx]). You just need to check the value of idx each time it is passed in, so that you can find out what is going wrong.
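For example, a quick check along these lines (the dataset root is copied from the command line in the question and is only an assumption) shows whether self.imgs and self.masks can even be the same length:

import os

# Assumed dataset root, taken from the --data_path argument above
root = r'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\datasets\apples-minneapple'

for split in ('train', 'test'):
    img_dir = os.path.join(root, split, 'images')
    mask_dir = os.path.join(root, split, 'masks')
    n_imgs = len(os.listdir(img_dir)) if os.path.isdir(img_dir) else 0
    n_masks = len(os.listdir(mask_dir)) if os.path.isdir(mask_dir) else 0
    print('{}: {} images, {} masks'.format(split, n_imgs, n_masks))

If the test split reports fewer masks than images (or no masks folder at all), any idx at or beyond the number of masks will raise exactly this IndexError.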

Answer 1 (score: 0)

Solved this by creating a dummy folder called "masks" inside the "test" folder. Just copy all the mask files from "train" and paste them in there. The training and prediction scripts don't actually use them, so there shouldn't be any problem here.
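A minimal sketch of that copy step (the dataset root is again an assumption taken from the command line in the question):

import os
import shutil

# Assumed dataset root, taken from the --data_path argument above
root = r'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\datasets\apples-minneapple'
src = os.path.join(root, 'train', 'masks')
dst = os.path.join(root, 'test', 'masks')

os.makedirs(dst, exist_ok=True)
for name in os.listdir(src):
    shutil.copy(os.path.join(src, name), os.path.join(dst, name))

Keep in mind that these copied masks do not correspond to the test images, so any COCO metrics that evaluate() reports against them are meaningless; the workaround only prevents the crash.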

Also see this question for the further changes that need to be made.