I am trying to train Mask R-CNN to detect and segment apples, using the dataset from this paper:
github link to code being used
I am simply following the instructions provided in the README. This is the output on the console:
(venv) PS > python train_rcnn.py --data_path 'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\datasets\apples-minneapple' --model mrcnn --epochs 50 --output-dir 'D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples'
mrcnn
Namespace(batch_size=2, data_path='D:\\Research Report\\tensorflow\\Mask_RCNN-TRIALS\\Mask_RCNN-master\\datasets\\apples-minneapple', dataset='AppleDataset', device='cuda', epochs=50, lr=0.02, lr_gamma=0.1, lr_step_size=8, lr_steps=[8, 11], model='mrcnn', momentum=0.9, output_dir='D:\\Research Report\\tensorflow\\Mask_RCNN-TRIALS\\Mask_RCNN-master\\samples\\apples', print_freq=20, resume='', weight_decay=0.0001, workers=4)
Loading data
Creating data loaders
Creating model
Start training
Epoch: [0] [ 0/335] eta: 1:00:28 lr: 0.000080 loss: 2.4100 (2.4100) loss_classifier: 0.8481 (0.8481) loss_box_reg: 0.4164 (0.4164) loss_objectness: 0.9299 (0.9299) loss_rpn_box_reg: 0.2157 (0.2157) time: 10.8327 data: 7.9925 max mem: 2733
Epoch: [0] [ 20/335] eta: 0:06:18 lr: 0.001276 loss: 1.4465 (1.4728) loss_classifier: 0.5526 (0.5496) loss_box_reg: 0.3586 (0.3572) loss_objectness: 0.2666 (0.3418) loss_rpn_box_reg: 0.2233 (0.2242) time: 0.7204 data: 0.0132 max mem: 3247
Epoch: [0] [ 40/335] eta: 0:04:48 lr: 0.002473 loss: 0.9622 (1.2287) loss_classifier: 0.2927 (0.4276) loss_box_reg: 0.3188 (0.3314) loss_objectness: 0.1422 (0.2491) loss_rpn_box_reg: 0.2168 (0.2207) time: 0.7408 data: 0.0210 max mem: 3282
Epoch: [0] [ 60/335] eta: 0:04:05 lr: 0.003669 loss: 0.7924 (1.0887) loss_classifier: 0.2435 (0.3654) loss_box_reg: 0.2361 (0.2983) loss_objectness: 0.1289 (0.2105) loss_rpn_box_reg: 0.1898 (0.2144) time: 0.7244 data: 0.0127 max mem: 3432
Epoch: [0] [ 80/335] eta: 0:03:37 lr: 0.004865 loss: 0.7438 (1.0117) loss_classifier: 0.2565 (0.3376) loss_box_reg: 0.2193 (0.2799) loss_objectness: 0.0776 (0.1835) loss_rpn_box_reg: 0.1983 (0.2108) time: 0.7217 data: 0.0127 max mem: 3432
Epoch: [0] [100/335] eta: 0:03:14 lr: 0.006062 loss: 0.7373 (0.9490) loss_classifier: 0.2274 (0.3156) loss_box_reg: 0.2193 (0.2654) loss_objectness: 0.0757 (0.1643) loss_rpn_box_reg: 0.1867 (0.2037) time: 0.7291 data: 0.0132 max mem: 3432
Epoch: [0] [120/335] eta: 0:02:54 lr: 0.007258 loss: 0.8275 (0.9243) loss_classifier: 0.2689 (0.3094) loss_box_reg: 0.2315 (0.2602) loss_objectness: 0.0867 (0.1539) loss_rpn_box_reg: 0.1883 (0.2008) time: 0.7270 data: 0.0134 max mem: 3432
Epoch: [0] [140/335] eta: 0:02:35 lr: 0.008455 loss: 0.7886 (0.9057) loss_classifier: 0.2573 (0.3029) loss_box_reg: 0.2246 (0.2539) loss_objectness: 0.0724 (0.1455) loss_rpn_box_reg: 0.2459 (0.2035) time: 0.7170 data: 0.0124 max mem: 3432
Epoch: [0] [160/335] eta: 0:02:17 lr: 0.009651 loss: 0.7588 (0.8878) loss_classifier: 0.2341 (0.2948) loss_box_reg: 0.2226 (0.2486) loss_objectness: 0.1032 (0.1427) loss_rpn_box_reg: 0.2020 (0.2016) time: 0.7139 data: 0.0118 max mem: 3432
Epoch: [0] [180/335] eta: 0:02:01 lr: 0.010847 loss: 0.7340 (0.8744) loss_classifier: 0.2331 (0.2898) loss_box_reg: 0.2120 (0.2441) loss_objectness: 0.1086 (0.1392) loss_rpn_box_reg: 0.1993 (0.2012) time: 0.7800 data: 0.0584 max mem: 3432
Epoch: [0] [200/335] eta: 0:01:45 lr: 0.012044 loss: 0.8106 (0.8694) loss_classifier: 0.2616 (0.2873) loss_box_reg: 0.2208 (0.2411) loss_objectness: 0.1117 (0.1397) loss_rpn_box_reg: 0.1927 (0.2014) time: 0.7344 data: 0.0143 max mem: 3432
Epoch: [0] [220/335] eta: 0:01:29 lr: 0.013240 loss: 0.8191 (0.8610) loss_classifier: 0.2581 (0.2848) loss_box_reg: 0.2140 (0.2382) loss_objectness: 0.0860 (0.1362) loss_rpn_box_reg: 0.2177 (0.2018) time: 0.7213 data: 0.0126 max mem: 3432
Epoch: [0] [240/335] eta: 0:01:13 lr: 0.014437 loss: 0.7890 (0.8590) loss_classifier: 0.2671 (0.2842) loss_box_reg: 0.2094 (0.2357) loss_objectness: 0.1175 (0.1360) loss_rpn_box_reg: 0.2256 (0.2030) time: 0.7576 data: 0.0564 max mem: 3432
Epoch: [0] [260/335] eta: 0:00:57 lr: 0.015633 loss: 0.8631 (0.8587) loss_classifier: 0.2900 (0.2849) loss_box_reg: 0.2089 (0.2337) loss_objectness: 0.0925 (0.1350) loss_rpn_box_reg: 0.2271 (0.2050) time: 0.7371 data: 0.0220 max mem: 3432
Epoch: [0] [280/335] eta: 0:00:42 lr: 0.016830 loss: 0.8464 (0.8580) loss_classifier: 0.2679 (0.2840) loss_box_reg: 0.2156 (0.2321) loss_objectness: 0.0940 (0.1346) loss_rpn_box_reg: 0.2345 (0.2073) time: 0.7379 data: 0.0143 max mem: 3432
Epoch: [0] [300/335] eta: 0:00:27 lr: 0.018026 loss: 0.7991 (0.8519) loss_classifier: 0.2485 (0.2819) loss_box_reg: 0.2125 (0.2305) loss_objectness: 0.0819 (0.1315) loss_rpn_box_reg: 0.2217 (0.2080) time: 0.8549 data: 0.1419 max mem: 3450
Epoch: [0] [320/335] eta: 0:00:11 lr: 0.019222 loss: 0.6906 (0.8432) loss_classifier: 0.2362 (0.2791) loss_box_reg: 0.2036 (0.2285) loss_objectness: 0.0662 (0.1285) loss_rpn_box_reg: 0.1801 (0.2070) time: 0.7257 data: 0.0238 max mem: 3450
Epoch: [0] [334/335] eta: 0:00:00 lr: 0.020000 loss: 0.7822 (0.8441) loss_classifier: 0.2501 (0.2785) loss_box_reg: 0.2224 (0.2285) loss_objectness: 0.1135 (0.1296) loss_rpn_box_reg: 0.1948 (0.2075) time: 0.7249 data: 0.0139 max mem: 3450
Epoch: [0] Total time: 0:04:18 (0.7707 s / it)
Traceback (most recent call last):
  File "train_rcnn.py", line 143, in <module>
    main(args)
  File "train_rcnn.py", line 109, in main
    evaluate(model, data_loader_test, device=device)
  File "C:\Users\___\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\autograd\grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\engine.py", line 78, in evaluate
    coco = get_coco_api_from_dataset(data_loader.dataset)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\coco_utils.py", line 205, in get_coco_api_from_dataset
    return convert_to_coco_api(dataset)
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\utility\coco_utils.py", line 154, in convert_to_coco_api
    img, targets = ds[img_idx]
  File "D:\Research Report\tensorflow\Mask_RCNN-TRIALS\Mask_RCNN-master\samples\apples\data\apple_dataset.py", line 22, in __getitem__
    mask_path = os.path.join(self.root_dir, "masks", self.masks[idx])
IndexError: list index out of range
This is the file that is run to train the network:
import datetime
import os
import time
import torch
import torch.utils.data
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
from data.apple_dataset import AppleDataset
from utility.engine import train_one_epoch, evaluate
import utility.utils as utils
import utility.transforms as T
######################################################
# Train either a Faster-RCNN or Mask-RCNN predictor
# using the MinneApple dataset
######################################################
def get_transform(train):
    transforms = []
    transforms.append(T.ToTensor())
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)
def get_maskrcnn_model_instance(num_classes):
    # load an instance segmentation model pre-trained on COCO
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

    # get the number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # now get the number of input features for the mask classifier
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    hidden_layer = 256
    # and replace the mask predictor with a new one
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, hidden_layer, num_classes)
    return model
def get_frcnn_model_instance(num_classes):
    # load an object detection model pre-trained on COCO
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    # get the number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
def main(args):
    print(args)
    device = args.device

    # Data loading code
    print("Loading data")
    num_classes = 2
    dataset = AppleDataset(os.path.join(args.data_path, 'train'), get_transform(train=True))
    dataset_test = AppleDataset(os.path.join(args.data_path, 'test'), get_transform(train=False))

    print("Creating data loaders")
    data_loader = torch.utils.data.DataLoader(dataset, batch_size=args.batch_size, shuffle=True,
                                              num_workers=args.workers, collate_fn=utils.collate_fn)
    data_loader_test = torch.utils.data.DataLoader(dataset_test, batch_size=1,
                                                   shuffle=False, num_workers=args.workers,
                                                   collate_fn=utils.collate_fn)

    print("Creating model")
    # Create the correct model type
    if args.model == 'maskrcnn':
        model = get_maskrcnn_model_instance(num_classes)
    else:
        model = get_frcnn_model_instance(num_classes)

    # Move model to the right device
    model.to(device)

    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay)
    # lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=args.lr_step_size, gamma=args.lr_gamma)
    lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=args.lr_steps, gamma=args.lr_gamma)

    if args.resume:
        checkpoint = torch.load(args.resume, map_location='cpu')
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        lr_scheduler.load_state_dict(checkpoint['lr_scheduler'])

    print("Start training")
    start_time = time.time()
    for epoch in range(args.epochs):
        train_one_epoch(model, optimizer, data_loader, device, epoch, args.print_freq)
        lr_scheduler.step()
        if args.output_dir:
            torch.save(model.state_dict(), os.path.join(args.output_dir, 'model_{}.pth'.format(epoch)))

        # evaluate after every epoch
        evaluate(model, data_loader_test, device=device)

    total_time = time.time() - start_time
    total_time_str = str(datetime.timedelta(seconds=int(total_time)))
    print('Training time {}'.format(total_time_str))
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description='PyTorch Detection Training')
    parser.add_argument('--data_path', default='~~~~', help='dataset')
    parser.add_argument('--dataset', default='AppleDataset', help='dataset')
    parser.add_argument('--model', default='maskrcnn', help='model')
    parser.add_argument('--device', default='cuda', help='device')
    parser.add_argument('-b', '--batch-size', default=2, type=int)
    parser.add_argument('--epochs', default=13, type=int, metavar='N', help='number of total epochs to run')
    parser.add_argument('-j', '--workers', default=4, type=int, metavar='N', help='number of data loading workers (default: 16)')
    parser.add_argument('--lr', default=0.02, type=float, help='initial learning rate')
    parser.add_argument('--momentum', default=0.9, type=float, metavar='M', help='momentum')
    parser.add_argument('--wd', '--weight-decay', default=1e-4, type=float, metavar='W', help='weight decay (default: 1e-4)', dest='weight_decay')
    parser.add_argument('--lr-step-size', default=8, type=int, help='decrease lr every step-size epochs')
    parser.add_argument('--lr-steps', default=[8, 11], nargs='+', type=int, help='decrease lr every step-size epochs')
    parser.add_argument('--lr-gamma', default=0.1, type=float, help='decrease lr by a factor of lr-gamma')
    parser.add_argument('--print-freq', default=20, type=int, help='print frequency')
    parser.add_argument('--output-dir', default='.', help='path where to save')
    parser.add_argument('--resume', default='', help='resume from checkpoint')

    args = parser.parse_args()
    print(args.model)
    assert(args.model in ['mrcnn', 'frcnn'])

    if args.output_dir:
        utils.mkdir(args.output_dir)

    main(args)
apple_dataset.py is as follows:
import os
import numpy as np
import torch
from PIL import Image
#####################################
# Class that takes the input instance masks
# and extracts bounding boxes on the fly
#####################################
class AppleDataset(object):
    def __init__(self, root_dir, transforms):
        self.root_dir = root_dir
        self.transforms = transforms

        # Load all image and mask files, sorting them to ensure they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root_dir, "images"))))
        self.masks = list(sorted(os.listdir(os.path.join(root_dir, "masks"))))

    def __getitem__(self, idx):
        # Load images and masks
        img_path = os.path.join(self.root_dir, "images", self.imgs[idx])
        mask_path = os.path.join(self.root_dir, "masks", self.masks[idx])
        img = Image.open(img_path).convert("RGB")
        mask = Image.open(mask_path)   # Each color of mask corresponds to a different instance with 0 being the background

        # Convert the PIL image to np array
        mask = np.array(mask)
        obj_ids = np.unique(mask)

        # Remove background id
        obj_ids = obj_ids[1:]

        # Split the color-encoded masks into a set of binary masks
        masks = mask == obj_ids[:, None, None]

        # Get bbox coordinates for each mask
        num_objs = len(obj_ids)
        boxes = []
        h, w = mask.shape
        for ii in range(num_objs):
            pos = np.where(masks[ii])
            xmin = np.min(pos[1])
            xmax = np.max(pos[1])
            ymin = np.min(pos[0])
            ymax = np.max(pos[0])
            if xmin == xmax or ymin == ymax:
                continue

            xmin = np.clip(xmin, a_min=0, a_max=w)
            xmax = np.clip(xmax, a_min=0, a_max=w)
            ymin = np.clip(ymin, a_min=0, a_max=h)
            ymax = np.clip(ymax, a_min=0, a_max=h)
            boxes.append([xmin, ymin, xmax, ymax])

        # Convert everything into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)

        # There is only one class (apples)
        labels = torch.ones((num_objs,), dtype=torch.int64)
        masks = torch.as_tensor(masks, dtype=torch.uint8)

        image_id = torch.tensor([idx])
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # All instances are not crowd
        iscrowd = torch.zeros((num_objs,), dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["masks"] = masks
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd

        if self.transforms is not None:
            img, target = self.transforms(img, target)

        return img, target

    def __len__(self):
        return len(self.imgs)

    def get_img_name(self, idx):
        return self.imgs[idx]
How do I fix this index out of range error? Or rather, what is the underlying problem here that needs to be solved?
EDIT 1: OK, so what is happening here is that I have two folders, 'train' and 'test'. The train folder contains both images and masks, while the test folder contains only images. apple_dataset.py is written so that it looks for a masks folder under both train and test. I think I need to change the code so that it only looks for masks in the train folder and not in the test set; see the sketch below.
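One way to make that change is to have the dataset check whether the split actually ships with mask files before trying to index them. This is only a minimal sketch of the idea, not the repository's code (the has_masks attribute is my own addition). Note that even with this change, evaluate() still needs ground-truth annotations to build the COCO API, so this only moves the failure later; the cleaner fix is to run evaluation on a split that actually has masks, for example a validation set held out from train.

import os

class AppleDataset(object):
    def __init__(self, root_dir, transforms):
        self.root_dir = root_dir
        self.transforms = transforms
        self.imgs = list(sorted(os.listdir(os.path.join(root_dir, "images"))))

        # Only index mask files if this split actually has them;
        # per the EDIT above, the test folder contains only images.
        masks_dir = os.path.join(root_dir, "masks")
        self.has_masks = os.path.isdir(masks_dir) and len(os.listdir(masks_dir)) > 0
        self.masks = list(sorted(os.listdir(masks_dir))) if self.has_masks else []

__getitem__ would then need a matching "if self.has_masks" branch that skips the mask loading and returns only the image (or an empty target) for such splits.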
Answer 0 (score: 0)
The error is telling you that you are trying to access an index in the list self.masks that does not exist.

The problem is in this line:

mask_path = os.path.join(self.root_dir, "masks", self.masks[idx])

You just need to check the value of idx every time it is passed in, so you can find out what is going wrong.
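A quick sketch of that kind of check (the assert below is a temporary debugging aid I am adding to AppleDataset.__init__, not part of the original code): comparing the two list lengths up front fails fast instead of raising an IndexError deep inside evaluation.

    def __init__(self, root_dir, transforms):
        self.root_dir = root_dir
        self.transforms = transforms
        self.imgs = list(sorted(os.listdir(os.path.join(root_dir, "images"))))
        self.masks = list(sorted(os.listdir(os.path.join(root_dir, "masks"))))

        # Fail fast if images and masks cannot be indexed in lockstep;
        # on a split with an empty masks folder this triggers immediately.
        assert len(self.imgs) == len(self.masks), \
            "{}: {} images but {} masks".format(root_dir, len(self.imgs), len(self.masks))

Given EDIT 1, this assertion would fire for the test split, which points at the real cause: there are simply no mask files to index.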
Answer 1 (score: 0)
Solved this by creating a dummy folder called 'masks' inside the 'test' folder: just copy all the mask files from 'train' and paste them in there. The train and prediction scripts will not actually use them, so there should not be any problem.
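If you want to script that workaround, here is a one-off sketch; the paths are placeholders for wherever the dataset lives, and this is my suggestion rather than something from the repository. Be aware that the copied masks do not correspond to the test images, so any evaluation metrics computed against them are meaningless; this only silences the IndexError.

import shutil

# Mirror the training masks into the test split so AppleDataset
# finds something to index (placeholder paths; adjust to your setup).
src = r"path\to\apples-minneapple\train\masks"
dst = r"path\to\apples-minneapple\test\masks"
shutil.copytree(src, dst)   # dst must not already exist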
Also see this question for further changes that need to be made.