How do I train a Mask R-CNN model on a custom dataset?

Time: 2021-04-08 15:09:51

Tags: keras deep-learning computer-vision conv-neural-network data-science

I am using Matterport's Mask R-CNN implementation to train a model with transfer learning.

My data is in the following format:

 {  
 'annotations':[ {'area': 843.3600000000006,
   'bbox': [62.36, 505.3, 60.24, 14.0],
   'category_id': 2,
   'id': 3453690,
   'image_id': 354610,
   'iscrowd': 0,
   'segmentation': [[62.36,
     505.3,
     122.6,
     505.3,
     122.6,
     519.3,
     62.36,
     519.3,
     62.36,
     505.3]]},],

 'categories': [{'id': 1, 'name': 'car', 'supercategory': ''},
  {'id': 2, 'name': 'bike', 'supercategory': ''},
  {'id': 3, 'name': 'chair', 'supercategory': ''},
  {'id': 4, 'name': 'pen', 'supercategory': ''},
  {'id': 5, 'name': 'table', 'supercategory': ''}],

 'images': [{'file_name': 'image1.jpg',
   'height': 794,
   'id': 348952,
   'width': 596},]
   }
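
Since every entry in 'annotations' points at its image through 'image_id', it can help to group the annotations once before writing any Dataset methods. The following is a minimal sketch, assuming the dictionary has the layout shown above; group_by_image and the trimmed samples dictionary are hypothetical names and made-up values for illustration, not part of the Matterport code.

```python
from collections import defaultdict


def group_by_image(samples):
    """Map each image id to the list of annotations that reference it."""
    grouped = defaultdict(list)
    for annotation in samples["annotations"]:
        grouped[annotation["image_id"]].append(annotation)
    return dict(grouped)


# Trimmed-down, made-up sample in the same layout as the data above.
samples = {
    "annotations": [
        {"id": 1, "image_id": 10, "category_id": 2},
        {"id": 2, "image_id": 10, "category_id": 5},
        {"id": 3, "image_id": 11, "category_id": 1},
    ],
    "images": [
        {"id": 10, "file_name": "a.jpg"},
        {"id": 11, "file_name": "b.jpg"},
    ],
}

by_image = group_by_image(samples)
# Number of annotated instances per image id
print({image_id: len(anns) for image_id, anns in by_image.items()})
```

With such a lookup, the instance count for one image is simply the length of its annotation list, which is also the size of the last axis of the mask array.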

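I have modified the Config class, but I am struggling to adapt the load_mask method of the Dataset class so that the data passes through the Mask R-CNN network and the model performs well on segmentation, bounding-box prediction, and class prediction. Here is the code: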

import json
import os

import skimage.draw
from numpy import asarray, zeros

os.chdir("./Mask_RCNN")

from mrcnn.config import Config
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log
from mrcnn.utils import Dataset

with open('labels/samples.json', 'r') as fp:
    samples = json.load(fp)

class Modified(Dataset):

    # load the dataset definitions
    def load_dataset(self, dataset_dir, is_train=True):
        
        # Add the five classes; the ids match 'category_id' in the annotations.
        self.add_class("dataset", 1, "car")
        self.add_class("dataset", 2, "bike")
        self.add_class("dataset", 3, "chair")
        self.add_class("dataset", 4, "pen")
        self.add_class("dataset", 5, "table")

        # define data locations for images and annotations
        images_dir = dataset_dir + '/images/'
        annotations_dir = dataset_dir + '/labels/'

        # map file names to image ids once, instead of rescanning per file
        ids_by_name = {img['file_name']: img['id'] for img in samples['images']}

        # iterate through all files in the folder and register each image
        for filename in os.listdir(images_dir):

            # skip files that have no entry in the annotation file
            if filename not in ids_by_name:
                continue
            image_id = ids_by_name[filename]

            # setting image file
            img_path = images_dir + filename

            # setting annotations file
            ann_path = annotations_dir + 'samples.json'

            # adding images and annotations to dataset
            self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path)


    # extract bounding boxes from the annotations;
    # image_id here is the 'id' of an entry in samples['images']
    def extract_boxes(self, image_id):

        boxes = list()
        category_id = list()

        for annotation in samples['annotations']:
            if annotation['image_id'] == image_id:
                boxes.append(annotation['bbox'])
                category_id.append(annotation['category_id'])

        # extract image dimensions
        width = 0
        height = 0
        for image in samples['images']:
            if image['id'] == image_id:
                width = image['width']
                height = image['height']
                break

        return boxes, width, height, category_id

    # load the masks for an image
    def load_mask(self, image_id):
        """Generate instance masks for an image.
        Returns:
            masks: A bool array of shape [height, width, instance count] with
                one mask per instance.
            class_ids: a 1D array of class IDs of the instance masks.
        """
        # image_id is the dataset's internal index; the id passed to
        # add_image is stored under 'id' in image_info
        source_id = self.image_info[image_id]['id']
        boxes, w, h, category_id = self.extract_boxes(source_id)

        annotations = [a for a in samples['annotations']
                       if a['image_id'] == source_id]
        masks = zeros([h, w, len(annotations)], dtype=bool)
        class_ids = list()

        for i, annotation in enumerate(annotations):
            # segmentation is a flat [x0, y0, x1, y1, ...] list
            points = annotation['segmentation'][0]
            all_points_x = points[0::2]
            all_points_y = points[1::2]

            # shape= clips polygon pixels that fall outside the image
            rr, cc = skimage.draw.polygon(all_points_y, all_points_x, shape=(h, w))
            masks[rr, cc, i] = True
            class_ids.append(category_id[i])

        return masks, asarray(class_ids, dtype='int32')

# load an image reference
    def image_reference(self, image_id):
        info = self.image_info[image_id]
        print(info)
        return info['path']
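
One detail worth checking in isolation is how the flat 'segmentation' list is split into x and y coordinates. Selecting elements by points.index(e) goes wrong as soon as a value occurs more than once, because index() always returns the first position of that value. The sketch below, assuming the flat [x0, y0, x1, y1, ...] layout from the data above, uses slicing instead; split_polygon and the square polygon are hypothetical illustrations.

```python
def split_polygon(points):
    """Split a flat [x0, y0, x1, y1, ...] list into (xs, ys)."""
    if len(points) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return points[0::2], points[1::2]


# Made-up square whose x and y values overlap (10.0 and 20.0 appear in both).
square = [10.0, 10.0, 20.0, 10.0, 20.0, 20.0, 10.0, 20.0]

xs, ys = split_polygon(square)

# index()-based selection, shown only to illustrate the failure mode:
# every value first occurs at an even position, so all of them count as "x".
xs_by_index = [e for e in square if square.index(e) % 2 == 0]
ys_by_index = [e for e in square if square.index(e) % 2 != 0]

print(xs, ys)
print(xs_by_index, ys_by_index)
```

Slicing depends only on position, so repeated coordinate values do not affect the result.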

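How should I change the load_mask method? What exactly should it return? And which method does the code use to pass the bounding boxes to the network?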
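
On the bounding-box question: as far as the Matterport code goes, the data generator appears to derive the boxes from the masks returned by load_mask (through utils.extract_bboxes), in [y1, x1, y2, x2] order, so the 'bbox' field of the annotations is not fed to the network directly. The sketch below, assuming COCO-style [x, y, width, height] boxes, converts an annotated box to that order so it can be compared against a box derived from a mask; coco_to_yxyx is a hypothetical helper name.

```python
def coco_to_yxyx(bbox):
    """Convert [x, y, width, height] to (y1, x1, y2, x2)."""
    x, y, width, height = bbox
    return (y, x, y + height, x + width)


# 'bbox' value taken from the annotation shown in the data format above
y1, x1, y2, x2 = coco_to_yxyx([62.36, 505.3, 60.24, 14.0])
print(y1, x1, round(y2, 2), round(x2, 2))
```

If the converted box and the box computed from the mask disagree by more than a pixel or so, the polygon was probably rasterized with x and y swapped.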

Thanks in advance.

0 answers:

No answers yet