Augmenting a small dataset for object detection with fully convolutional networks

Time: 2018-04-08 09:15:49

Tags: python tensorflow keras object-detection

This Keras blog explains nicely how to augment a small dataset with the following code:

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

I am sure the vanilla example from the blog works well for similarly simple scenarios.

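In a more complex scenario, I would like to take the weights of a model pretrained on the well-known COCO dataset for object detection and use transfer learning to teach it a new class, for which I have only a very limited amount of data (<= 1000).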

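The labels in such a dataset are not per image but per object within an image. That is, each image can contain one or more objects outlined by polygonal bounding boxes, and each bounding box is labeled with the name of the object it contains. This more complex label information is encoded in JSON format, as in the following example: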

{
"info": {
    "year": 2018,
    "version": null,
    "description": "Peaches",
    "contributor": "ralph@r4robotics.com.au",
    "url": "labelbox.io",
    "date_created": "2018-04-07T10:08:51.409340+00:00"
},
"images": [{
    "id": "cjfp6vz7xfwz20198ixce9la4",
    "width": 274,
    "height": 184,
    "file_name": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach8.jpg?alt=media&token=11337eaa-4ffd-4dfb-b3ec-9c4ee6bd2f17",
    "license": null,
    "flickr_url": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach8.jpg?alt=media&token=11337eaa-4ffd-4dfb-b3ec-9c4ee6bd2f17",
    "coco_url": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach8.jpg?alt=media&token=11337eaa-4ffd-4dfb-b3ec-9c4ee6bd2f17",
    "date_captured": null
}, {
    "id": "cjfp6wqfhfwyu0107il09db3p",
    "width": 275,
    "height": 183,
    "file_name": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach9.jpg?alt=media&token=39dd5e97-c411-43e9-9ba3-9f51a334c7c7",
    "license": null,
    "flickr_url": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach9.jpg?alt=media&token=39dd5e97-c411-43e9-9ba3-9f51a334c7c7",
    "coco_url": "https://firebasestorage.googleapis.com/v0/b/labelbox-193903.appspot.com/o/cjfp6hjghfuvd01147d130984%2F5a7fdf5d-201a-40d0-bfef-c36d6ed02212%2Fpeach9.jpg?alt=media&token=39dd5e97-c411-43e9-9ba3-9f51a334c7c7",
    "date_captured": null
}],
"annotations": [ {
    "id": 23,
    "image_id": "cjfp6vz7xfwz20198ixce9la4",
    "category_id": 1,
    "segmentation": [
        [31.0, 72.0, 63.0, 84.0, 75.0, 105.0, 67.0, 134.0, 68.0, 158.0, 44.0, 174.0, 24.0, 178.0, 2.0, 172.0, 2.0, 82.0, 31.0, 72.0]
    ],
    "area": 6301.0,
    "bbox": [2.0, 6.0, 73.0, 106.0],
    "iscrowd": 0
}, {
    "id": 24,
    "image_id": "cjfp6vz7xfwz20198ixce9la4",
    "category_id": 1,
    "segmentation": [
        [75.0, 103.0, 108.0, 76.0, 137.0, 74.0, 166.0, 89.0, 182.0, 104.0, 188.0, 145.0, 179.0, 171.0, 167.0, 183.0, 92.0, 183.0, 72.0, 158.0, 68.0, 134.0, 75.0, 103.0]
    ],
    "area": 10652.5,
    "bbox": [68.0, 1.0, 120.0, 109.0],
    "iscrowd": 0
}, {
    "id": 25,
    "image_id": "cjfp6vz7xfwz20198ixce9la4",
    "category_id": 1,
    "segmentation": [
        [169.0, 92.0, 182.0, 66.0, 211.0, 53.0, 246.0, 66.0, 262.0, 80.0, 268.0, 95.0, 261.0, 129.0, 241.0, 145.0, 216.0, 153.0, 188.0, 143.0, 184.0, 105.0, 169.0, 92.0]
    ],
    "area": 6838.5,
    "bbox": [169.0, 31.0, 99.0, 100.0],
    "iscrowd": 0
}, {
    "id": 26,
    "image_id": "cjfp6wqfhfwyu0107il09db3p",
    "category_id": 1,
    "segmentation": [
        [86.0, 54.0, 109.0, 56.0, 119.0, 73.0, 113.0, 92.0, 93.0, 101.0, 76.0, 92.0, 70.0, 77.0, 71.0, 63.0, 86.0, 54.0]
    ],
    "area": 1715.0,
    "bbox": [70.0, 82.0, 49.0, 47.0],
    "iscrowd": 0
}, {
    "id": 27,
    "image_id": "cjfp6wqfhfwyu0107il09db3p",
    "category_id": 1,
    "segmentation": [
        [117.0, 95.0, 123.0, 110.0, 136.0, 118.0, 153.0, 113.0, 159.0, 99.0, 158.0, 87.0, 145.0, 79.0, 132.0, 76.0, 123.0, 84.0, 117.0, 95.0]
    ],
    "area": 1260.0,
    "bbox": [117.0, 65.0, 42.0, 42.0],
    "iscrowd": 0
}, {
    "id": 28,
    "image_id": "cjfp6wqfhfwyu0107il09db3p",
    "category_id": 1,
    "segmentation": [
        [109.0, 54.0, 115.0, 40.0, 133.0, 32.0, 146.0, 34.0, 157.0, 43.0, 161.0, 58.0, 152.0, 72.0, 133.0, 76.0, 119.0, 71.0, 109.0, 54.0]
    ],
    "area": 1660.5,
    "bbox": [109.0, 107.0, 52.0, 44.0],
    "iscrowd": 0
}],
"licenses": [],
"categories": [{
    "supercategory": "Peach",
    "id": 1,
    "name": "Peach"
}]

}
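To make the per-object label structure concrete, here is a minimal sketch of grouping the annotations by the image they belong to. The helper name group_annotations_by_image and the shortened ids img_a and img_b are made up for illustration; only the keys "annotations", "image_id", "category_id" and "bbox" come from the JSON above.

```python
import json
from collections import defaultdict


def group_annotations_by_image(coco):
    """Map each image id to the list of annotation dicts that reference it."""
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        per_image[ann["image_id"]].append(ann)
    return dict(per_image)


# A trimmed-down stand-in for the export above (ids shortened for readability).
sample = json.loads("""
{
  "images": [{"id": "img_a", "width": 274, "height": 184},
             {"id": "img_b", "width": 275, "height": 183}],
  "annotations": [
    {"id": 23, "image_id": "img_a", "category_id": 1, "bbox": [2.0, 6.0, 73.0, 106.0]},
    {"id": 24, "image_id": "img_a", "category_id": 1, "bbox": [68.0, 1.0, 120.0, 109.0]},
    {"id": 26, "image_id": "img_b", "category_id": 1, "bbox": [70.0, 82.0, 49.0, 47.0]}
  ],
  "categories": [{"id": 1, "name": "Peach"}]
}
""")

grouped = group_annotations_by_image(sample)
for image_id, anns in grouped.items():
    # One image can carry several labeled objects.
    print(image_id, [a["id"] for a in anns])
```

Any augmentation step would then have to process an image together with its whole list of annotations, not with a single class label.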

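Obviously, augmentation is much more involved in this case, because not only the images have to be warped and rotated, but the bounding boxes as well.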
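To illustrate what "transforming the boxes as well" means, here is a rough sketch for the simplest case, a horizontal flip, written with plain NumPy and not with ImageDataGenerator. The function names are hypothetical, and the sketch assumes continuous pixel coordinates (so x is mirrored as width - x) and the COCO box layout [x, y, width, height]. The box is recomputed from the transformed polygon, not transformed directly.

```python
import numpy as np


def hflip_polygon(flat_xy, image_width):
    """Mirror a flat [x0, y0, x1, y1, ...] polygon about the vertical axis."""
    pts = np.array(flat_xy, dtype=float).reshape(-1, 2)  # copy, shape (n, 2)
    pts[:, 0] = image_width - pts[:, 0]  # assumption: continuous coordinates
    return pts.reshape(-1).tolist()


def bbox_from_polygon(flat_xy):
    """Axis-aligned [x, y, width, height] box enclosing the polygon."""
    pts = np.array(flat_xy, dtype=float).reshape(-1, 2)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return [float(x_min), float(y_min),
            float(x_max - x_min), float(y_max - y_min)]


def hflip_sample(image, polygons):
    """Flip an H x W x C image and its polygons together; rebuild the boxes."""
    flipped_image = image[:, ::-1]
    width = image.shape[1]
    flipped_polys = [hflip_polygon(p, width) for p in polygons]
    boxes = [bbox_from_polygon(p) for p in flipped_polys]
    return flipped_image, flipped_polys, boxes


# Dummy image with the size of the first sample and a simplified polygon.
image = np.zeros((184, 274, 3), dtype=np.uint8)
polygons = [[31.0, 72.0, 63.0, 84.0, 75.0, 105.0, 2.0, 82.0]]

flipped_image, flipped_polys, boxes = hflip_sample(image, polygons)
print(flipped_polys[0])
print(boxes[0])
```

Rotation, shear and zoom would follow the same pattern: apply the same geometric transform to the polygon vertices as to the pixels, then take the enclosing box of the result.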

Is there a way to do this with Keras?