Cannot get AWS SageMaker to read a RecordIO file

Asked: 2019-07-15 01:22:24

Tags: object-detection mxnet amazon-sagemaker

I am trying to convert an object detection .lst file into a .rec file and train on it in SageMaker. My list looks like this:

10  2   5   9.0000  1008.0000   1774.0000   1324.0000   1953.0000   3.0000  2697.0000   3340.0000   948.0000    1559.0000   0.0000  0.0000  0.0000  0.0000  0.0000  IMG_1091.JPG
58  2   5   11.0000 1735.0000   2065.0000   1047.0000   1300.0000   6.0000  2444.0000   2806.0000   1194.0000   1482.0000   1.0000  2975.0000   3417.0000   1739.0000   2139.0000   IMG_7000.JPG
60  2   5   12.0000 1243.0000   1861.0000   1222.0000   1710.0000   6.0000  2423.0000   2971.0000   1205.0000   1693.0000   0.0000  0.0000  0.0000  0.0000  0.0000  IMG_7061.JPG
80  2   5   1.0000  1865.0000   2146.0000   818.0000    969.0000    14.0000 1559.0000   1918.0000   1658.0000   1914.0000   6.0000  2638.0000   3042.0000   2125.0000   2490.0000   IMG_9479.JPG
79  2   5   13.0000 1556.0000   1812.0000   1440.0000   1637.0000   7.0000  2216.0000   2452.0000   1595.0000   1816.0000   0.0000  0.0000  0.0000  0.0000  0.0000  IMG_9443.JPG

where the columns are:

index, header length, object length, class id, xmin, ymin, xmax, ymax, (repeat any other ids...), image path
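To make the column layout concrete, here is a minimal sketch of how one row could be split into its index, per-object groups, and image path. The function name `parse_lst_line` is hypothetical, and the sketch assumes whitespace-separated fields and an image path without spaces.

```python
def parse_lst_line(line):
    """Split one .lst row into (index, objects, image_path).

    Assumes whitespace-separated fields and no spaces in the image path.
    """
    fields = line.split()
    index = int(fields[0])
    header_width = int(fields[1])   # number of header columns (2 in the rows above)
    object_width = int(fields[2])   # values per object (5: class id plus four box values)
    image_path = fields[-1]

    # Object values start right after the index and the header columns.
    values = [float(v) for v in fields[1 + header_width:-1]]
    if len(values) % object_width != 0:
        raise ValueError("row {} has {} label values, not a multiple of {}".format(
            index, len(values), object_width))

    objects = [values[i:i + object_width]
               for i in range(0, len(values), object_width)]
    return index, objects, image_path


row = ("10 2 5 9.0000 1008.0000 1774.0000 1324.0000 1953.0000 "
       "3.0000 2697.0000 3340.0000 948.0000 1559.0000 "
       "0.0000 0.0000 0.0000 0.0000 0.0000 IMG_1091.JPG")
index, objects, image_path = parse_lst_line(row)
print(index, len(objects), image_path)
```

Running this over every row is a cheap way to confirm that each one carries a whole number of five-value object groups before it goes through im2rec.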

I then ran the list through im2rec:

$ /incubator-mxnet/tools/im2rec.py my_lst.lst my_image_folder

I then uploaded the resulting .rec file to S3.

Next, I extracted the necessary parts from this AWS sample notebook.

I think the only part that might matter is this:

def set_hyperparameters(num_epochs, lr_steps):
    num_classes = 16
    num_training_samples = 227
    print('num classes: {}, num training images: {}'.format(num_classes, num_training_samples))

    od_model.set_hyperparameters(base_network='resnet-50',
                                 use_pretrained_model=1,
                                 num_classes=num_classes,
                                 mini_batch_size=16,
                                 epochs=num_epochs,               
                                 learning_rate=0.001, 
                                 lr_scheduler_step=lr_steps,      
                                 lr_scheduler_factor=0.1,
                                 optimizer='sgd',
                                 momentum=0.9,
                                 weight_decay=0.0005,
                                 overlap_threshold=0.5,
                                 nms_threshold=0.45,
                                 image_shape=512,
                                 label_width=350,
                                 num_training_samples=num_training_samples)

set_hyperparameters(100, '33,67')

In the end I got the error: Not enough label packed in img_list or rec file.
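One way to see which labels actually ended up in the .rec file is to read the first record back and look at its header. The sketch below assumes MXNet is installed locally and uses `mx.recordio.MXRecordIO` and `mx.recordio.unpack`; the helper names `first_label` and `describe_label` and the path `my_lst.rec` are hypothetical. As far as I understand im2rec.py, it stores only a single scalar label per image unless `--pack-label` is given, so a scalar here would be consistent with the error above.

```python
def describe_label(label, header_width=2, object_width=5):
    """Summarise a label read back from a record header."""
    try:
        values = [float(v) for v in label]
    except TypeError:
        # A plain float means only one number was stored for the image.
        return "single scalar label {}: detection labels were not packed".format(float(label))
    n = len(values) - header_width
    if n < object_width or n % object_width != 0:
        return "packed label of length {} does not match the expected layout".format(len(values))
    return "packed label with {} object slots".format(n // object_width)


def first_label(rec_path):
    """Return the label stored in the first record of a .rec file."""
    import mxnet as mx  # imported lazily so describe_label works without MXNet
    reader = mx.recordio.MXRecordIO(rec_path, 'r')
    header, _ = mx.recordio.unpack(reader.read())
    reader.close()
    return header.label


# Usage (assumes the hypothetical file my_lst.rec exists locally):
# print(describe_label(first_label('my_lst.rec')))
```

For the rows shown in the question, a correctly packed label would hold 17 values: the two header columns followed by three five-value object slots.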

Can anyone help me identify what I am missing so that SageMaker trains correctly from the RecordIO file?

Thanks for your help!

Also, if I instead use

$ /incubator-mxnet/tools/im2rec.py my_lst.lst my_image_folder --pass-through --pack-label

I get the error:

Expected number of batches: 14, did not match the number of batches processed: 5. This may happen when some images or annotations are invalid and cannot be parsed. Please check the dataset and ensure it follows the format in the documentation.
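The numbers in that message line up with the hyperparameters in the question. A small arithmetic sketch, assuming the expected batch count is the number of training samples divided by the mini-batch size and rounded down (an assumption about how the algorithm counts, not something the message states):

```python
num_training_samples = 227   # from set_hyperparameters above
mini_batch_size = 16         # from set_hyperparameters above

# Assumed: only full batches are counted.
expected_batches = num_training_samples // mini_batch_size
print(expected_batches)      # 14

processed_batches = 5        # from the error message
print(processed_batches * mini_batch_size)  # 80
```

Under that assumption, only about 80 of the 227 records were parsed successfully, which points at the labels or images in the remaining records rather than at the hyperparameters.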

1 answer:

Answer 0 (score: 0)

This may be late, but did you label the classes in your .lst file starting from 0?

From the link you posted:

The classes should be labeled with successive numbers and start with 0.
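A quick way to test this against a list file is to collect every class id and compare the set with the range 0 to num_classes - 1. This is only a sketch; `check_class_ids` is a hypothetical helper, and the ids below are copied from the first two rows in the question.

```python
def check_class_ids(class_ids, num_classes):
    """Return (out_of_range, unused) class ids for a 0-based labeling scheme."""
    present = sorted(set(int(c) for c in class_ids))
    # Ids outside 0 .. num_classes - 1 break the "start with 0" rule.
    out_of_range = [c for c in present if c < 0 or c >= num_classes]
    # Ids never seen are not an error by themselves, but gaps can hint at 1-based labels.
    unused = [c for c in range(num_classes) if c not in present]
    return out_of_range, unused


# Class ids from the first two rows of the question, with num_classes = 16.
out_of_range, unused = check_class_ids([9, 3, 0, 11, 6, 1], 16)
print(out_of_range)  # []
print(unused)
```

If the labels were 1-based, the id 16 would show up in `out_of_range` and 0 would appear in `unused` once padding rows are excluded.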