Question

I am planning to use a pre-trained model such as faster_rcnn_resnet101_pets for object detection in a TensorFlow environment, as described here.

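I have collected several images for training and testing. These images all have different sizes. Do I need to resize them to a common size?

faster_rcnn_resnet101_pets uses a ResNet with an input size of 224x224x3.

Does this mean I have to resize all my images before sending them for training, or does TF take care of it automatically?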

python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_resnet101_pets.config

In general, is it good practice to have all images be the same size?

Answer 1

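No, you do not need to resize your input images to a fixed shape yourself. The preprocessing step of the TensorFlow Object Detection API resizes all input images. The function defined in the preprocessing step takes an image_resizer_fn argument, which corresponds to the field named image_resizer in the config file.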

According to the proto file, you can choose among 4 different image resizers, namely:

keep_aspect_ratio_resizer
fixed_shape_resizer
identity_resizer
conditional_shape_resizer

For reference, this is the beginning of the function in the preprocessing step that receives image_resizer_fn:

    def transform_input_data(tensor_dict,
                             model_preprocess_fn,
                             image_resizer_fn,
                             num_classes,
                             data_augmentation_fn=None,
                             merge_multiple_boxes=False,
                             retain_original_image=False,
                             use_multiclass_scores=False,
                             use_bfloat16=False):
      """A single function that is responsible for all input data transformations.

      Data transformation functions are applied in the following order.
      1. If key fields.InputDataFields.image_additional_channels is present in
         tensor_dict, the additional channels will be merged into
         fields.InputDataFields.image.
      2. data_augmentation_fn (optional): applied on tensor_dict.
      3. model_preprocess_fn: applied only on image tensor in tensor_dict.
      4. image_resizer_fn: applied on original image and instance mask tensor in
         tensor_dict.
      5. one_hot_encoding: applied to classes tensor in tensor_dict.
      6. merge_multiple_boxes (optional): when groundtruth boxes are exactly the
         same they can be merged into a single box with an associated k-hot class
         label.

Here is a sample config file for the model faster_rcnn_resnet101_pets, in which all images are resized with min_dimension = 600 and max_dimension = 1024.
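
To make the effect of min_dimension = 600 and max_dimension = 1024 concrete, here is a minimal pure-Python sketch of how an aspect-preserving resizer of this kind could choose the output size. The function name keep_aspect_ratio_size is hypothetical, and the logic only approximates the behaviour of keep_aspect_ratio_resizer rather than reproducing the library's code: the shorter side is scaled to min_dimension unless that would push the longer side past max_dimension, in which case the longer side is scaled to max_dimension instead.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return (new_height, new_width) for an aspect-preserving resize.

    Hypothetical helper for illustration only; it is not part of the
    TensorFlow Object Detection API.
    """
    # First attempt: scale so that the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    new_height, new_width = round(height * scale), round(width * scale)

    # If the longer side now exceeds max_dimension, scale so that the
    # longer side equals max_dimension instead.
    if max(new_height, new_width) > max_dimension:
        scale = max_dimension / max(height, width)
        new_height, new_width = round(height * scale), round(width * scale)

    return new_height, new_width


if __name__ == "__main__":
    # Sample (height, width) pairs chosen only for illustration.
    for size in [(480, 640), (500, 2000), (224, 224)]:
        print(size, "->", keep_aspect_ratio_size(*size))
```

Under these assumptions a 480x640 image becomes 600x800, while a very wide 500x2000 image is capped at 256x1024, so differently sized inputs still end up with different shapes after resizing.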

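In practice, the shape of the resized images has a large impact on the trade-off between detection speed and accuracy. Although there is no specific requirement on the size of the input images, it is best for the smallest dimension of every image to exceed a reasonable value so that the convolution operations work properly.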
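
As a rough illustration of why very small images are a problem, the sketch below estimates the spatial size of the feature map that a convolutional backbone would produce for a given image size. The total stride of 16 is an assumption (check first_stage_features_stride in your own config), and feature_map_size is a hypothetical helper, not an API function.

```python
import math


def feature_map_size(height, width, stride=16):
    """Approximate feature-map size for a backbone with the given total stride.

    Hypothetical helper; stride=16 is an assumed value, not read from a config.
    """
    return math.ceil(height / stride), math.ceil(width / stride)


if __name__ == "__main__":
    # A resized 600x800 image versus a tiny 32x48 image.
    print(feature_map_size(600, 800))
    print(feature_map_size(32, 48))
```

Under this assumption a tiny image leaves only a handful of feature-map cells from which to propose and classify boxes, which is one way to read the advice about keeping the smallest dimension reasonably large.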
