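TensorFlow Object Detection API with 1-channel images

Time: 2018-02-12 10:50:49

Tags: tensorflow object-detection depth

Is there a way to use a pre-trained model from TensorFlow's Object Detection API, one that was trained on RGB images, on single-channel grayscale (depth) images?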

1 answer:

Answer 0 (score: 1)

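I tried running object detection on grayscale (1-channel) images in TensorFlow with a pre-trained model (faster_rcnn_resnet101_coco_11_06_2017), using the approach below. It worked for me.

The model was trained on RGB images, so I only had to modify some code in object_detection_tutorial.ipynb from the TensorFlow repository.

First change: Note that the existing code in the ipynb is written for 3-channel images, so change the load_image_into_numpy_array function as shown below.

From: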

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

To:

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  channel_dict = {'L': 1, 'RGB': 3}  # 'L': grayscale, 'RGB': 3-channel images
  return np.array(image.getdata()).reshape(
      (im_height, im_width, channel_dict[image.mode])).astype(np.uint8)
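
To see why the channel count has to follow the image mode, here is a minimal NumPy-only sketch. It assumes that the flat pixel data holds one value per pixel for a grayscale image; the pixel values and the helper name `reshape_pixels` are made up for illustration and are not part of the notebook.

```python
import numpy as np

def reshape_pixels(flat_pixels, im_width, im_height, channels):
    """Reshape a flat pixel sequence into (height, width, channels) uint8."""
    return np.array(flat_pixels).reshape(
        (im_height, im_width, channels)).astype(np.uint8)

# Hypothetical 3x2 (width x height) grayscale image: one value per pixel.
gray_pixels = [0, 50, 100, 150, 200, 250]
gray = reshape_pixels(gray_pixels, im_width=3, im_height=2, channels=1)
print(gray.shape)  # (2, 3, 1)

# Asking for 3 channels with only 6 values fails, which is what a
# hard-coded 3-channel reshape does on a grayscale image.
try:
    reshape_pixels(gray_pixels, im_width=3, im_height=2, channels=3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```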

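Second change: A grayscale image has data in only 1 channel. To perform object detection we need 3 channels (the inference code is written for 3 channels).

This can be done in two ways: a) duplicate the single-channel data into the other two channels, or b) fill the other two channels with zeros. Both work; I used the first approach.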
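
The two options can be compared side by side on a tiny array. This is only a sketch with made-up pixel values; it uses `np.repeat` for the duplication, which is one of several equivalent ways to copy a channel.

```python
import numpy as np

# Hypothetical 2x2 single-channel image, shape (height, width, 1).
gray = np.array([[[10], [20]],
                 [[30], [40]]], dtype=np.uint8)

# (a) Copy the single channel into all three channels.
duplicated = np.repeat(gray, 3, axis=-1)

# (b) Keep the data in the first channel and fill the other two with zeros.
zeros = np.zeros(gray.shape[:-1] + (2,), dtype=gray.dtype)
zero_padded = np.concatenate((gray, zeros), axis=-1)

print(duplicated[0, 0].tolist())   # [10, 10, 10]
print(zero_padded[0, 0].tolist())  # [10, 0, 0]
```

With (a) every pixel has equal values in all three channels, so it still displays as neutral gray. With (b) only the first channel is non-zero, so when the array is shown as RGB the image takes on a red cast.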

In the ipynb, go to the section that reads an image and converts it to a numpy array (the for loop at the end of the ipynb).

Change the code from:

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)

To this:

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  if image_np.shape[2] != 3:
    # Duplicate the single channel into all three channels.
    image_np = np.broadcast_to(image_np, (image_np.shape[0], image_np.shape[1], 3)).copy()
    # Alternative: fill the other two channels with zeros.
    # This adds a red tint to the background, so it is not recommended.
    # z = np.zeros(image_np.shape[:-1] + (2,), dtype=image_np.dtype)
    # image_np = np.concatenate((image_np, z), axis=-1)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
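
The channel handling in the loop could also be pulled into a small helper. The following is a sketch, not code from the notebook: the name `to_model_input` is hypothetical, and it additionally accepts plain (height, width) arrays and rejects unexpected channel counts, which the loop above does not do.

```python
import numpy as np

def to_model_input(image_np):
    """Return a [1, height, width, 3] batch from a 1- or 3-channel image."""
    if image_np.ndim == 2:
        # Treat a plain (height, width) array as having one channel.
        image_np = image_np[:, :, np.newaxis]
    if image_np.shape[2] == 1:
        # Duplicate the single channel into three channels.
        image_np = np.repeat(image_np, 3, axis=2)
    elif image_np.shape[2] != 3:
        raise ValueError("expected 1 or 3 channels, got %d" % image_np.shape[2])
    # Add the leading batch dimension the model expects.
    return np.expand_dims(image_np, axis=0)

depth = np.arange(12, dtype=np.uint8).reshape(3, 4, 1)  # made-up data
batch = to_model_input(depth)
print(batch.shape)  # (1, 3, 4, 3)
```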

That's it. Run the file and you should see the results. These are my results:

[Image: Detection on Grayscale Image] [Image: Detection on RGB Image]