Question

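I already have a network that takes one RGB image as input and predicts its class. Now I want to use the ILSVRC2012 dataset as input. How can I load multiple images as the network's input using Python and Caffe?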

At the moment I feed a single input image with this code:

# Load the image in the data layer
im = caffe.io.load_image(IMAGE_FILE)

net.blobs['data'].data[...] = transformer.preprocess('data', im)  # perform the preprocessing we've set up

# Compute forward
out = net.forward()

My model is defined as:

name: "CaffeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
}

My last layer is:

layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

Answer 1

There are many places to get an input layer for ILSVRC2012. You very likely already have one at $CAFFE_ROOT/models/bvlc_alexnet/train_val.prototxt: the data layer at the top of that file should be what you want.

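The basic "trick" is to recognize that the first dimension in the shape attribute is the batch size, i.e. the number of images processed in each iteration. For example,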

shape {
  dim: 256
  dim: 3
  dim: 227
  dim: 227
}

describes the same input, but the network accepts and processes 256 images at a time.
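On the Python side, the same idea can be sketched roughly as follows. This is only a minimal sketch, assuming `net` and `transformer` are set up as in the question and that the output blob is named `prob`. The helper names `iter_batches` and `classify_batch` are made up for illustration and are not part of Caffe.

```python
import numpy as np


def iter_batches(paths, batch_size):
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def classify_batch(net, transformer, images):
    """Run one forward pass over a list of loaded images (hypothetical helper).

    Assumes each image is an H x W x 3 array as returned by
    caffe.io.load_image, and that the network ends in a blob named 'prob'.
    """
    n = len(images)
    # Keep channels/height/width as defined in the prototxt; change only
    # the first dimension (the batch size) to match this batch.
    _, c, h, w = net.blobs['data'].data.shape
    net.blobs['data'].reshape(n, c, h, w)
    net.reshape()  # propagate the new batch size to the other layers

    # Preprocess each image into its own slot of the input blob.
    for i, im in enumerate(images):
        net.blobs['data'].data[i, ...] = transformer.preprocess('data', im)

    out = net.forward()
    # One row of class probabilities per image; take the top class for each.
    return np.asarray(out['prob']).argmax(axis=1)
```

With pycaffe available, each batch of file paths from `iter_batches` could be loaded with `[caffe.io.load_image(p) for p in batch]` and passed to `classify_batch`. Reshaping from Python avoids editing the prototxt, at the cost of re-allocating the blobs whenever the batch size changes.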
