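I am trying to implement pixel-wise binary classification of images using Caffe. For each image of size 3x256x256, I have a 256x256 label array in which every entry is either 0 or 1. When I read my HDF5 file with the code below,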
import os
import h5py
import numpy as np

dirname = "examples/hdf5_classification/data"
f = h5py.File(os.path.join(dirname, 'train.h5'), "r")
ks = list(f.keys())  # list() is needed on Python 3, where keys() returns a view
data = np.array(f[ks[0]])
label = np.array(f[ks[1]])
print("Data dimension from HDF5", np.shape(data))
print("Label dimension from HDF5", np.shape(label))
I get the following data and label dimensions:
Data dimension from HDF5 (402, 3, 256, 256)
Label dimension from HDF5 (402, 256, 256)
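I am trying to feed this data to the provided hdf5_classification network. During training I get the following output (using the default solver, but in GPU mode).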
!cd /home/unni/MTPMain/caffe-master/ && ./build/tools/caffe train -solver examples/hdf5_classification/solver.prototxt
which gives
I1119 01:29:02.222512 11910 caffe.cpp:184] Using GPUs 0
I1119 01:29:02.509752 11910 solver.cpp:47] Initializing solver from parameters:
train_net: "examples/hdf5_classification/train_val.prototxt"
test_net: "examples/hdf5_classification/train_val.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: GPU
device_id: 0
I1119 01:29:02.519805 11910 solver.cpp:80] Creating training net from train_net file: examples/hdf5_classification/train_val.prototxt
I1119 01:29:02.520031 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I1119 01:29:02.520053 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I1119 01:29:02.520104 11910 net.cpp:49] Initializing net from parameters:
name: "LogisticRegressionNet"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
I1119 01:29:02.520256 11910 layer_factory.hpp:76] Creating layer data
I1119 01:29:02.520277 11910 net.cpp:106] Creating Layer data
I1119 01:29:02.520290 11910 net.cpp:411] data -> data
I1119 01:29:02.520331 11910 net.cpp:411] data -> label
I1119 01:29:02.520352 11910 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I1119 01:29:02.529341 11910 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I1119 01:29:02.542645 11910 hdf5.cpp:32] Datatype class: H5T_FLOAT
I1119 01:29:10.601307 11910 net.cpp:150] Setting up data
I1119 01:29:10.612926 11910 net.cpp:157] Top shape: 10 3 256 256 (1966080)
I1119 01:29:10.612963 11910 net.cpp:157] Top shape: 10 256 256 (655360)
I1119 01:29:10.612969 11910 net.cpp:165] Memory required for data: 10485760
I1119 01:29:10.612983 11910 layer_factory.hpp:76] Creating layer fc1
I1119 01:29:10.624948 11910 net.cpp:106] Creating Layer fc1
I1119 01:29:10.625015 11910 net.cpp:454] fc1 <- data
I1119 01:29:10.625039 11910 net.cpp:411] fc1 -> fc1
I1119 01:29:10.645814 11910 net.cpp:150] Setting up fc1
I1119 01:29:10.645864 11910 net.cpp:157] Top shape: 10 2 (20)
I1119 01:29:10.645875 11910 net.cpp:165] Memory required for data: 10485840
I1119 01:29:10.645912 11910 layer_factory.hpp:76] Creating layer loss
I1119 01:29:10.657094 11910 net.cpp:106] Creating Layer loss
I1119 01:29:10.657133 11910 net.cpp:454] loss <- fc1
I1119 01:29:10.657147 11910 net.cpp:454] loss <- label
I1119 01:29:10.657163 11910 net.cpp:411] loss -> loss
I1119 01:29:10.657189 11910 layer_factory.hpp:76] Creating layer loss
F1119 01:29:14.883095 11910 softmax_loss_layer.cpp:42] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
@ 0x7f0652e1adaa (unknown)
@ 0x7f0652e1ace4 (unknown)
@ 0x7f0652e1a6e6 (unknown)
@ 0x7f0652e1d687 (unknown)
@ 0x7f0653494219 caffe::SoftmaxWithLossLayer<>::Reshape()
@ 0x7f065353f50f caffe::Net<>::Init()
@ 0x7f0653541f05 caffe::Net<>::Net()
@ 0x7f06535776cf caffe::Solver<>::InitTrainNet()
@ 0x7f0653577beb caffe::Solver<>::Init()
@ 0x7f0653578007 caffe::Solver<>::Solver()
@ 0x7f06535278b3 caffe::Creator_SGDSolver<>()
@ 0x410831 caffe::SolverRegistry<>::CreateSolver()
@ 0x40a16b train()
@ 0x406908 main
@ 0x7f065232cec5 (unknown)
@ 0x406e28 (unknown)
@ (nil) (unknown)
Aborted
Basically, the error is
softmax_loss_layer.cpp:42] Check failed:
outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360)
Number of labels must match number of predictions;
e.g., if softmax axis == 1 and prediction shape is (N, C, H, W),
label count (number of labels) must be N*H*W,
with integer values in {0, 1, ..., C-1}.
I cannot understand why the expected number of labels equals my batch size. How exactly should I fix this? Is it a problem with how I label my data?
Answer 0 (score: 2)
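Your problem is that the "SoftmaxWithLoss" layer tries to compare a prediction vector of 2 elements per input image with a label of size 256-by-256 per image. This makes no sense.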
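Root cause of the error: I guess what you are trying to do is apply a binary classifier to every pixel in the image. To that end, you defined "fc1" as an "InnerProduct" layer with num_output=2. However, Caffe interprets this as a single binary classifier applied to the entire image. Thus, Caffe produces a single binary prediction for the whole image.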
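The numbers in the failed check can be reproduced by hand. The sketch below mirrors, in plain Python, how the check counts predictions: the product of all dimensions except the softmax axis, assumed here to be axis 1. The helper names are made up for illustration and are not part of Caffe's API.

```python
from functools import reduce
from operator import mul


def count(shape):
    """Number of elements in a blob of the given shape."""
    return reduce(mul, shape, 1)


def num_predictions(pred_shape, axis=1):
    """outer_num * inner_num: every dimension except the softmax axis."""
    outer_num = count(pred_shape[:axis])
    inner_num = count(pred_shape[axis + 1:])
    return outer_num * inner_num


fc1_shape = (10, 2)            # "Top shape: 10 2 (20)" from the log
label_shape = (10, 256, 256)   # "Top shape: 10 256 256 (655360)" from the log

print(num_predictions(fc1_shape), "vs.", count(label_shape))  # 10 vs. 655360
```

The "fc1" output has no spatial dimensions left, so only the batch dimension contributes predictions, which is why the expected label count equals the batch size.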
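How to solve: when making pixel-wise predictions you no longer need an "InnerProduct" layer; what you have instead is a "fully convolutional net". Replace "fc1" with a convolution layer (for example, one whose kernel examines the 5-by-5 neighborhood of each pixel and decides based on that patch):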
layer {
  name: "bin_class"
  type: "Convolution"
  bottom: "data"
  top: "bin_class"
  convolution_param {
    num_output: 2   # binary class output
    kernel_size: 5  # 5-by-5 patch for prediction
    pad: 2          # make sure spatial output size equals size of label
  }
}
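As a quick check that this layer yields one prediction per pixel, the sketch below applies the standard convolution output-size formula, assuming the default stride of 1 (the layer above does not set it). The function name is illustrative only.

```python
def conv_output_size(input_size, kernel_size, pad=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


batch_size, height, width = 10, 256, 256

out_h = conv_output_size(height, kernel_size=5, pad=2)
out_w = conv_output_size(width, kernel_size=5, pad=2)

# Prediction blob is (N, C, H, W) with C = num_output = 2.
pred_shape = (batch_size, 2, out_h, out_w)

# With softmax over axis 1, the loss expects N*H*W labels.
expected_labels = pred_shape[0] * pred_shape[2] * pred_shape[3]
label_count = batch_size * height * width

print(pred_shape)                      # (10, 2, 256, 256)
print(expected_labels == label_count)  # True
```

Without the padding of 2, the same formula gives a 252x252 output, and the label count would again disagree with the number of predictions.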
Now applying "SoftmaxWithLoss" to bottom: "bin_class" and bottom: "label" should work.
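To make concrete what the loss then computes, here is a minimal pure-Python sketch of a per-pixel softmax cross-entropy: a softmax over the channel axis at every pixel, averaged over all N*H*W positions. It illustrates the idea under that averaging assumption and is not Caffe's implementation; the function name is hypothetical.

```python
import math


def pixelwise_softmax_loss(scores, labels):
    """Mean negative log-likelihood over all pixels.

    scores: nested lists shaped [N][C][H][W]
    labels: nested lists shaped [N][H][W], integer values in 0..C-1
    """
    total, n_pixels = 0.0, 0
    for n, image_labels in enumerate(labels):
        for y, row in enumerate(image_labels):
            for x, target in enumerate(row):
                # Class scores for this single pixel.
                logits = [channel[y][x] for channel in scores[n]]
                m = max(logits)  # subtract the max for numerical stability
                log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
                total += log_sum - logits[target]
                n_pixels += 1
    return total / n_pixels


# One 2x2 image, two classes, all scores zero: every pixel is a coin flip.
scores = [[[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]]
labels = [[[0, 1], [1, 0]]]
loss = pixelwise_softmax_loss(scores, labels)
print(abs(loss - math.log(2)) < 1e-12)  # True
```

With untrained, all-zero scores the loss is ln 2 per pixel, which is a useful baseline to compare against when the first training iterations are logged.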