Question

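I am trying to create a single multi-class, multi-label network configuration in Caffe.

Take dog classification as an example: is the dog big or small? (class) What color is it? (class) Does it have a collar? (label)

Can this be done with Caffe? What is the proper way to do it?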

I am just trying to understand the practical approach. After creating two .text files (one for training and one for validation) that list all the labels of the images, for example:

/train/img/1.png 0 4 18
/train/img/2.png 1 7 17 33
/train/img/3.png 0 4 17

I run the py script:

import h5py, os
import caffe
import numpy as np

SIZE = 227 # fixed size to all images
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()
# If you do not have enough memory split data into
# multiple batches and generate multiple separate h5 files
X = np.zeros( (len(lines), 3, SIZE, SIZE), dtype='f4' )
y = np.zeros( (len(lines),1), dtype='f4' )
for i,l in enumerate(lines):
    sp = l.split() # split on any whitespace, dropping the trailing newline
    img = caffe.io.load_image( sp[0] )
    img = caffe.io.resize( img, (SIZE, SIZE, 3) ) # resize to fixed size
    # you may apply other input transformations here...
    # Note that the transformation should take img from size-by-size-by-3 and transpose it to 3-by-size-by-size
    # for example
    transposed_img = img.transpose((2,0,1))[::-1,:,:] # RGB->BGR
    X[i] = transposed_img
    y[i] = float(sp[1])
with h5py.File('train.h5','w') as H:
    H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
    H.create_dataset( 'y', data=y ) # note the name y given to the dataset!
with open('train_h5_list.txt','w') as L:
    L.write( 'train.h5' ) # list all h5 files you are going to use

and create train.h5 and val.h5 (does the X dataset contain the images and y the labels?).

Then I replace my network's input layers:

layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "/home/gal/digits/digits/jobs/20181010-191058-21ab/train_db"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    crop_size: 227
    mean_file: "/home/gal/digits/digits/jobs/20181010-191058-21ab/mean.binaryproto"
    mirror: true
  }
  include: { phase: TRAIN }
}
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "/home/gal/digits/digits/jobs/20181010-191058-21ab/val_db"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    crop_size: 227
    mean_file: "/home/gal/digits/digits/jobs/20181010-191058-21ab/mean.binaryproto"
    mirror: true
  }
  include: { phase: TEST }
}

with an HDF5 data layer.

I guess HDF5 does not need mean.binaryproto?

Next, how should the output layer change so that it outputs multiple label probabilities? I guess I need a cross-entropy layer instead of softmax?

Answer 1

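Mean subtraction

While the LMDB input data layer can handle various input transformations for you, the "HDF5Data" layer does not support this functionality.
Therefore, you must take care of all input transformations (in particular mean subtraction) yourself when you create the hdf5 file(s).
See where your code says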

# you may apply other input transformations here...
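As a rough illustration of doing the mean subtraction yourself, here is a minimal NumPy sketch. It assumes a per-channel mean computed over the training batch; the helper name subtract_channel_mean is made up for this example, and a per-pixel mean image (as stored in a mean.binaryproto) would work along the same lines.

```python
import numpy as np

def subtract_channel_mean(X, mean=None):
    """Subtract a per-channel mean from a batch shaped (N, C, H, W).

    If mean is None it is computed from X itself (do this on the
    training set only, then reuse the returned mean for validation).
    """
    X = np.asarray(X, dtype='f4')
    if mean is None:
        # average over images and pixels, leaving one value per channel
        mean = X.mean(axis=(0, 2, 3))
    mean = np.asarray(mean, dtype='f4')
    return X - mean.reshape(1, -1, 1, 1), mean

# toy batch standing in for the X array of the script: 2 images, 3 channels, 4x4 pixels
X = np.arange(2 * 3 * 4 * 4, dtype='f4').reshape(2, 3, 4, 4)
X_centered, train_mean = subtract_channel_mean(X)
```

The centered array is what would then be written to the hdf5 file in place of the raw X.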

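Multiple labels

Although your .txt lists several labels for each image, you only save the first one to the hdf5 file. If you want to use these labels, you have to feed them to the net.
One issue that immediately arises from your example is that you do not have a fixed number of labels for each training image. Why? What does that mean?
Assuming you have three labels for each image (in the .txt files):

<filename> <dog size> <dog color> <has collar>

Then you can have y_size, y_color and y_collar (instead of a single y) in your hdf5.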

y_size[i] = float(sp[1])
y_color[i] = float(sp[2])
y_collar[i] = float(sp[3])
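A small sketch of how the label columns might be parsed, assuming exactly three labels per line as described above. The helper parse_label_line is hypothetical; it rejects lines such as the four-label one in the question instead of silently dropping labels.

```python
def parse_label_line(line, num_labels=3):
    """Split '<filename> <size> <color> <collar>' into (filename, labels)."""
    parts = line.split()
    filename, labels = parts[0], parts[1:]
    if len(labels) != num_labels:
        raise ValueError('%s: expected %d labels, got %d'
                         % (filename, num_labels, len(labels)))
    return filename, [float(v) for v in labels]

# two lines taken from the question's example listing
lines = ['/train/img/1.png 0 4 18', '/train/img/3.png 0 4 17']
y_size, y_color, y_collar = [], [], []
for l in lines:
    _, (size, color, collar) = parse_label_line(l)
    # one row per image, one column per dataset, i.e. shape (N, 1)
    y_size.append([size])
    y_color.append([color])
    y_collar.append([collar])
```

Each of the three lists can then be saved with create_dataset under its own name ('y_size', 'y_color', 'y_collar'), the same way the script saves 'y'.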

Your input data layer will have more "top"s accordingly:

layer {
  type: "HDF5Data"
  top: "X" # same name as given in create_dataset!
  top: "y_size"
  top: "y_color"
  top: "y_collar"
  hdf5_data_param {
    source: "train_h5_list.txt" # do not give the h5 files directly, but the list.
    batch_size: 32
  }
  include { phase:TRAIN }
}
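Since the layer's source points at a list file rather than at the h5 files themselves, splitting the data over several h5 files means listing each of them, one path per line. A minimal sketch, where write_h5_list and the names train_0.h5 and train_1.h5 are placeholders invented for this example:

```python
import os
import tempfile

def write_h5_list(h5_paths, list_path):
    """Write one h5 file path per line, the format the HDF5Data source expects."""
    with open(list_path, 'w') as f:
        for p in h5_paths:
            f.write(p + '\n')

# written to a temporary directory here so the example leaves no files behind
tmp_dir = tempfile.mkdtemp()
list_path = os.path.join(tmp_dir, 'train_h5_list.txt')
write_h5_list(['train_0.h5', 'train_1.h5'], list_path)

with open(list_path) as f:
    listed = f.read().splitlines()
```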

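Prediction

Currently your net only predicts a single label (the layer with top: "prob"). You need your net to predict all three labels, so you need to add layers that compute top: "prob_size", top: "prob_color" and top: "prob_collar" (a different layer for each "prob_*").
Once you have a prediction for each label, you need a loss (again, one loss per label).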
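One way to picture the per-label heads is to generate the prototxt for them from a template. This is only a sketch under assumptions: the shared bottom blob "fc7", the layer names, and the class counts (2 sizes, 8 colors, 2 collar states) are invented for illustration and have to be adapted to the actual net. Each head gets its own InnerProduct layer, its own SoftmaxWithLoss, and a Softmax layer producing the "prob_*" top.

```python
HEAD_TEMPLATE = '''layer {{
  name: "fc_{label}"
  type: "InnerProduct"
  bottom: "{bottom}"
  top: "fc_{label}"
  inner_product_param {{ num_output: {num_classes} }}
}}
layer {{
  name: "loss_{label}"
  type: "SoftmaxWithLoss"
  bottom: "fc_{label}"
  bottom: "y_{label}"
  top: "loss_{label}"
}}
layer {{
  name: "prob_{label}"
  type: "Softmax"
  bottom: "fc_{label}"
  top: "prob_{label}"
}}
'''

def head_prototxt(label, num_classes, bottom='fc7'):
    """Return prototxt text for one prediction head plus its loss."""
    return HEAD_TEMPLATE.format(label=label, num_classes=num_classes, bottom=bottom)

# hypothetical class counts; num_output must exceed the largest label value
heads = [('size', 2), ('color', 8), ('collar', 2)]
prototxt = ''.join(head_prototxt(name, n) for name, n in heads)
print(prototxt)
```

For a yes/no label such as the collar, a single output with a SigmoidCrossEntropyLoss layer is a possible alternative to a two-way softmax; that is a design choice, not something the answer prescribes.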

Multi-class, multi-label image classification with Caffe
