Question

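I am new to Caffe. I have precomputed features for two image datasets (cars and flowers).

Each image sample has a 256-D feature vector.
Training set: 500 car images and 1200 flower images
Test set: 100 car images and 200 flower images

So this is a binary classification problem. My Caffe train.prototxt file is as follows: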

layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train.txt"
    batch_size: 40
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test.txt"
    batch_size: 10
  }
  include {
    phase: TEST
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  inner_product_param {
    num_output: 256
    weight_filler {
      type: "gaussian"
      std: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "sigmoid1"
  type: "Sigmoid"
  bottom: "fc1"
  top: "sigmoid1"
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "sigmoid1"
  top: "fc2"
  inner_product_param {
    num_output: 256
    weight_filler {
      type: "gaussian"
      std: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "sigmoid2"
  type: "Sigmoid"
  bottom: "fc2"
  top: "sigmoid2"
}
layer {
  name: "fc3"
  type: "InnerProduct"
  bottom: "sigmoid2"
  top: "fc3"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc3"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc3"
  bottom: "label"
  top: "loss"
}

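I read the data with an HDF5 layer and pass it through three fully connected layers (256-256-2) with sigmoid activations. (I also tried switching to ReLU, but the results did not change.)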

My solver prototxt is:

net: "train.prototxt"
test_iter: 100
test_interval: 200
base_lr: 0.010
momentum: 0.9
weight_decay: 0.00005
lr_policy: "inv"
gamma: 0.00001
delta: 1e-8
#test_compute_loss: true
power: 0.75
display: 100
#stepsize: 1000
max_iter: 10000
snapshot: 10000
snapshot_prefix: "sample"
solver_mode: GPU
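
For reference, the arithmetic implied by these settings can be checked with a short Python sketch. It assumes Caffe's documented "inv" policy, base_lr * (1 + gamma * iter) ^ (-power), and takes the batch sizes and dataset sizes from the question. The function and constant names are made up for illustration and are not part of Caffe.

```python
# Values copied from the solver and train.prototxt in the question.
BASE_LR = 0.010
GAMMA = 0.00001
POWER = 0.75
MAX_ITER = 10000
TEST_ITER = 100
TRAIN_BATCH = 40
TEST_BATCH = 10
TRAIN_SAMPLES = 500 + 1200  # cars + flowers
TEST_SAMPLES = 100 + 200    # cars + flowers


def inv_lr(iteration):
    """Learning rate under the "inv" policy at a given iteration."""
    return BASE_LR * (1.0 + GAMMA * iteration) ** (-POWER)


def summarize():
    """Collect the quantities implied by the solver settings."""
    return {
        "lr_at_start": inv_lr(0),
        "lr_at_end": inv_lr(MAX_ITER),
        # How many times training sweeps over the 1700 training samples.
        "train_epochs": MAX_ITER * TRAIN_BATCH / TRAIN_SAMPLES,
        # How many times one test run sweeps over the 300 test samples.
        "test_passes_per_eval": TEST_ITER * TEST_BATCH / TEST_SAMPLES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.5f}")
```

Two things stand out under these assumptions: the learning rate barely decays over the whole run, and each test run reads 1000 samples from a 300-sample test set, so it loops over the test data more than three times.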

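The problem is that this architecture does not work, and I think it is because the network is not learning anything.

The plot shows the accuracy over the first 500 iterations, and it clearly indicates that nothing constructive is happening.

To check that the features themselves are not at fault, I trained a linear SVM on them with LibSVM, and it reached 84% accuracy.

Maybe my network is not set up correctly. If someone could help me get this working, that would be great. Thanks.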

--------------------

Update: with PReLU I get the following plot. I also reduced num_output from 256 to 128:

Answer 1

Make sure you shuffle the input data, use a smaller network, and possibly add some kind of regularization, such as dropout.
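
As a rough illustration of the shuffling step, here is a minimal Python sketch that permutes features and labels together before they are written to the HDF5 file. It assumes the samples are currently stored class by class (all cars, then all flowers), which the question does not confirm. `shuffle_in_unison` is a hypothetical helper, not a Caffe function, and the toy lists stand in for the real 256-D features.

```python
import random


def shuffle_in_unison(features, labels, seed=0):
    """Return features and labels reordered by the same random permutation."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    order = list(range(len(features)))
    # A seeded generator keeps the permutation reproducible between runs.
    random.Random(seed).shuffle(order)
    return [features[i] for i in order], [labels[i] for i in order]


if __name__ == "__main__":
    # Toy stand-in: 6 samples with 4-D features, stored class by class.
    features = [[float(i)] * 4 for i in range(6)]
    labels = [0, 0, 0, 1, 1, 1]
    shuffled_features, shuffled_labels = shuffle_in_unison(features, labels)
    for feature, label in zip(shuffled_features, shuffled_labels):
        print(label, feature)
```

Depending on your Caffe version, the HDF5Data layer may also accept shuffle: true inside hdf5_data_param, which would avoid rewriting the file.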

Answer 2

You are overfitting. Use a pretrained CNN (for example, one trained on CIFAR) and fine-tune it on your dataset.

Network does not learn when training fully connected layers in Caffe
