I can't get Caffe to work

Asked: 2017-06-11 15:40:18

Tags: machine-learning neural-network deep-learning caffe cross-entropy

After some struggling, I decided to try the simplest possible task: training a network to classify whether a number is non-negative. And I failed...

I generate the data with the following code, and I'm not sure it is correct. I read the data back from the files and it looked fine, though...

#pragma comment(lib, "hdf5")
#pragma comment(lib, "hdf5_cpp")

#include <cstdint>

#include <array>
#include <random>
#include <string>
#include <vector>

using namespace std;

#include <H5Cpp.h>

using namespace H5;

mt19937 rng;

// Map a raw 32-bit mt19937 draw into [i_min, i_max):
// rng() is uniform in [0, 2^32), so dividing by 0x100000000 scales it to [0, 1).
float randf(float i_min, float i_max)
{
    return rng() * ((i_max - i_min) / 0x100000000) + i_min;
}

#define NAME "pos_neg"

#define TRAIN_SET_SIZE 0x100000
#define TEST_SET_SIZE 0x10000

void make(const string &i_cat, uint32_t i_count)
{
    H5File file(NAME "." + i_cat + ".h5", H5F_ACC_TRUNC);

    hsize_t dataDim[2] = { i_count, 1 };
    hsize_t labelDim = i_count;

    FloatType dataType(PredType::NATIVE_FLOAT);
    DataSpace dataSpace(2, dataDim);

    DataSet dataSet = file.createDataSet("data", dataType, dataSpace);

    IntType labelType(PredType::NATIVE_INT);
    DataSpace labelSpace(1, &labelDim);

    DataSet labelSet = file.createDataSet("label", labelType, labelSpace);

    vector<float> data(i_count);
    vector<int> labels(i_count);

    // Interleave the classes: even slots get a non-negative sample (label 0),
    // odd slots get a negative sample (label 1).
    for (uint32_t i = 0; i < i_count / 2; ++i)
    {
        labels[i * 2] = 0;
        data[i * 2] = randf(0.f, 1.f);

        labels[i * 2 + 1] = 1;
        data[i * 2 + 1] = randf(-1.f, 0.f);
    }

    dataSet.write(&data[0], PredType::NATIVE_FLOAT);
    labelSet.write(&labels[0], PredType::NATIVE_INT);
}

int main()
{
    make("train", TRAIN_SET_SIZE);
    make("test", TEST_SET_SIZE);
}

The network looks like this:

name: "PosNegNet"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "pos_neg_train.txt"
    batch_size: 64
  }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "pos_neg_test.txt"
    batch_size: 65536
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc1"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
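One detail worth double-checking: the `source` of an `HDF5Data` layer is a plain text file listing one `.h5` path per line, not the HDF5 file itself. Assuming the generator above (which writes `pos_neg.train.h5` and `pos_neg.test.h5`), `pos_neg_train.txt` would be expected to contain just:

```
pos_neg.train.h5
```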

And I tried the following solver parameters:

net: "pos_neg.prototxt"
test_iter: 1
test_interval: 500
base_lr: 0.001
momentum: 0.9
momentum2: 0.999
lr_policy: "fixed"
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "pos_neg"
type: "Adam"
solver_mode: GPU

I ran caffe.exe on Windows. I always get loss = 0 and accuracy = 0.5.

I know I must be doing something wrong, but I don't know where to look other than digging into the source code...

I also find Caffe slow: with float[64] data and 1024 items per batch on a 1080 Ti, I only get about 16 iterations per second. Is that normal, or am I doing something wrong yet again?

1 Answer:

Answer 0 (score: 1)

Set num_output: 2 in "fc1": when using "SoftmaxWithLoss" and/or "Accuracy" layers, Caffe expects your prediction to be a vector of per-class probabilities. In your case you have two classes, so this vector should have length 2 (rather than the current 1). With only one output, the softmax trivially assigns that single class probability 1, so the loss is exactly -log(1) = 0 and the argmax always predicts class 0, leaving accuracy at chance level (0.5) — exactly the symptoms you are seeing.
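Concretely, the change to the question's "fc1" layer is just the output count (a sketch; the rest of the layer stays as posted):

```
inner_product_param {
  num_output: 2   # one raw score per class; SoftmaxWithLoss normalizes them
  weight_filler {
    type: "xavier"
  }
  bias_filler {
    type: "constant"
    value: 0
  }
}
```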

Alternatively, you can keep num_output: 1 and switch the loss to a "SigmoidCrossEntropyLoss" layer. However, you will then no longer be able to use the "Accuracy" layer...
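A sketch of that alternative, replacing only the loss layer from the question (the 0/1 labels can stay as they are):

```
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
```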