Accuracy does not change

Time: 2017-11-25 15:34:22

Tags: neural-network deep-learning classification caffe digits

I am trying to train a binary classification model in Caffe that tells whether an input image is a dog or background. I have 8223 positive and 33472 negative samples. My validation set contains 1200 samples, 600 per class. The positives are crops taken from the MS-COCO dataset. All images are resized so that the bigger dimension does not exceed 92 and the smaller dimension is not below 44. After creating the LMDB files with create_imagenet.sh (resize=false), I started training with the solver and train.prototxt below. The problem is that I get a constant accuracy (0.513333 or 0.486667), which suggests the network is not learning anything. I hope someone can help. Thanks in advance.

Solver file:

    iter_size: 32
    test_iter: 600
    test_interval: 20
    base_lr: 0.001
    display: 2
    max_iter: 20000
    lr_policy: "step"
    gamma: 0.99
    stepsize: 700
    momentum: 0.9
    weight_decay: 0.0001
    snapshot: 40
    snapshot_prefix: "/media/DATA/classifiers_data/dog_object/models/"
    solver_mode: GPU
    net: "/media/DATA/classifiers_data/dog_object/net.prototxt"
    solver_type: ADAM

train.prototxt:

    layer {
      name: "train-data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }

      data_param {
        source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_train_lmdb"
        batch_size: 1
        backend: LMDB
      }
    }
    layer {
      name: "val-data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      data_param {
        source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_val_lmdb"
        batch_size: 1
        backend: LMDB
      }
    }

    layer {
      name: "scale"
      type: "Power"
      bottom: "data"
      top: "scale"
      power_param {
        scale: 0.00390625

      }
    }

    layer {
      bottom: "scale"
      top: "conv1_1"
      name: "conv1_1"
      type: "Convolution"
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
      }
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_1"
      name: "relu1_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_2"
      name: "conv1_2"
      type: "Convolution"
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
      }
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
    }

    layer {
      bottom: "conv1_2"
      top: "conv1_2"
      name: "relu1_2"
      type: "ReLU"
    }
    layer {
      name: "spatial_pyramid_pooling"
      type: "SPP"
      bottom: "conv1_2"
      top: "spatial_pyramid_pooling"
      spp_param {
        pool: MAX
        pyramid_height : 4
      }
    }
    layer {
      bottom: "spatial_pyramid_pooling"
      top: "fc6"
      name: "fc6"
      type: "InnerProduct"
      inner_product_param {
        num_output: 64
      }
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "relu6"
      type: "ReLU"
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "drop6"
      type: "Dropout"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      bottom: "fc6"
      top: "fc7"
      name: "fc7"
      type: "InnerProduct"
      inner_product_param {
        num_output: 2
      }
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc7"
      bottom: "label"
      top: "loss"
    }
    layer {
      name: "accuracy/top1"
      type: "Accuracy"
      bottom: "fc7"
      bottom: "label"
      top: "accuracy"
      include: { phase: TEST }

    }

Part of the training log:

    I1125 15:52:36.604038 2326 solver.cpp:362] Iteration 40, Testing net (#0)
    I1125 15:52:36.604071 2326 net.cpp:723] Ignoring source layer train-data
    I1125 15:52:47.127979 2326 solver.cpp:429] Test net output #0: accuracy = 0.486667
    I1125 15:52:47.128067 2326 solver.cpp:429] Test net output #1: loss = 0.694894 (* 1 = 0.694894 loss)
    I1125 15:52:48.937928 2326 solver.cpp:242] Iteration 40 (0.141947 iter/s, 14.0897s/2 iters), loss = 0.67717
    I1125 15:52:48.938014 2326 solver.cpp:261] Train net output #0: loss = 0.655692 (* 1 = 0.655692 loss)
    I1125 15:52:48.938040 2326 sgd_solver.cpp:106] Iteration 40, lr = 0.001
    I1125 15:52:52.858757 2326 solver.cpp:242] Iteration 42 (0.510097 iter/s, 3.92083s/2 iters), loss = 0.673962
    I1125 15:52:52.858841 2326 solver.cpp:261] Train net output #0: loss = 0.653978 (* 1 = 0.653978 loss)
    I1125 15:52:52.858875 2326 sgd_solver.cpp:106] Iteration 42, lr = 0.001
    I1125 15:52:56.581573 2326 solver.cpp:242] Iteration 44 (0.53723 iter/s, 3.7228s/2 iters), loss = 0.673144
    I1125 15:52:56.581656 2326 solver.cpp:261] Train net output #0: loss = 0.652269 (* 1 = 0.652269 loss)
    I1125 15:52:56.581689 2326 sgd_solver.cpp:106] Iteration 44, lr = 0.001
    I1125 15:53:00.192082 2326 solver.cpp:242] Iteration 46 (0.553941 iter/s, 3.61049s/2 iters), loss = 0.669606
    I1125 15:53:00.192167 2326 solver.cpp:261] Train net output #0: loss = 0.650559 (* 1 = 0.650559 loss)
    I1125 15:53:00.192200 2326 sgd_solver.cpp:106] Iteration 46, lr = 0.001
    I1125 15:53:04.195417 2326 solver.cpp:242] Iteration 48 (0.499585 iter/s, 4.00332s/2 iters), loss = 0.674327
    I1125 15:53:04.195691 2326 solver.cpp:261] Train net output #0: loss = 0.648808 (* 1 = 0.648808 loss)
    I1125 15:53:04.195736 2326 sgd_solver.cpp:106] Iteration 48, lr = 0.001
    I1125 15:53:07.856842 2326 solver.cpp:242] Iteration 50 (0.546265 iter/s, 3.66123s/2 iters), loss = 0.661835
    I1125 15:53:07.856925 2326 solver.cpp:261] Train net output #0: loss = 0.647097 (* 1 = 0.647097 loss)
    I1125 15:53:07.856957 2326 sgd_solver.cpp:106] Iteration 50, lr = 0.001
    I1125 15:53:11.681635 2326 solver.cpp:242] Iteration 52 (0.522906 iter/s, 3.82478s/2 iters), loss = 0.66071
    I1125 15:53:11.681720 2326 solver.cpp:261] Train net output #0: loss = 0.743264 (* 1 = 0.743264 loss)
    I1125 15:53:11.681754 2326 sgd_solver.cpp:106] Iteration 52, lr = 0.001
    I1125 15:53:15.544859 2326 solver.cpp:242] Iteration 54 (0.517707 iter/s, 3.86319s/2 iters), loss = 0.656414
    I1125 15:53:15.544950 2326 solver.cpp:261] Train net output #0: loss = 0.643741 (* 1 = 0.643741 loss)
    I1125 15:53:15.544986 2326 sgd_solver.cpp:106] Iteration 54, lr = 0.001
    I1125 15:53:19.354320 2326 solver.cpp:242] Iteration 56 (0.525012 iter/s, 3.80943s/2 iters), loss = 0.645277
    I1125 15:53:19.354404 2326 solver.cpp:261] Train net output #0: loss = 0.747059 (* 1 = 0.747059 loss)
    I1125 15:53:19.354431 2326 sgd_solver.cpp:106] Iteration 56, lr = 0.001
    I1125 15:53:23.195466 2326 solver.cpp:242] Iteration 58 (0.520681 iter/s, 3.84112s/2 iters), loss = 0.677604
    I1125 15:53:23.195549 2326 solver.cpp:261] Train net output #0: loss = 0.640145 (* 1 = 0.640145 loss)
    I1125 15:53:23.195575 2326 sgd_solver.cpp:106] Iteration 58, lr = 0.001
    I1125 15:53:25.140920 2326 solver.cpp:362] Iteration 60, Testing net (#0)
    I1125 15:53:25.140965 2326 net.cpp:723] Ignoring source layer train-data
    I1125 15:53:35.672775 2326 solver.cpp:429] Test net output #0: accuracy = 0.513333
    I1125 15:53:35.672937 2326 solver.cpp:429] Test net output #1: loss = 0.69323 (* 1 = 0.69323 loss)
    I1125 15:53:37.635395 2326 solver.cpp:242] Iteration 60 (0.138503 iter/s, 14.4401s/2 iters), loss = 0.655983
    I1125 15:53:37.635478 2326 solver.cpp:261] Train net output #0: loss = 0.638368 (* 1 = 0.638368 loss)
    I1125 15:53:37.635512 2326 sgd_solver.cpp:106] Iteration 60, lr = 0.001
    I1125 15:53:41.458472 2326 solver.cpp:242] Iteration 62 (0.523143 iter/s, 3.82305s/2 iters), loss = 0.672996
    I1125 15:53:41.458555 2326 solver.cpp:261] Train net output #0: loss = 0.753101 (* 1 = 0.753101 loss)
    I1125 15:53:41.458588 2326 sgd_solver.cpp:106] Iteration 62, lr = 0.001
    I1125 15:53:45.299643 2326 solver.cpp:242] Iteration 64 (0.520679 iter/s, 3.84114s/2 iters), loss = 0.668675
    I1125 15:53:45.299737 2326 solver.cpp:261] Train net output #0: loss = 0.634894 (* 1 = 0.634894 loss)

1 answer:

Answer 0 (score: 0)

A few comments:

1. Your test set contains 1200 samples, but you only validate on 600 of them each time: test_iter * batch_size = 600. See this answer for more details (and the solver sketch after this list).
2. Did you shuffle the training data when creating the lmdb? See this answer for more details.
3. How do you initialize your weights? There seem to be no filler calls in your prototxt. If you do not define fillers explicitly, the weights are left at Caffe's default constant (zero) initialization, which is a very difficult starting point for SGD. See this answer for more details (and the filler sketch after this list).
4. Have you tried setting debug_info: true in your solver and inspecting the debug log to find the root cause of the problem? See this thread for more details.
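For points 1 and 4, a minimal sketch of the corresponding solver changes, assuming the TEST-phase batch_size of 1 and the 1200-image validation set described in the question (all other solver fields left as in the original solver):

    # cover the whole validation set: test_iter * batch_size = 1200 * 1 = 1200
    test_iter: 1200
    # log per-blob and per-parameter statistics so dead or exploding
    # activations/gradients show up in the training log
    debug_info: true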
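For point 3, a hedged example of how explicit fillers could be added to conv1_1 (the same applies to conv1_2, fc6 and fc7); "xavier" is just one common choice, not something taken from the question:

    layer {
      bottom: "scale"
      top: "conv1_1"
      name: "conv1_1"
      type: "Convolution"
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
        # explicit initialization; without a weight_filler Caffe falls back
        # to the constant (zero) filler
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
    }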
1.您的测试集包含1200个样本,但每次只验证600个样本:test_iter * batch_size = 600。有关详细信息,请参阅this answer 2.您在创建lmdb时是否对训练数据进行了随机播放?有关详细信息,请参阅this answer 3.你如何开始你的体重?您的原型文件中似乎没有filler的调用。如果您没有明确定义filler s,则权重为init。对于SGD来说,这是一个非常困难的起点。有关详细信息,请参阅this answer 4.您是否尝试在求解器中设置debug_info: true并查看调试日志以查找问题的根本原因?有关详细信息,请参阅this thread