CNN won't learn simple geometric patterns

Date: 2019-04-09 07:39:28

Tags: caffe

This is surely a very silly question, but since my knowledge of Caffe is limited and I have no more time to search for an answer, I have to post it here for help. I programmatically generated a training dataset of images of simple geometric shapes (triangles, squares, diamonds, etc.) and built a CNN with two convolutional layers and one pooling layer (plus a final fully connected layer) to learn to classify these shapes. But the network just doesn't learn: the loss never decreases. What could be the cause?
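For reference, a dataset like the one described could be produced with a minimal sketch such as the following (Pillow-based; the shape sizes, colors, and function name are assumptions, since the question does not show the actual generator):

```python
import random
from PIL import Image, ImageDraw

def make_shape_image(shape, size=200):
    """Draw one white shape on a black background, matching the
    200x200 RGB input shape declared in the prototxt below."""
    img = Image.new("RGB", (size, size), "black")
    draw = ImageDraw.Draw(img)
    cx, cy = size // 2, size // 2
    r = random.randint(30, 80)  # random half-extent of the shape
    if shape == "square":
        draw.rectangle([cx - r, cy - r, cx + r, cy + r], fill="white")
    elif shape == "triangle":
        draw.polygon([(cx, cy - r), (cx - r, cy + r), (cx + r, cy + r)],
                     fill="white")
    elif shape == "diamond":
        draw.polygon([(cx, cy - r), (cx + r, cy), (cx, cy + r), (cx - r, cy)],
                     fill="white")
    else:
        raise ValueError("unknown shape: %s" % shape)
    return img
```

Each generated image would then be written into the LMDB referenced by the `data_param` sections below.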

In Caffe, the network configuration file "very_simple_one.prototxt" looks like this:

name: "very_simple_one"
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/train_valid_lmdb"
    batch_size: 1000
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/test_lmdb"
    batch_size: 100
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 5
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 5
    stride: 5
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 3
    kernel_size: 8
    stride: 8
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "fc3"
  type: "InnerProduct"
  bottom: "conv2"
  top: "fc3"
  inner_product_param {
    num_output: 3
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc3"
  bottom: "label"
}

The "solver.prototxt" looks like this:

net: "very_simple_one.prototxt"
type: "SGD"
test_iter: 15
test_interval: 100
base_lr: 0.05
lr_policy: "step"
gamma: 0.9999
stepsize: 100
display: 20
max_iter: 50000
snapshot: 2000
momentum: 0.9
weight_decay: 0.00000000000
solver_mode: GPU

I also tried AdaGrad by commenting out "momentum" and changing "type" to "AdaGrad". The network is trained with the following command:

....../caffe/build/tools/caffe train -solver solver.prototxt
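The AdaGrad attempt would amount to a solver roughly like the following (a reconstruction from the description above, not a file shown in the question; Caffe's AdaGrad solver rejects a nonzero momentum, hence the commented-out line):

```
net: "very_simple_one.prototxt"
type: "AdaGrad"
test_iter: 15
test_interval: 100
base_lr: 0.05
lr_policy: "step"
gamma: 0.9999
stepsize: 100
display: 20
max_iter: 50000
snapshot: 2000
# momentum: 0.9   # commented out: momentum cannot be used with AdaGrad
weight_decay: 0.00000000000
solver_mode: GPU
```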

All of these failed to train; that is, the loss does not decrease. It hovers within a small interval but never actually goes down.

I just want to know whether the dataset is fundamentally untrainable, or whether there is a problem with the configuration files above.

Following Ibrahim Yousuf's suggestion, I also modified the network, replacing the pooling layer with a convolutional layer, as follows:

name: "very_simple_one"
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/train_valid_lmdb"
    batch_size: 1000
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/test_lmdb"
    batch_size: 100
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 50
    kernel_size: 5
    ##stride: 5
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv1.5"
  type: "Convolution"
  bottom: "conv1"
  top: "conv1.5"
  convolution_param {
    num_output: 10
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "relu1.5"
  type: "ReLU"
  bottom: "conv1.5"
  top: "conv1.5"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1.5"
  top: "conv2"
  convolution_param {
    num_output: 3
    kernel_size: 8
    stride: 4
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "fc3"
  type: "InnerProduct"
  bottom: "conv2"
  top: "fc3"
  inner_product_param {
    num_output: 3
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc3"
  bottom: "label"
}

But the loss still does not decrease. Does this confirm that the cause is my dataset? The dataset is really small, so if anyone is willing to help, I can upload it to a network drive for testing.

1 Answer:

Answer 0 (score: 0):

Solved. Classification labels should start from zero, not one: a three-class problem should use labels 0, 1, 2, not 1, 2, 3.