CIFAR-10: model implemented in Keras, but accuracy differs from the paper

Asked: 2018-09-20 20:20:53

Tags: python-3.x tensorflow keras

I'm not sure Stack Overflow is the right place to ask this, but here it goes:

I read the paper:
Let's Keep it Simple: Using simple architectures to outperform deeper and more complex architectures (2016)

According to the paper, that architecture reaches 94.75% accuracy on CIFAR-10, but so far my implementation tops out at 82% accuracy!

So my questions are:

  • About stacking convolutional layers: is it fine to stack conv -> BatchNorm -> ReLU -> (optional) max pooling, or should the order be different? (See the sketch right after this list.)

  • What am I doing wrong here?
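To make the first question concrete, here is a minimal sketch of the ordering I mean (a standalone toy helper for illustration, not my actual code, which is further below):

from keras import layers

def simple_block(x, filters=64, with_pool=True):
    # conv -> BatchNorm -> ReLU -> (optional) max pooling
    x = layers.Conv2D(filters, (3, 3), padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    if with_pool:  # the optional pooling step
        x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    return x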

Note that I've tried various dropout values as well as no dropout at all (in which case I get 100% accuracy on the training set but only about 80% on the test set, i.e. the half of the test set I validate on), fewer dense layers, and playing with the learning rate (increasing it, and, as you can see, decreasing it).

Any suggestions are welcome!

EDIT:

I've improved it somewhat with kernel weight initialization (on every conv layer):

initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal')
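Concretely, it gets passed into every Conv2D via kernel_initializer, roughly like this (a sketch; the full code below omits it for brevity):

init = initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal')
conv_layer = layers.Conv2D(filters=filters, kernel_size=kernel,
                           strides=strides, padding='same',
                           kernel_initializer=init)(conv_layer)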

But it still tops out at 86% accuracy on the test set, and introducing dropout lowers the test-set accuracy rather than improving it (from my understanding, dropout should make the model generalize better). Either way, it's still not the accuracy the paper's implementation claims to achieve.

Any help would be really appreciated! My code:

# Assumed imports (the original post does not show them)
import keras
from keras import layers, models, optimizers, losses, activations, initializers

def create_conv_block(X, filters=64, kernel=[3, 3], strides=[1, 1],
                      repetition=1, withMaxPooling=True,
                      pool_kernel=[2, 2], pool_strides=[2, 2],
                      relualpha=0, withDropOut=True, dropout_percent=0.5):
    # Stacks `repetition` units of conv -> BatchNorm -> LeakyReLU,
    # each optionally followed by max pooling and dropout.
    conv_layer = X
    while repetition > 0:
        conv_layer = layers.Conv2D(filters=filters,
                                   kernel_size=kernel,
                                   strides=strides, padding='same')(conv_layer)
        conv_layer = layers.BatchNormalization()(conv_layer)
        conv_layer = layers.LeakyReLU(alpha=relualpha)(conv_layer)
        if withMaxPooling:
            try:
                conv_layer = layers.MaxPooling2D(pool_size=pool_kernel,
                                                 strides=pool_strides)(conv_layer)
            except Exception:
                # Fall back to 'same' padding when the feature map is too small
                conv_layer = layers.MaxPooling2D(pool_size=pool_kernel,
                                                 strides=pool_strides,
                                                 padding='same')(conv_layer)
        if withDropOut:
            conv_layer = layers.Dropout(rate=dropout_percent)(conv_layer)
        repetition -= 1
    return conv_layer

def train(model_name):
    # Architecture based on https://arxiv.org/pdf/1608.06037.pdf
    global inputs, res
    batch_size = 100
    input_shape = (32, 32, 3)
    inputs = layers.Input(shape=input_shape)

    block1 = create_conv_block(inputs, withMaxPooling=False, withDropOut=True)
    block2 = create_conv_block(block1, filters=128, repetition=3, withDropOut=True)
    block3 = create_conv_block(block2, filters=128, repetition=2, withMaxPooling=False)
    block4 = create_conv_block(block3, filters=128, withDropOut=False)
    block5 = create_conv_block(block4, filters=128, repetition=2, withDropOut=True)
    block6 = create_conv_block(block5, filters=128, withMaxPooling=False, withDropOut=True)
    block7 = create_conv_block(block6, filters=128, withMaxPooling=False, kernel=[1, 1], withDropOut=True)
    block8 = create_conv_block(block7, filters=128, kernel=[1, 1], withDropOut=False)
    block9 = create_conv_block(block8, filters=128, withDropOut=True)
    block9 = create_conv_block(block9, filters=128, withDropOut=False)
    flatty = layers.Flatten()(block9)

    dense1 = layers.Dense(128, activation=activations.relu)(flatty)
    dense1 = layers.Dropout(0.5)(dense1)
    dense1 = layers.Dense(512, activation=activations.relu)(dense1)
    dense1 = layers.Dropout(0.2)(dense1)
    dense2 = layers.Dense(512, activation=activations.relu)(dense1)
    dense1 = layers.Dropout(0.5)(dense2)
    dense2 = layers.Dense(512, activation=activations.relu)(dense1)
    dense3 = layers.Dropout(0.2)(dense2)

    res = layers.Dense(10, activation='softmax')(dense3)
    model = models.Model(inputs=inputs, outputs=res)
    opt = optimizers.Adam(lr=0.001)
    model.compile(optimizer=opt, loss=losses.categorical_crossentropy, metrics=['accuracy'])
    model.summary()
    reduce_lr = keras.callbacks.ReduceLROnPlateau(factor=0.1, patience=5, min_lr=1e-10)
    keras.utils.plot_model(model, to_file=model_name + '.png', show_shapes=True, show_layer_names=True)
    # Validation uses the first half of the test set
    model.fit(x=train_X, y=train_y, batch_size=batch_size, epochs=100,
              validation_data=(test_X[:len(test_X) // 2], test_y[:len(test_X) // 2]),
              callbacks=[reduce_lr])
    model.save(model_name + '.h5')
    return model

# Assumed CIFAR-10 loading/preprocessing (not shown in the original post)
(train_X, train_y), (test_X, test_y) = keras.datasets.cifar10.load_data()
train_X = train_X.astype('float32') / 255.0
test_X = test_X.astype('float32') / 255.0
train_y = keras.utils.to_categorical(train_y, 10)
test_y = keras.utils.to_categorical(test_y, 10)

name = 'kis_convo_drop'
model = train(name)

1 Answer:

Answer 0 (score: 0)

There is an official implementation on GitHub that you can look at.
SimpleNet is the official original Caffe implementation, and SimpleNet in Pytorch is the official PyTorch implementation.
Apart from that, I noticed you are implementing a different architecture! Your implementation differs from the one you are trying to reproduce (a sketch of the corresponding fixes follows the list below):

  • You are using Dense layers, whereas SimpleNet, as published, uses only convolutional layers; its only dense layer is the final classification layer.

  • You are using LeakyReLU instead of ReLU.

  • You are using the Adam optimizer, whereas they used Adadelta in their implementation.
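To make those three points concrete, here is a minimal sketch of what they look like in Keras, applied to your code's pattern (assumed names and a shortened block stack for illustration; this is not the official SimpleNet implementation):

from keras import layers, models, optimizers, losses

def fixed_conv_block(x, filters=64, with_pool=True):
    x = layers.Conv2D(filters, (3, 3), padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)              # ReLU instead of LeakyReLU
    if with_pool:
        x = layers.MaxPooling2D((2, 2))(x)
    return x

inputs = layers.Input(shape=(32, 32, 3))
x = fixed_conv_block(inputs)
x = fixed_conv_block(x, filters=128)
x = layers.Flatten()(x)
res = layers.Dense(10, activation='softmax')(x)   # the only dense layer: the classifier
model = models.Model(inputs=inputs, outputs=res)
model.compile(optimizer=optimizers.Adadelta(),    # Adadelta instead of Adam
              loss=losses.categorical_crossentropy,
              metrics=['accuracy'])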