I'm not sure whether Stack Overflow is the right place to ask this, but here it goes:
According to this paper, the architecture reaches 94.75% accuracy, but so far my implementation tops out at about 82% accuracy!
So my questions are:
Regarding stacking convolutional layers: is it fine to stack them as
conv -> BatchNorm -> ReLU -> (optional) max pooling,
or should the order be different?
What am I doing wrong here?
Note that I have tried various dropout values as well as no dropout at all (the reason is that I can get 100% accuracy on the training set but only about 80% on the test set, i.e. half of the test set), fewer dense layers, and playing with the learning rate (increasing it and, as you can see here, decreasing it).
Any suggestions are welcome!
I have already improved it somewhat by using kernel weight initialization (on every conv layer):
initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal')
but it still tops out at about 86% accuracy on the test set, and introducing dropout does not improve the test accuracy (from my understanding, dropout should help the model generalize better). That is still not the accuracy the paper claims to achieve.
Any help would be really appreciated! My code:
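(For reference, applying that initializer amounts to passing a kernel_initializer to each Conv2D; roughly like this, using the same variable names as the helper below:)

    # Sketch of how the initializer would be wired into a convolution inside create_conv_block
    init = initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal')
    conv_layer = layers.Conv2D(filters=filters, kernel_size=kernel, strides=strides,
                               padding='same', kernel_initializer=init)(conv_layer)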
import keras
from keras import layers, models, optimizers, losses, activations, initializers


def create_conv_block(X, filters=64, kernel=[3, 3], strides=[1, 1],
                      repetition=1, withMaxPooling=True,
                      pool_kernel=[2, 2], pool_strides=[2, 2],
                      relualpha=0, withDropOut=True, dropout_precent=0.5):
    # Builds `repetition` stacks of Conv -> BatchNorm -> LeakyReLU,
    # optionally followed by max pooling and dropout.
    conv_layer = X
    while repetition > 0:
        conv_layer = layers.Conv2D(filters=filters,
                                   kernel_size=kernel,
                                   strides=strides, padding='same')(conv_layer)
        conv_layer = layers.BatchNormalization()(conv_layer)
        conv_layer = layers.LeakyReLU(alpha=relualpha)(conv_layer)
        if withMaxPooling:
            try:
                conv_layer = layers.MaxPooling2D(pool_size=pool_kernel,
                                                 strides=pool_strides)(conv_layer)
            except:
                # fall back to 'same' padding when 'valid' pooling would fail
                # because the feature map is already too small
                conv_layer = layers.MaxPooling2D(pool_size=pool_kernel,
                                                 strides=pool_strides,
                                                 padding='same')(conv_layer)
        if withDropOut:
            conv_layer = layers.Dropout(rate=dropout_precent)(conv_layer)
        repetition -= 1
    return conv_layer
def train(model_name):
    # Architecture based on https://arxiv.org/pdf/1608.06037.pdf
    # train_X, train_y, test_X, test_y are assumed to be defined elsewhere
    # (32x32 RGB images with one-hot encoded labels).
    global inputs, res
    batch_size = 100
    input_shape = (32, 32, 3)
    inputs = layers.Input(shape=input_shape)
    block1 = create_conv_block(inputs, withMaxPooling=False, withDropOut=True)
    block2 = create_conv_block(block1, filters=128, repetition=3, withDropOut=True)
    block3 = create_conv_block(block2, filters=128, repetition=2, withMaxPooling=False)
    block4 = create_conv_block(block3, filters=128, withDropOut=False)
    block5 = create_conv_block(block4, filters=128, repetition=2, withDropOut=True)
    block6 = create_conv_block(block5, filters=128, withMaxPooling=False, withDropOut=True)
    block7 = create_conv_block(block6, filters=128, withMaxPooling=False, kernel=[1, 1], withDropOut=True)
    block8 = create_conv_block(block7, filters=128, kernel=[1, 1], withDropOut=False)
    block9 = create_conv_block(block8, filters=128, withDropOut=True)
    block9 = create_conv_block(block9, filters=128, withDropOut=False)
    flatty = layers.Flatten()(block9)
    dense1 = layers.Dense(128, activation=activations.relu)(flatty)
    dense1 = layers.Dropout(0.5)(dense1)
    dense1 = layers.Dense(512, activation=activations.relu)(dense1)
    dense1 = layers.Dropout(0.2)(dense1)
    dense2 = layers.Dense(512, activation=activations.relu)(dense1)
    dense1 = layers.Dropout(0.5)(dense2)
    dense2 = layers.Dense(512, activation=activations.relu)(dense1)
    dense3 = layers.Dropout(0.2)(dense2)
    res = layers.Dense(10, activation='softmax')(dense3)
    model = models.Model(inputs=inputs, outputs=res)
    opt = optimizers.Adam(lr=0.001)
    model.compile(optimizer=opt, loss=losses.categorical_crossentropy, metrics=['accuracy'])
    model.summary()
    reduce_lr = keras.callbacks.ReduceLROnPlateau(factor=0.1, patience=5, min_lr=1e-10)
    keras.utils.plot_model(model, to_file=model_name + '.png', show_shapes=True, show_layer_names=True)
    model.fit(x=train_X, y=train_y, batch_size=batch_size, epochs=100,
              validation_data=(test_X[:len(test_X) // 2], test_y[:len(test_X) // 2]),
              callbacks=[reduce_lr])
    model.save(model_name + '.h5')
    return model


name = 'kis_convo_drop'
model = train(name)
Answer (score: 0)
There is an official implementation on GitHub that you can look at: SimpleNet is the official original Caffe implementation, and SimpleNet in Pytorch is the official PyTorch implementation.
Beyond that, I noticed that you are implementing a different architecture! Your implementation does not match the one you are trying to reproduce. You are using a stack of Dense layers, whereas SimpleNet uses only convolutional layers and its single dense layer is the final classifier (a rough sketch of such a head follows below).
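To illustrate the difference, here is an unofficial sketch of what a purely convolutional model with a single classification layer could look like, reusing the create_conv_block helper and the block8/inputs tensors from the question; the GlobalMaxPooling2D and the filter count are placeholders of mine, not the official SimpleNet configuration:

    # Hypothetical sketch only: fully convolutional feature extractor,
    # then one Dense softmax layer as the sole classifier head.
    features = create_conv_block(block8, filters=128, withDropOut=False)
    pooled = layers.GlobalMaxPooling2D()(features)         # collapse the spatial dimensions
    res = layers.Dense(10, activation='softmax')(pooled)   # the only dense layer
    model = models.Model(inputs=inputs, outputs=res)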
You are also using Leaky ReLU, whereas they use plain ReLU.
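Switching to plain ReLU is a one-line change inside create_conv_block, for example:

    # Replace the LeakyReLU activation with a plain ReLU
    conv_layer = layers.Activation('relu')(conv_layer)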
Finally, you are using the Adam optimizer, while their implementation uses Adadelta.
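Changing the optimizer is equally small; for example, with Keras' built-in Adadelta (default hyperparameters shown, which may still need tuning):

    # Compile with Adadelta instead of Adam
    opt = optimizers.Adadelta()  # Keras' default Adadelta settings
    model.compile(optimizer=opt, loss=losses.categorical_crossentropy, metrics=['accuracy'])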