Lasagne NN: strange behavior of accuracy and weight convergence

Time: 2016-12-18 21:59:53

Tags: python neural-network theano conv-neural-network lasagne

!! This is hard to explain without the images !! I have a neural network fed with text news messages, encoded as bag-of-words, with labels {0, 1}. I use convolutions to classify the text. Everything seems to work, but with a fairly small dataset (20,000 messages) the train accuracy does not converge and the test accuracy behaves strangely (X axis is batches, green is test, blue is train): [plot: train/test accuracy vs. batches]

Then I visualized the weights of each layer in my NN, and the result surprised and confused me (X axis is batches):

These are the weights of the Conv layer, which is the first in the NN

and

These are the weights of the Dense layer, which is second-to-last in the NN

1. I really cannot explain why the weights converge by around batch 20 and do not change afterwards, while according to the accuracy they still should!

2. Why does the test accuracy behave so strangely (the green line)? I hope it is just the code...
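One possible explanation for the frozen weights (a hypothesis only; nothing in the question confirms it): with Adam at learning_rate=0.01, the sigmoid output can saturate within a few batches, and for sigmoid plus binary cross-entropy the gradient with respect to the pre-activation is simply (prediction − target), which goes to ~0 once the outputs pin at 0 or 1. A minimal numpy sketch of that effect:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# For sigmoid + binary cross-entropy, the gradient wrt the pre-activation
# is (prediction - target). Once the sigmoid saturates, this is nearly
# zero, so the weights stop moving even though accuracy has plateaued.
y = np.array([1.0, 0.0])
grad_moderate = sigmoid(np.array([0.5, -0.3])) - y
grad_saturated = sigmoid(np.array([20.0, -20.0])) - y

print(np.abs(grad_moderate))    # noticeably nonzero
print(np.abs(grad_saturated))   # below 1e-8: almost no learning signal
```

If this is the cause, a smaller learning rate (e.g. 1e-3, Adam's default) would be the first thing to try.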

import theano
import lasagne

# Input: a batch of token-id sequences (batch x sequence length)
topic_input = lasagne.layers.InputLayer(shape=(None, v_train.shape[1]), input_var=v_t)
# Learn a 32-dimensional embedding per token
embedding = lasagne.layers.EmbeddingLayer(topic_input, input_size=len(token_to_id)+1, output_size=32)
# Conv1DLayer expects (batch, channels, length), so move the embedding axis
WHAT = lasagne.layers.DimshuffleLayer(embedding, [0, 2, 1])
conv_1 = lasagne.layers.Conv1DLayer(WHAT, num_filters=15, filter_size=4)
conv_2 = lasagne.layers.Conv1DLayer(conv_1, num_filters=5, filter_size=3)
dense_1 = lasagne.layers.DenseLayer(conv_2, 30)
dense_2 = lasagne.layers.DenseLayer(dense_1, 5)
# Single sigmoid output for the binary {0, 1} label
dense_3 = lasagne.layers.DenseLayer(dense_2, 1, nonlinearity=lasagne.nonlinearities.sigmoid)

weights = lasagne.layers.get_all_params(dense_3, trainable=True)
prediction = lasagne.layers.get_output(dense_3)
loss = lasagne.objectives.binary_crossentropy(prediction, target).mean()
updates = lasagne.updates.adam(loss, weights, learning_rate=0.01)
accuracy = lasagne.objectives.binary_accuracy(prediction, target).mean()

# The training step also returns one weight tensor so it can be plotted per batch
train_func = theano.function([v_t, target], [loss, prediction, accuracy, weights[7]], updates=updates)
acc_func = theano.function([v_t, target], [accuracy, prediction])
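To verify whether the weights really stop changing after batch 20, one can keep a flattened copy of the weight tensor returned by train_func each batch and look at the norm of consecutive differences. A minimal sketch (the training loop is not shown in the question, so the snapshots here are simulated):

```python
import numpy as np

def weight_change_norms(weight_snapshots):
    """Given a list of flattened weight vectors (one per batch), return
    the L2 norm of the change between consecutive batches."""
    return [float(np.linalg.norm(b - a))
            for a, b in zip(weight_snapshots, weight_snapshots[1:])]

# Simulated snapshots: weights move for a few batches, then freeze.
rng = np.random.default_rng(0)
w = rng.normal(size=50)
snapshots = []
for step in range(10):
    snapshots.append(w.copy())
    if step < 4:                       # updates only in the first batches
        w += 0.1 * rng.normal(size=50)

norms = weight_change_norms(snapshots)
print(norms)    # first entries > 0, then exactly 0.0 once updates stop
```

Plotting these norms against the batch index would show directly whether the updates vanish at the same point where the weight curves flatten out.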

0 Answers:

There are no answers yet.