I am working on building a neural network for multi-label classification of text documents. I have a vocabulary of 3750 words stored in a vector (V).
For each input document I create a vector (I) of size 3750. If a term from the input document is found at index x of the vocabulary vector (V), then index x of the document vector is marked 1, otherwise 0. For example: [1, 1, 0, 0, 0, 1, ..., 0].
For the labels I have a vocabulary of 1500 labels stored in a vector (L). As above, I create a vector (LB) for each document, marking the x-th index 1 if the document has label x.
Each document's label data is thus also represented as a vector of 1500 elements, such as [0, 0, 1, 0, 1, ..., 0], where the i-th element indicates whether the i-th label is a positive label for the text. The number of labels varies from text to text. A minimal sketch of this encoding follows below.
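For illustration, this is roughly how I build the multi-hot vectors, shown here with a toy vocabulary (the encode helper and the doc_terms/doc_labels names are hypothetical; my real data is read from CSV files):

import numpy as np

def encode(terms, vocab):
    # Multi-hot encoding: vec[x] = 1 iff vocab[x] appears in `terms`.
    vec = np.zeros(len(vocab), dtype=np.float32)
    for x, word in enumerate(vocab):
        if word in terms:
            vec[x] = 1.0
    return vec

V = ["apple", "banana", "cherry"]  # word vocabulary (3750 entries in my case)
L = ["fruit", "red", "yellow"]     # label vocabulary (1500 entries in my case)
doc_terms = {"apple", "cherry"}
doc_labels = {"fruit", "red"}
I = encode(doc_terms, V)   # -> [1., 0., 1.]
LB = encode(doc_labels, L) # -> [1., 1., 0.]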
Here is my code:
from __future__ import division
import tensorflow as tf
import numpy as np
import time

def csv_to_numpy_array(filePath, delimiter):
    return np.genfromtxt(filePath, delimiter=delimiter, dtype=None)

def import_data():
    print("Load training data")
    trainX = csv_to_numpy_array("/home/shahzeb/temp/train_data/trainX.csv", delimiter=",")
    trainY = csv_to_numpy_array("/home/shahzeb/temp/train_data/trainY.csv", delimiter=",")
    return trainX, trainY

startTime = time.time()
trainX, trainY = import_data()
learning_rate = 0.001
training_epochs = 500
# Network Parameters
n_hidden_1 = 3560 # 1st layer number of features
n_hidden_2 = 3560 # 2nd layer number of features
n_input = trainX.shape[1]
n_classes = trainY.shape[1]
# tf Graph input
input_neurons = tf.placeholder("float", [None, n_input], name="input")
known_outputs = tf.placeholder("float", [None, n_classes], name="labels")
def model(x):
    with tf.name_scope("Relu_activation"):
        # Hidden layer with ReLU activation
        w1 = tf.Variable(tf.random_normal([n_input, n_hidden_1]), name="w")
        b1 = tf.Variable(tf.random_normal([n_hidden_1]), name="b")
        layer_1 = tf.add(tf.matmul(x, w1), b1)
        layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with sigmoid activation
    with tf.name_scope("Sigmoid"):
        w2 = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]), name="w")
        b2 = tf.Variable(tf.random_normal([n_hidden_2]), name="b")
        layer_2 = tf.add(tf.matmul(layer_1, w2), b2)
        layer_2 = tf.nn.sigmoid(layer_2)
    # Output layer with linear activation
    with tf.name_scope("output"):
        w3 = tf.Variable(tf.random_normal([n_hidden_2, n_classes]), name="w")
        b3 = tf.Variable(tf.random_normal([n_classes]), name="b")
        out_layer = tf.matmul(layer_2, w3) + b3
    return out_layer, w1, w2, w3
model_output_OP, w_1, w_2, w_3 = model(input_neurons)
with tf.name_scope("cost"):
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=model_output_OP, labels=known_outputs))
with tf.name_scope("train"):
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
with tf.name_scope("accuracy"):
correct_predictions_OP = tf.equal(tf.argmax(model_output_OP, 1), tf.argmax(known_outputs, 1))
accuracy_OP = tf.reduce_mean(tf.cast(correct_predictions_OP, "float"), name="Accuracy_op")
with tf.name_scope("summary"):
model_output_OP_summary = tf.summary.histogram("output", model_output_OP)
accuracy_OP_summary = tf.summary.scalar("accuracy",accuracy_OP)
cost_summary = tf.summary.scalar("cost",cost)
summary_op = tf.summary.merge_all()
writer = tf.summary.FileWriter("/home/shahzeb/temp/summarylogs/", graph=tf.get_default_graph())
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        _, c, summary, train_accuracy, iw1, iw2, iw3 = sess.run(
            [optimizer, cost, summary_op, accuracy_OP, w_1, w_2, w_3],
            feed_dict={input_neurons: trainX, known_outputs: trainY})
        print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c), "Accuracy =", train_accuracy)
        # np.set_printoptions(threshold=np.nan)
        # print(iw1)
        # print(iw2)
        # print(iw3)
        # print("--------------")
        writer.add_summary(summary, epoch + 1)
    saver = tf.train.Saver()
    saver.save(sess, "/home/shahzeb/temp/trained_model/hidden_layer_nn.ckpt")
print("Done")
Why does the value of the cost function start increasing after a certain number of epochs? What is wrong, and how can I fix it?
Answer 0 (score: 0)
The crux of the issue is the number of epochs: for the given training data set you are probably using too many. A plausible explanation is that you start overfitting somewhere around epoch 150.
There is a good discussion on forums.fast.ai of how to handle this kind of problem. A simple solution is to implement an "early stopping" mechanism, which can be done with TensorFlow's validation monitor, as described in this tutorial in the TensorFlow documentation. A rough sketch follows below.
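As an illustration only (a hand-rolled loop, not the ValidationMonitor API itself), early stopping could look like this. The sketch assumes a held-out validation split valX/valY, which the code in the question does not have, and reuses the cost, optimizer, and placeholder ops defined there:

best_val_cost = float("inf")
patience, bad_epochs = 10, 0  # stop after 10 epochs without improvement

with tf.Session() as sess:
    sess.run(init)
    saver = tf.train.Saver()
    for epoch in range(training_epochs):
        # One training step on the full training set, as in the question.
        sess.run(optimizer, feed_dict={input_neurons: trainX, known_outputs: trainY})
        # Evaluate the cost on held-out data only.
        val_cost = sess.run(cost, feed_dict={input_neurons: valX, known_outputs: valY})
        if val_cost < best_val_cost:
            best_val_cost, bad_epochs = val_cost, 0
            saver.save(sess, "/home/shahzeb/temp/trained_model/best_model.ckpt")
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # validation cost stopped improving
                print("Early stopping at epoch", epoch + 1)
                break

Stopping on the validation cost rather than the training cost is the point here: the training cost can keep falling while the model overfits, which matches the behavior you see around epoch 150.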