TensorFlow model does not train when the input data is changed

Time: 2017-02-17 14:26:51

Tags: python tensorflow jupyter-notebook

I am trying to train a neural network (LeNet) on traffic sign images with TensorFlow. I want to check what effect different preprocessing techniques have on the network's performance, so I preprocessed the images and stored the results (trainingimages, validationimages, testimages, final testimages) as tuples in a dict.

I then iterate over this dict and run TensorFlow's training and validation operations as follows:

import tensorflow as tf
import numpy as np
from sklearn.utils import shuffle


output_data = []
EPOCHS = 5
BATCH_SIZE = 128
rate = 0.0005

for key in finalInputdata.keys():
    for procTypes in range(0,(len(finalInputdata[key]))):
        if np.shape(finalInputdata[key][procTypes][0]) != ():
            X_train = finalInputdata[key][procTypes][0]
            X_valid = finalInputdata[key][procTypes][1]
            X_test = finalInputdata[key][procTypes][2]
            X_finaltest = finalInputdata[key][procTypes][3]


            x = tf.placeholder(tf.float32, (None, 32, 32,np.shape(X_train)[-1]))
            y = tf.placeholder(tf.int32, (None))
            one_hot_y = tf.one_hot(y,43)

            # Tensor Operations
            logits = LeNet(x,np.shape(X_train)[-1])

            cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits)
            softmax_probability = tf.nn.softmax(logits)

            loss_operation = tf.reduce_mean(cross_entropy)
            optimizer = tf.train.AdamOptimizer(learning_rate=rate)
            training_operation = optimizer.minimize(loss_operation)
            correct_prediction = tf.equal(tf.argmax(logits,1), tf.argmax(one_hot_y,1))
            accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


            # Pipeline for training and evaluation

            sess = tf.InteractiveSession()

            sess.run(tf.global_variables_initializer())
            num_examples = len(X_train)

            print("Training on %s images processed as %s" %(key,dict_fornames['proctypes'][procTypes]))
            print()
            for i in range(EPOCHS):
                X_train, y_train = shuffle(X_train, y_train)
                for offset in range(0, num_examples, BATCH_SIZE):
                    end = offset + BATCH_SIZE
                    batch_x, batch_y = X_train[offset:end], y_train[offset:end]
                    sess.run(training_operation, feed_dict = {x: batch_x, y: batch_y})

                training_accuracy = evaluate(X_train,y_train)

                validation_accuracy = evaluate(X_valid, y_valid)

                testing_accuracy = evaluate(X_test, y_test)

                final_accuracy = evaluate(X_finaltest, y_finalTest)

                print("EPOCH {} ...".format(i+1))
                print("Training Accuracy = {:.3f}".format(training_accuracy))
                print("Validation Accuracy = {:.3f}".format(validation_accuracy))
                print()
                output_data.append({'EPOCHS':EPOCHS, 'LearningRate':rate, 'ImageType': 'RGB',\
                                    'PreprocType': dict_fornames['proctypes'][0],\
                                    'TrainingAccuracy':training_accuracy, 'ValidationAccuracy':validation_accuracy, \
                                    'TestingAccuracy': testing_accuracy})


            sess.close()

The evaluation function looks like this:

def evaluate(X_data, y_data):
    # relies on the notebook-level x, y, accuracy_operation and BATCH_SIZE
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0,num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict = {x:batch_x, y:batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples

When I run the program, it works for the first iteration over the dataset, but from the second iteration onward the network does not train, and it keeps behaving this way for all remaining iterations.

Training on RGB images processed as Original

EPOCH 1 ...
Training Accuracy = 0.525
Validation Accuracy = 0.474

EPOCH 2 ...
Training Accuracy = 0.763
Validation Accuracy = 0.682

EPOCH 3 ...
Training Accuracy = 0.844
Validation Accuracy = 0.723

EPOCH 4 ...
Training Accuracy = 0.888
Validation Accuracy = 0.779

EPOCH 5 ...
Training Accuracy = 0.913
Validation Accuracy = 0.795

Training on RGB images processed as Mean Subtracted Data

EPOCH 1 ...
Training Accuracy = 0.056
Validation Accuracy = 0.057

EPOCH 2 ...
Training Accuracy = 0.057
Validation Accuracy = 0.057

EPOCH 3 ...
Training Accuracy = 0.057
Validation Accuracy = 0.056

EPOCH 4 ...
Training Accuracy = 0.058
Validation Accuracy = 0.056

EPOCH 5 ...
Training Accuracy = 0.058
Validation Accuracy = 0.058

Training on RGB images processed as Normalized Data

EPOCH 1 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 2 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 3 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 4 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 5 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

However, if I restart the kernel and run any single data type (any one iteration), it works. I gather that I have to either clear the graph or run multiple sessions for the multiple data types, but I am not sure how to do that. I tried tf.reset_default_graph(), but it seemed to have no effect. Can someone point me in the right direction?

Thanks

1 Answer:

Answer 0 (score: 1):

You may want to try normalizing your data to zero mean and unit variance before feeding it into the network, e.g. by scaling the images to the -1..1 range; that said, a 0..1 range mostly sounds sane as well. Depending on the activations used in the network, the value range can make all the difference: ReLUs, for example, die out for inputs below zero, sigmoids start to saturate when values drop below -4 or rise above +4, and tanh activations miss out on half of their value range if no value ever goes below 0. If the value range is too large, gradients may also explode, preventing training altogether. Judging from this paper, the authors subtract the (batch) image mean rather than the mean of the value range.
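As a rough sketch of what that could look like (assuming the image arrays are plain NumPy arrays of shape (N, 32, 32, C), as in your code; normalize_like_training is just an illustrative helper, not part of your pipeline):

import numpy as np

def normalize_like_training(X_train, *other_splits):
    # compute the statistics on the training split only and reuse them
    # for validation/test so that all splits share the same scale
    mean = X_train.astype(np.float32).mean(axis=(0, 1, 2), keepdims=True)
    std = X_train.astype(np.float32).std(axis=(0, 1, 2), keepdims=True) + 1e-7
    return [(X.astype(np.float32) - mean) / std for X in (X_train,) + other_splits]

# X_train_n, X_valid_n, X_test_n = normalize_like_training(X_train, X_valid, X_test)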

You could also try a smaller learning rate (personally, though, I usually experiment with Adam at around 0.0001).

As for the multiple-sessions part of your question: the way it is currently implemented in your code, you are basically cluttering the default graph. By calling

for key in finalInputdata.keys():
    for procTypes in range(0,(len(finalInputdata[key]))):
        if np.shape(finalInputdata[key][procTypes][0]) != ():

            # ...

            x = tf.placeholder(tf.float32, (None, 32, 32,np.shape(X_train)[-1]))
            y = tf.placeholder(tf.int32, (None))
            one_hot_y = tf.one_hot(y,43)

            # Tensor Operations
            logits = LeNet(x,np.shape(X_train)[-1])

            # ... etc ...

you are creating len(finalInputdata) * N different instances of LeNet, all of them inside the default graph. That can be a problem when variables are reused internally by the network.
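A toy illustration of that clutter (tiny_net is just a stand-in for LeNet, not your actual network): each pass through such a loop adds a fresh set of variables to the same default graph, so the graph keeps growing.

import tensorflow as tf

def tiny_net(x):
    # stand-in for LeNet: a single dense layer with its own weights
    w = tf.Variable(tf.truncated_normal([32, 10]), name='w')
    b = tf.Variable(tf.zeros([10]), name='b')
    return tf.matmul(x, w) + b

x = tf.placeholder(tf.float32, (None, 32))
for i in range(3):
    logits = tiny_net(x)
    print(len(tf.trainable_variables()))  # 2, 4, 6 ... all in the default graph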

If you really do want to reset the default graph in order to try different hyperparameters, try

for key in finalInputdata.keys():
    for procTypes in range(0,(len(finalInputdata[key]))):

        tf.reset_default_graph()
        # define the graph

        sess = tf.InteractiveSession()
        # train

but it is better to create the graph and the session explicitly, like this:

for key in finalInputdata.keys():
    for procTypes in range(0,(len(finalInputdata[key]))):

        with tf.Graph().as_default() as graph:
            # define the graph

        with tf.Session(graph=graph) as sess:
            # train
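Applied to your training loop, that pattern could look roughly like the following sketch (a sketch only: it assumes LeNet, finalInputdata, rate, EPOCHS, BATCH_SIZE and the label arrays y_train etc. are defined exactly as in your question):

for key in finalInputdata.keys():
    for procTypes in range(len(finalInputdata[key])):
        if np.shape(finalInputdata[key][procTypes][0]) == ():
            continue

        X_train, X_valid, X_test, X_finaltest = finalInputdata[key][procTypes]

        with tf.Graph().as_default() as graph:
            # a fresh graph for every dataset variant
            x = tf.placeholder(tf.float32, (None, 32, 32, np.shape(X_train)[-1]))
            y = tf.placeholder(tf.int32, (None))
            one_hot_y = tf.one_hot(y, 43)
            logits = LeNet(x, np.shape(X_train)[-1])
            loss = tf.reduce_mean(
                tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits))
            training_operation = tf.train.AdamOptimizer(learning_rate=rate).minimize(loss)
            correct = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
            accuracy_operation = tf.reduce_mean(tf.cast(correct, tf.float32))
            init_op = tf.global_variables_initializer()

        with tf.Session(graph=graph) as sess:
            sess.run(init_op)
            for i in range(EPOCHS):
                X_train, y_train = shuffle(X_train, y_train)
                for offset in range(0, len(X_train), BATCH_SIZE):
                    end = offset + BATCH_SIZE
                    sess.run(training_operation,
                             feed_dict={x: X_train[offset:end], y: y_train[offset:end]})
                # evaluation should use the x, y and accuracy_operation
                # tensors that belong to *this* graph, not stale globals

Note that the evaluate helper then needs to be handed (or close over) the tensors of the current graph; otherwise it will silently point at the operations of whichever graph was built last.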

That way you can also use the sess reference directly instead of calling sess = tf.get_default_session().

I have also found that Jupyter kernels and GPU-enabled TensorFlow do not play well together when iterating over networks; sometimes I run into out-of-memory errors or outright browser-tab crashes.
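If you do hit the memory issue, one thing that can help (this is standard TF 1.x session configuration, nothing specific to your code) is letting TensorFlow allocate GPU memory on demand instead of grabbing it all up front:

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed

with tf.Session(graph=graph, config=config) as sess:
    # train as before
    pass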