tensorflow: output layer always shows [ 1.]

Time: 2017-10-10 19:37:28

Tags: python machine-learning tensorflow neural-network tensorflow-gpu

This is the discriminative network I'm training so that I can use it in a generative network. I trained it on a dataset with 2 features, doing binary classification: 1 = meditating, 0 = not meditating. (The dataset comes from one of Siraj Raval's videos.)

For some reason, the output layer (ol) outputs [ 1.] in every test case.

My dataset: https://drive.google.com/open?id=0B5DaSp-aTU-KSmZtVmFoc0hRa3c

import pandas as pd
import tensorflow as tf

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_train.csv")
data_f = data.drop("lbl", axis = 1)
data_l = data.drop(["f1", "f2"], axis = 1)

learning_rate = 0.01
batch_size = 1
n_epochs = 30
n_examples = 999 # This is highly unsatisfying >:3
n_iteration = int(n_examples/batch_size)


features = tf.placeholder('float', [None, 2], name='features_placeholder')
labels = tf.placeholder('float', [None, 1], name = 'labels_placeholder')

weights = {
            'ol': tf.Variable(tf.random_normal([2, 1], stddev= -12), name = 'w_ol')
}

biases = {
            'ol': tf.Variable(tf.random_normal([1], stddev=-12), name = 'b_ol')
}

ol = tf.nn.sigmoid(tf.add(tf.matmul(features, weights['ol']), biases['ol']), name = 'ol')

loss = -tf.reduce_sum(labels*tf.log(ol), name = 'loss') # cross entropy
train = tf.train.AdamOptimizer(learning_rate).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(n_epochs):
    ptr = 0
    for iteration in range(n_iteration):
        epoch_x = data_f[ptr: ptr + batch_size]
        epoch_y = data_l[ptr: ptr + batch_size]
        ptr = ptr + batch_size

        _, err = sess.run([train, loss], feed_dict={features: epoch_x, labels:epoch_y})
    print("Loss @ epoch ", epoch, " = ", err)

print("Testing...\n")

data = pd.read_csv("E:/workspace_py/datasets/simdata/linear_data_eval.csv")
test_data_l = data.drop(["f1", "f2"], axis = 1)
test_data_f = data.drop("lbl", axis = 1)
#vvvHERE    
print(sess.run(ol, feed_dict={features: test_data_f})) #<<<HERE
#^^^HERE
print("Saving model...")
saver = tf.train.Saver()
saver.save(sess, save_path="E:/workspace_py/saved_models/meditation_disciminative_model.ckpt")
sess.close()

Output:

2017-10-11 00:49:47.453721: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-11 00:49:47.454212: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-11 00:49:49.608862: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.35GiB
2017-10-11 00:49:49.609281: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0 
2017-10-11 00:49:49.609464: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0:   Y 
2017-10-11 00:49:49.609659: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0)
Loss @ epoch  0  =  0.000135789
Loss @ epoch  1  =  4.16049e-05
Loss @ epoch  2  =  1.84776e-05
Loss @ epoch  3  =  9.41758e-06
Loss @ epoch  4  =  5.24522e-06
Loss @ epoch  5  =  2.98024e-06
Loss @ epoch  6  =  1.66893e-06
Loss @ epoch  7  =  1.07288e-06
Loss @ epoch  8  =  5.96047e-07
Loss @ epoch  9  =  3.57628e-07
Loss @ epoch  10  =  2.38419e-07
Loss @ epoch  11  =  1.19209e-07
Loss @ epoch  12  =  1.19209e-07
Loss @ epoch  13  =  1.19209e-07
Loss @ epoch  14  =  -0.0
Loss @ epoch  15  =  -0.0
Loss @ epoch  16  =  -0.0
Loss @ epoch  17  =  -0.0
Loss @ epoch  18  =  -0.0
Loss @ epoch  19  =  -0.0
Loss @ epoch  20  =  -0.0
Loss @ epoch  21  =  -0.0
Loss @ epoch  22  =  -0.0
Loss @ epoch  23  =  -0.0
Loss @ epoch  24  =  -0.0
Loss @ epoch  25  =  -0.0
Loss @ epoch  26  =  -0.0
Loss @ epoch  27  =  -0.0
Loss @ epoch  28  =  -0.0
Loss @ epoch  29  =  -0.0
Testing...

[[ 1.]
 [ 1.]
 [ 1.]
 ... (194 identical rows omitted) ...
 [ 1.]
 [ 1.]
 [ 1.]]
Saving model...
[Finished in 57.9s]

1 Answer:

Answer 0 (score: 1)

The main problem

First of all, this is not a valid cross-entropy loss. The equation you are using works only with 2 or more outputs. With a single sigmoid output it should be

-tf.reduce_sum(labels*tf.log(ol) + (1-labels)*tf.log(1-ol), name = 'loss')

Otherwise the optimal solution is to always answer "1" (which is what is happening right now).

Why?

Notice that the labels are only 0 or 1, and your whole loss is the product of the label and the logarithm of the prediction. Consequently, when the true label is 0, your loss is 0 no matter what you predict, since 0 * log(x) = 0 whatever x is (as long as log(x) is defined). So your model is only punished for failing to predict "1" when it should, and it therefore learns to output 1 all the time.
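To see this concretely, here is a tiny illustration (plain NumPy, not part of the asker's script) of what this loss looks like for each label value:

import numpy as np

p = 0.99  # some sigmoid output close to 1

# True label is 1: loss = -1 * log(p), small only because p is near 1.
print(-1 * np.log(p))  # ~0.01

# True label is 0: loss = -0 * log(p) = 0 for ANY p, even p = 0.99.
print(-0 * np.log(p))  # -0.0 -- the model is never punished for answering "1"

This is also why the training log above bottoms out at -0.0: once the sigmoid saturates at 1.0 in float32, log(1.0) = 0.0 and the negated sum prints as -0.0.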

A few other strange things

  1. You are passing a negative stddev to a normal distribution, which you should not do (unless this is some undocumented feature of random_normal, but according to the docs it accepts a single float, and you should pass a small positive number there).

  2. Computing cross-entropy naively like this is not numerically stable; take a look at tf.nn.sigmoid_cross_entropy_with_logits (used in the sketch after this list).

  3. You never shuffle your dataset, so you always process the data in the same order, which can have bad consequences (periodic increases in the loss, harder convergence, or no convergence at all). The sketch below also reshuffles the data once per epoch.
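Putting these together, a minimal sketch of the suggested fixes (reusing the asker's variable names; everything else in the script is assumed to stay the same):

# Small positive stddev for the initializers instead of -12.
weights = {'ol': tf.Variable(tf.random_normal([2, 1], stddev=0.1), name='w_ol')}
biases = {'ol': tf.Variable(tf.random_normal([1], stddev=0.1), name='b_ol')}

# Keep the pre-sigmoid logits so the numerically stable built-in loss can be used.
logits = tf.add(tf.matmul(features, weights['ol']), biases['ol'], name='logits')
ol = tf.nn.sigmoid(logits, name='ol')  # still available for predictions

# Stable binary cross-entropy that penalizes mistakes on both classes.
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits),
    name='loss')
train = tf.train.AdamOptimizer(learning_rate).minimize(loss)

# Reshuffle once per epoch so the batches are not always seen in the same order.
for epoch in range(n_epochs):
    shuffled = data.sample(frac=1).reset_index(drop=True)
    data_f = shuffled.drop("lbl", axis=1)
    data_l = shuffled.drop(["f1", "f2"], axis=1)
    # ... same batching loop as before ...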