TensorFlow: multi-label classification predictions are identical for every test sample

Asked: 2018-05-12 06:55:15

Tags: python tensorflow deep-learning lstm rnn

I am working on a multi-label classification problem. The dataset is available here.

Here is how I converted the input for the LSTM RNN.

The raw data is:

[-0.106902 -0.111342  0.104265  0.114448  0.067026  0.040118  0.018003
 -0.082054 -0.092087 -0.192697 -0.026802  0.215549  0.344768  0.324198
  0.200254  0.234357 -0.040812  0.025356 -0.193163 -0.019159 -0.051112
  0.070979  0.020293  0.075366  0.126615  0.091983  0.138466  0.23322
  0.024106  0.069623  0.043408  0.107059 -0.072603  0.022784  0.063041
  0.089568 -0.088068 -0.10704  -0.061862 -0.008561  0.036751 -0.052483
 -0.171235 -0.135565  0.045164 -0.12917  -0.115914 -0.105413  0.005252
 -0.06102  -0.057999 -0.064665 -0.072545  0.021969 -0.045153  0.019881
  0.022636 -0.007741  0.076754 -0.03363  -0.000429  0.115502  0.139804
  0.102889 -0.158891 -0.094767  0.046051  0.147124  0.078688 -0.063363
 -0.024232  0.050911  0.018356 -0.016907 -0.017603 -0.037143 -0.021808
 -0.148908 -0.001696  0.003607 -0.028734 -0.074155 -0.07131  -0.033052
  0.051065  0.085901  0.037884  0.076677 -0.004175  0.024224  0.00108
 -0.03285  -0.067774 -0.021328 -0.038708 -0.02537  -0.053335  0.015339
 -0.014152  0.024729 -0.052682 -0.016872  0.090514]

I reshaped it into 3 dimensions for the RNN LSTM, as shown below (a NumPy sketch of this reshape follows the sample):

   [[[-0.072794], [0.181316], [0.014368], [0.028411], [-0.041242], [-0.004056], [-0.064594], 
     [0.003051], [0.055096], [-0.114891], [0.067934], [0.037837], [0.025255], [0.050971], 
     [0.075224], [0.018362], [-0.104191], [-0.110567], [-0.027323], [0.059402], [0.081574], 
     [-0.023793], [-0.064557], [-0.027703], [-0.025198], [-0.016347], [0.029568], [-0.061661], 
     [-0.092653], [-0.186273], [-0.041202], [0.038554], [-0.059853], [0.123145], [-0.096088], 
     [-0.282818], [-0.125915], [0.204784], [-0.178102], [0.173425], [-0.10509], [-0.223132], 
     [-0.115442], [0.028586], [-0.102809], [-0.168281], [-0.029156], [-0.16269], [0.205518], 
     [0.058809], [-0.036977], [-0.00827], [0.037344], [0.086508], [-0.070408], [-0.106666], 
     [0.067168], [0.009743], [-0.006985], [0.116635], [0.087596], [0.066868], [0.096816], 
     [0.116658], [0.00165], [-0.079719], [0.015966], [0.057896], [-0.092253], [-0.009542], 
     [0.005439], [0.162932], [-0.206875], [0.119895], [0.007899], [-9.6e-05], [-0.253397], 
     [0.0976], [0.131022], [0.07027], [-0.057863], [-0.075103], [-0.021241], [-0.057738], 
     [-0.046753], [0.096566], [-0.0508], [0.122675], [-0.062557], [0.030779], [-0.034159], 
     [-0.05235], [-0.06705], [0.165413], [-0.05623], [0.181517], [-0.056385], [-0.002522], 
     [-0.049523], [-0.067518], [-0.062527], [-0.027574], [0.075115]]]
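For reference, a reshape like this can be done with NumPy. This is only a minimal sketch; the 2175 samples and 103 features per sample are assumptions taken from the batch constants and the placeholder shape in the model code below:

import numpy as np

# assumed shape: 2175 samples, each a flat vector of 103 features
raw = np.random.randn(2175, 103).astype(np.float32)

# reshape to [batch_size, seq_length, dim] = [2175, 103, 1], as dynamic_rnn expects
data_3d = raw.reshape(-1, 103, 1)
print(data_3d.shape)  # (2175, 103, 1)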

The labels look like this:

[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

Now my model (a simple RNN LSTM model):

import tensorflow as tf
from tensorflow.contrib import rnn
import numpy as np
import data_preprocessing


batch = 100
iteration = 2175 // 100  # total dataset size // batch size
epoch = 20

class RNNLSTM():

    def __init__(self):

        tf.reset_default_graph()

        input_x = tf.placeholder(dtype=tf.float32, name='input', shape=[None, 103, 1])  # batch_size x seq_length x dim

        labels_o = tf.placeholder(dtype=tf.float32, name='labels', shape=[None, 14])  # batch_size x num_labels

        self.placeholder = {'input': input_x, 'output': labels_o}

        with tf.variable_scope('encoder') as scope:

            cell = rnn.LSTMCell(num_units=100)

            dropout_wrapper = rnn.DropoutWrapper(cell, output_keep_prob=0.5)

            # dynamic_rnn returns (outputs, final_state); for an LSTM the
            # final state is an LSTMStateTuple (c, h)
            model, (fs, fw) = tf.nn.dynamic_rnn(dropout_wrapper, dtype=tf.float32, inputs=input_x)

        # transpose to time-major [seq_length, batch_size, num_units] so the
        # last time step can be sliced out below
        time_major = tf.transpose(model, [1, 0, 2])

        weights = tf.get_variable(name='weights', shape=[100, 14],
                                  initializer=tf.random_uniform_initializer(-0.01, 0.01),
                                  dtype=tf.float32)

        bias = tf.get_variable(name='bias', shape=[14],
                               initializer=tf.random_uniform_initializer(-0.01, 0.01),
                               dtype=tf.float32)

        # logits computed from the output of the last time step
        logits = tf.matmul(time_major[-1], weights) + bias

        # sigmoid maps logits to probabilities; rounding thresholds them at 0.5
        pred = tf.round(tf.nn.sigmoid(logits))

        # element-wise comparison of predicted and true labels
        accuracy = tf.equal(pred, labels_o)

        # cross entropy for multi-label classification
        ce = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=labels_o)

        # calculating the loss
        loss = tf.reduce_mean(ce)

        # accuracy averaged over every label position in the batch
        accuracy1 = tf.reduce_mean(tf.cast(accuracy, tf.float32))

        # Adam optimizer; the default learning rate is 0.001
        train = tf.train.AdamOptimizer().minimize(loss)

        self.out = {'accuracy': accuracy1, 'pred': accuracy, 'prob': pred,
                    'loss': loss, 'train': train, 'logits': logits}

        self.test = {'pred': pred}



def execute_model(model):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for i in range(epoch):
            for j in range(iteration):

                # fetch each batch once so inputs and labels stay paired
                train_batch = data_preprocessing.get_train_data()
                datain = train_batch['input']
                labels = train_batch['labels']
                fina_out = sess.run(model.out,
                                    feed_dict={model.placeholder['input']: datain,
                                               model.placeholder['output']: labels})

                print('epoch', i, 'iteration', j, 'loss', fina_out['loss'], 'accuracy', fina_out['accuracy'])


        print("Now testing the model with test data..")

        for i in range(30):
            data_test = data_preprocessing.get_test_data()['input']
            labels = data_preprocessing.get_test_data()['labels']

            outputp = sess.run(model.test,
                               feed_dict={model.placeholder['input']: data_test})

            print(outputp['pred'], 'vs', labels)



if __name__ == '__main__':

    result=RNNLSTM()
    execute_model(result)

Even after 20 epochs, the model gives the same prediction for every test sample. I searched online, and someone suggested increasing the batch size when the outputs are all identical; I increased it from 50 to 100, but the results are still the same. I think I may have made a mistake somewhere, perhaps in the loss calculation. Please point out the error.

Output:

epoch 0 iteration 0 loss 0.6922738 accuracy 0.595
epoch 0 iteration 1 loss 0.69211155 accuracy 0.57928574
epoch 0 iteration 2 loss 0.6916339 accuracy 0.61071426
epoch 0 iteration 3 loss 0.6909899 accuracy 0.73
epoch 0 iteration 4 loss 0.69043064 accuracy 0.7171429
....
....
....

epoch 19 iteration 15 loss 0.4839307 accuracy 0.77428573
epoch 19 iteration 16 loss 0.49799272 accuracy 0.76857144
epoch 19 iteration 17 loss 0.49267265 accuracy 0.7714286
epoch 19 iteration 18 loss 0.5134562 accuracy 0.7614286
epoch 19 iteration 19 loss 0.5096274 accuracy 0.76857144
epoch 19 iteration 20 loss 0.48447722 accuracy 0.77

Predictions:

Predicted output                              vs      real output 
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 1 1 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 0 1 1 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 1 1 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 0 0 0 0 0 1 1 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 0 0 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 1 0 0 0 0 0 0 1 1 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 1 1 0 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 1 1 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 1]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 1]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 1 1 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 0 0 0 0 0 0 1 1 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 1 1 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 1 1 0 0 0 0 0 1 1 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 1 1 0 0 0 0 1]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 1 1 0 0 1 1 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 0 0 0 1 1 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 1 1 1 1 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 0 0 0 0 0 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 0 0 0 0 0 1 1 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 1 1 0 0 0 0 0 0 0 0 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 0 0 0 1 1 1 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [1 1 1 0 0 0 0 0 0 0 0 1 1 0]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.]] vs [0 0 0 0 0 0 0 1 1 0 0 0 0 0]

2 Answers:

Answer 0 (score: 0):

The whole network above contains only one LSTM cell (plus a dense layer).

In the bi-directional LSTM you defined, the same LSTM cell is shared between the two directions. You need to define separate forward and backward LSTM cells; they should not share weights.
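A minimal sketch of what that could look like, assuming the input_x placeholder from the question (the sizes are illustrative):

with tf.variable_scope('encoder'):
    # two distinct cell objects so each direction gets its own weights
    fw_cell = rnn.LSTMCell(num_units=100)
    bw_cell = rnn.LSTMCell(num_units=100)

    outputs, states = tf.nn.bidirectional_dynamic_rnn(
        fw_cell, bw_cell, inputs=input_x, dtype=tf.float32)

    # outputs is a (forward, backward) pair; concatenate along the feature axis
    combined = tf.concat(outputs, axis=2)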

You can inspect the graph variables to verify that the network was created correctly:

for v in tf.global_variables():
    print(v.name)
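If the forward and backward cells were created correctly, the printed list should contain two separate sets of LSTM kernel and bias variables, one per direction.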

Answer 1 (score: 0):

I had the same problem: during training, the RNN or LSTM produced identical outputs for different inputs.

My workaround was:

Change the activation function inside the RNN or LSTM cell to ReLU or another function, instead of the default tanh.
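A minimal sketch of that change, using the activation argument of LSTMCell:

# build the cell with ReLU instead of the default tanh activation
cell = rnn.LSTMCell(num_units=100, activation=tf.nn.relu)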