How do I use the TensorFlow sequence_numeric_column with RNNClassifier?

Date: 2019-04-05 14:12:53

Tags: tensorflow recurrent-neural-network

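I am looking through the TensorFlow contrib API and would like to use the RNNClassifier available in TensorFlow 1.13. Unlike the non-sequence estimators, this one requires only sequence feature columns. However, I cannot get it to run on a toy dataset: whenever I use sequence_numeric_column, I get an error.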

Here is the structure of my toy dataset:

idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
0,2,elve,153.9
0,2,elve,218.49999999999997
0,2,elve,210.9
1,0,dwarf,166.6
1,0,dwarf,168.0
1,0,dwarf,131.6
1,1,human,150.5
1,1,human,208.25
1,1,human,210.0
1,2,elve,199.5
1,2,elve,161.5
1,2,elve,197.6

Here idSeq tells us which rows belong to which sequence. I want to predict the "kind" column from the "size" column.
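To make the intended grouping concrete, here is a minimal sketch in plain Python that collects the `size` values into one list per sequence. It assumes that a sequence is identified by the pair (`idSeq`, `kind`), which is one reading of the data above and is not stated explicitly. The helper name `group_sequences` and the shortened `CSV_TEXT` sample are hypothetical and only for illustration.

```python
import csv
import io
from collections import OrderedDict

# A few rows copied from the toy dataset above (shortened for illustration).
CSV_TEXT = """idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
1,2,elve,199.5
1,2,elve,161.5
1,2,elve,197.6
"""


def group_sequences(csv_text):
    """Group `size` values by (idSeq, kind); hypothetical helper.

    Assumption: each (idSeq, kind) pair identifies one sequence.
    Returns an ordered mapping {(idSeq, kind): [size, ...]}.
    """
    sequences = OrderedDict()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["idSeq"]), int(row["kind"]))
        sequences.setdefault(key, []).append(float(row["size"]))
    return sequences


sequences = group_sequences(CSV_TEXT)
for (id_seq, kind), sizes in sequences.items():
    print(id_seq, kind, sizes)
```

Under this assumption each sequence has a single `kind` value, so there would be one label per sequence rather than one per CSV row.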

Below is the code that trains the RNN on this dataset.

import os

import numpy as np
import pandas as pd
import tensorflow as tf


os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.INFO)

dataframe = pd.read_csv("data_rnn.csv")
dataframe_test = pd.read_csv("data_rnn_test.csv")


train_x = dataframe
train_y = dataframe.loc[:,(["kind"])]


size_feature_col = tf.contrib.feature_column.sequence_numeric_column('size')


estimator = tf.contrib.estimator.RNNClassifier(
    sequence_feature_columns=[size_feature_col],
    num_units=[32, 16],
    cell_type='lstm',
    model_dir=None,
    n_classes=3,
    optimizer='Adagrad'
)



def make_dataset(
    batch_size, 
    x, 
    y=None, 
    shuffle=False, 
    shuffle_buffer_size=1000,
    shuffle_seed=1):
    """
    An input function for training, evaluation or prediction.

    Parameters
    ----------------------
    batch_size: integer
        the size of the batch to use for the training of the neural network
    x: pandas dataframe 
        dataframe that contains the features of the samples to study
    y: pandas dataframe or array (Default: None)
        dataframe or array that contains the values to predict of the samples
        to study. If none, we want a dataset for evaluation or prediction.
    shuffle: boolean (Default: False)
        if True, we shuffle the elements of the dataset
    shuffle_buffer_size: integer (Default: 1000)
        if we shuffle the elements of the dataset, it is the size of the buffer
        used for it.
    shuffle_seed : integer
        the random seed for the shuffling

    Returns
    ---------------------
    input_fn: callable
        a function that, when called, returns the next element of the
        dataset to study as a nested structure of tf.Tensors
    """

    def input_fn():
        if y is not None:
            dataset = tf.data.Dataset.from_tensor_slices((dict(x), y))
        else:
            dataset = tf.data.Dataset.from_tensor_slices(dict(x))
        if shuffle:
            dataset = dataset.shuffle(
                buffer_size=shuffle_buffer_size,
                seed=shuffle_seed).batch(batch_size).repeat()
        else:
            dataset = dataset.batch(batch_size)
        return dataset.make_one_shot_iterator().get_next()

    return input_fn



batch_size = 50
random_seed = 1


input_fn_train = make_dataset(
            batch_size=batch_size, 
            x=train_x, 
            y=train_y, 
            shuffle=True, 
            shuffle_buffer_size=len(train_x),
            shuffle_seed=random_seed)

estimator.train(input_fn=input_fn_train, steps=5000)

But all I get is the following error:

INFO:tensorflow:Calling model_fn.
Traceback (most recent call last):
  File "main.py", line 125, in <module>
    estimator.train(input_fn=input_fn_train, steps=5000)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 512, in _model_fn
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 332, in _rnn_model_fn
    logits, sequence_length_mask = logit_fn(features=features, mode=mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 226, in rnn_logit_fn
    features=features, feature_columns=sequence_feature_columns)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 120, in sequence_input_layer
    trainable=trainable)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 496, in _get_sequence_dense_tensor
    sp_tensor, default_value=self.default_value)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 1432, in sparse_tensor_to_dense
    sp_input = _convert_to_sparse_tensor(sp_input)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 68, in _convert_to_sparse_tensor
    raise TypeError("Input must be a SparseTensor.")
TypeError: Input must be a SparseTensor.

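So I do not understand what I am doing wrong, because according to the documentation we have to give the RNNEstimator sequence columns. It says nothing about the tensors having to be sparse.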
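The traceback ends in `sparse_tensor_to_dense`, which suggests that `sequence_numeric_column` expects the `size` feature as a `SparseTensor` with one row per sequence, whereas `from_tensor_slices(dict(x))` yields one dense scalar per CSV row. The sketch below shows, under that assumption, how variable-length sequences could be laid out as the `indices`, `values`, and `dense_shape` triple that a `SparseTensor` is built from. It is written in plain Python so it runs without TensorFlow; the helper `to_sparse_components` and the `example` list are hypothetical.

```python
def to_sparse_components(sequences):
    """Lay out variable-length sequences in COO form; hypothetical helper.

    sequences: list of lists of floats, one inner list per sequence.
    Returns (indices, values, dense_shape), where each entry of
    indices is [row, step] and dense_shape is [num_sequences, max_length].
    """
    indices, values = [], []
    for row, seq in enumerate(sequences):
        for step, value in enumerate(seq):
            indices.append([row, step])
            values.append(value)
    max_len = max((len(seq) for seq in sequences), default=0)
    dense_shape = [len(sequences), max_len]
    return indices, values, dense_shape


# Two sequences taken from the toy dataset: one of length 3, one of length 2.
example = [[117.6, 134.4, 119.0], [168.0, 145.25]]
indices, values, dense_shape = to_sparse_components(example)
print(indices)
print(values)
print(dense_shape)
```

In TensorFlow 1.x these three pieces could be passed to `tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape)` and returned from the input function under the `'size'` key, together with one label per sequence. Whether this alone is enough for `RNNClassifier` to train has not been verified here.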

Thanks in advance for your help and advice.

0 answers:

No answers yet