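I have been looking through the TensorFlow contrib API, and I would like to use the RNNClassifier available in TensorFlow 1.13. Unlike the non-sequence estimators, this one accepts only sequence feature columns. However, I cannot get it to run on a toy dataset: I keep getting an error when I use sequence_numeric_column.
Here is the structure of my toy dataset: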
idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
0,2,elve,153.9
0,2,elve,218.49999999999997
0,2,elve,210.9
1,0,dwarf,166.6
1,0,dwarf,168.0
1,0,dwarf,131.6
1,1,human,150.5
1,1,human,208.25
1,1,human,210.0
1,2,elve,199.5
1,2,elve,161.5
1,2,elve,197.6
Here idSeq tells us which rows belong to which sequence. I want to predict the "kind" column from the "size" column.
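To make the sequence structure concrete, here is a rough sketch of how the rows could be grouped into sequences. It uses plain Python on a few rows copied from the sample above, so it runs without the CSV file. The helper name `group_sequences` is hypothetical, and keying a sequence by the (idSeq, kind) pair is an assumption, made because a single idSeq value spans several kinds in the sample.

```python
from collections import OrderedDict

# A few rows copied from the toy dataset: (idSeq, kind, label, size).
rows = [
    (0, 0, "dwarf", 117.6),
    (0, 0, "dwarf", 134.4),
    (0, 0, "dwarf", 119.0),
    (0, 1, "human", 168.0),
    (0, 1, "human", 145.25),
    (1, 2, "elve", 199.5),
    (1, 2, "elve", 161.5),
    (1, 2, "elve", 197.6),
]


def group_sequences(rows):
    """Collect the sizes of each sequence, keyed by (idSeq, kind).

    Assumption: a sequence is identified by the (idSeq, kind) pair,
    because one idSeq value spans several kinds in the sample data.
    """
    sequences = OrderedDict()
    for id_seq, kind, _label, size in rows:
        sequences.setdefault((id_seq, kind), []).append(size)
    return sequences


sequences = group_sequences(rows)
for key, sizes in sequences.items():
    print(key, sizes)
```

Note that the resulting sequences have different lengths (three sizes for some, two for others), which matters for how they are fed to a sequence feature column.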
Below is the code I use to train the RNN on the dataset.
import os

import numpy as np
import pandas as pd
import tensorflow as tf

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.INFO)

dataframe = pd.read_csv("data_rnn.csv")
dataframe_test = pd.read_csv("data_rnn_test.csv")

train_x = dataframe
train_y = dataframe.loc[:,(["kind"])]

# Sequence feature column built on the "size" column of the CSV.
size_feature_col = tf.contrib.feature_column.sequence_numeric_column('size')

estimator = tf.contrib.estimator.RNNClassifier(
    sequence_feature_columns=[size_feature_col],
    num_units=[32, 16],
    cell_type='lstm',
    model_dir=None,
    n_classes=3,
    optimizer='Adagrad'
)

def make_dataset(
        batch_size,
        x,
        y=None,
        shuffle=False,
        shuffle_buffer_size=1000,
        shuffle_seed=1):
    """
    An input function for training, evaluation or prediction.

    Parameters
    ----------------------
    batch_size: integer
        the size of the batch to use for the training of the neural network
    x: pandas dataframe
        dataframe that contains the features of the samples to study
    y: pandas dataframe or array (Default: None)
        dataframe or array that contains the values to predict of the samples
        to study. If none, we want a dataset for evaluation or prediction.
    shuffle: boolean (Default: False)
        if True, we shuffle the elements of the dataset
    shuffle_buffer_size: integer (Default: 1000)
        if we shuffle the elements of the dataset, it is the size of the buffer
        used for it.
    shuffle_seed : integer
        the random seed for the shuffling

    Returns
    ---------------------
    dataset.make_one_shot_iterator().get_next(): Tensor
        a nested structure of tf.Tensors containing the next element of the
        dataset to study
    """
    def input_fn():
        if y is not None:
            dataset = tf.data.Dataset.from_tensor_slices((dict(x), y))
        else:
            dataset = tf.data.Dataset.from_tensor_slices(dict(x))
        if shuffle:
            dataset = dataset.shuffle(
                buffer_size=shuffle_buffer_size,
                seed=shuffle_seed).batch(batch_size).repeat()
        else:
            dataset = dataset.batch(batch_size)
        return dataset.make_one_shot_iterator().get_next()
    return input_fn

batch_size = 50
random_seed = 1
input_fn_train = make_dataset(
    batch_size=batch_size,
    x=train_x,
    y=train_y,
    shuffle=True,
    shuffle_buffer_size=len(train_x),
    shuffle_seed=random_seed)
estimator.train(input_fn=input_fn_train, steps=5000)
But all I get is the following error:
INFO:tensorflow:Calling model_fn.
Traceback (most recent call last):
  File "main.py", line 125, in <module>
    estimator.train(input_fn=input_fn_train, steps=5000)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 512, in _model_fn
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 332, in _rnn_model_fn
    logits, sequence_length_mask = logit_fn(features=features, mode=mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 226, in rnn_logit_fn
    features=features, feature_columns=sequence_feature_columns)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 120, in sequence_input_layer
    trainable=trainable)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 496, in _get_sequence_dense_tensor
    sp_tensor, default_value=self.default_value)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 1432, in sparse_tensor_to_dense
    sp_input = _convert_to_sparse_tensor(sp_input)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 68, in _convert_to_sparse_tensor
    raise TypeError("Input must be a SparseTensor.")
TypeError: Input must be a SparseTensor.
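So I don't understand what I am doing wrong. According to the documentation, we have to give the RNNEstimator sequence feature columns, but it says nothing about the input having to be a sparse tensor.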
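For reference, the traceback shows the sequence column calling sparse_tensor_to_dense on the "size" feature, so the input apparently has to arrive as a SparseTensor rather than as the dense per-row values that from_tensor_slices(dict(x)) produces. Below is a minimal sketch, in plain Python, of the three components a sparse representation of variable-length sequences needs. The helper name `to_sparse_components` is hypothetical, and wiring the result into the input function is an untested assumption.

```python
def to_sparse_components(sequences):
    """Return (indices, values, dense_shape) for variable-length sequences.

    Each value gets an index [sequence_number, time_step]; dense_shape is
    [number_of_sequences, longest_sequence_length].
    """
    indices, values = [], []
    for row, seq in enumerate(sequences):
        for step, value in enumerate(seq):
            indices.append([row, step])
            values.append(value)
    max_len = max((len(seq) for seq in sequences), default=0)
    dense_shape = [len(sequences), max_len]
    return indices, values, dense_shape


# Two sequences of different lengths, taken from the toy dataset.
indices, values, dense_shape = to_sparse_components(
    [[117.6, 134.4, 119.0], [168.0, 145.25]])
print(indices)
print(values)
print(dense_shape)

# Assumption: with TensorFlow 1.x these would be wrapped as
#   tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape)
# and returned under the 'size' key of the features dict.
```

The missing position [1, 2] is simply absent from `indices`, which is how the shorter sequence is distinguished from one padded with real values.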
Thanks in advance for your help and advice.