TensorFlow - Using datetimes

Time: 2017-11-21 08:27:56

Tags: tensorflow

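I'm new to TensorFlow, and I'm starting with some time series forecasting examples.

I want to import actual datetimes instead of the sequence numbers used in the code below. How can I do that? Thanks.

Code: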

import tensorflow as tf

csv_file_name = './data/sales.csv'
reader = tf.contrib.timeseries.CSVReader(csv_file_name)
train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(reader, batch_size=16, window_size=42)
with tf.Session() as sess:
    data = reader.read_full()
    coord = tf.train.Coordinator()
    tf.train.start_queue_runners(sess=sess, coord=coord)
    data = sess.run(data)
    coord.request_stop()

ar = tf.contrib.timeseries.ARRegressor(
    periodicities=100, input_window_size=35, output_window_size=7,
    num_features=1,
    loss=tf.contrib.timeseries.ARModel.NORMAL_LIKELIHOOD_LOSS)

ar.train(input_fn=train_input_fn, steps=6000)

evaluation_input_fn = tf.contrib.timeseries.WholeDatasetInputFn(reader)
evaluation = ar.evaluate(input_fn=evaluation_input_fn, steps=1)

(predictions,) = tuple(ar.predict(
    input_fn=tf.contrib.timeseries.predict_continuation_input_fn(
        evaluation, steps=100)))

sales.csv

1,12223696.5
2,14098603
3,10515241
4,6328012
5,7200172
6,7864498
7,8036747.5
8,7537712.5
9,15359748.5
10,10074294.5

If I try to import datetimes, I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Field 0 in record 0 is not a valid int64: 2017-01-01
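
The message indicates that the first CSV field is being parsed as an int64 time index, so a string such as 2017-01-01 is rejected. One possible workaround is to preprocess the file so that each date becomes an integer. The following is a minimal sketch using only the standard library; the helper name dates_to_day_index, the sample dates, and the choice of "days since the first date" are assumptions for illustration, not part of the original code.

```python
import csv
import io
from datetime import date


def dates_to_day_index(rows):
    """Convert (ISO date string, value) rows into (int day offset, value) rows.

    Hypothetical helper: day 0 is the first date found in the input.
    """
    rows = list(rows)
    if not rows:
        return []
    first = date.fromisoformat(rows[0][0])
    return [((date.fromisoformat(d) - first).days, float(v)) for d, v in rows]


# Illustrative input shaped like sales.csv, but with dates in the first column.
sample = "2017-01-01,12223696.5\n2017-01-02,14098603\n2017-01-04,10515241\n"
converted = dates_to_day_index(csv.reader(io.StringIO(sample)))
print(converted)
```

Note that a missing day (2017-01-03 in this sample) shows up as a gap in the integer index rather than being silently renumbered.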

1 answer:

Answer 0 (score: 1)

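According to the source code, RandomWindowInputFn accepts either a CSVReader or a NumpyReader. So you can use pandas to read the CSV, parse the dates, and then feed the converted dates to a NumpyReader.

My time series data looks like this: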

   timestamp            value
0  2014-02-14 14:30:00  0.132
1  2014-02-14 14:35:00  0.134
2  2014-02-14 14:40:00  0.134
3  2014-02-14 14:45:00  0.134
4  2014-02-14 14:50:00  0.134

First, I use pandas to parse the timestamp column into an int column:

import pandas as pd

# Read the CSV and parse the timestamp column into datetime64 values.
data = pd.read_csv("my_data.csv", header=0, parse_dates=['timestamp'])

# Convert to integer Unix epoch seconds, treating the timestamps as UTC.
# (strftime("%s") is avoided: it is platform-dependent and uses local time.)
data['timestamp'] = (
    (data['timestamp'] - pd.Timestamp("1970-01-01")) // pd.Timedelta(seconds=1)
)
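
To check a conversion like this independently of pandas, the same mapping can be sketched with the standard library. This assumes the timestamps are meant as UTC; the helper name to_epoch_seconds is hypothetical. For the 5-minute samples shown earlier, consecutive integer times should differ by 300 seconds.

```python
from datetime import datetime, timezone


def to_epoch_seconds(timestamp_str, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string as UTC and return integer Unix seconds."""
    parsed = datetime.strptime(timestamp_str, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# First three timestamps from the sample data above.
stamps = ["2014-02-14 14:30:00", "2014-02-14 14:35:00", "2014-02-14 14:40:00"]
epochs = [to_epoch_seconds(s) for s in stamps]

# Spacing between consecutive samples, in seconds.
steps = [b - a for a, b in zip(epochs, epochs[1:])]
print(epochs)
print(steps)
```

Attaching an explicit time zone keeps the result the same on every machine, whereas a naive local-time conversion can shift with the system's time zone settings.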

Then we can pass these arrays to NumpyReader:

np_reader = tf.contrib.timeseries.NumpyReader(data={
    tf.contrib.timeseries.TrainEvalFeatures.TIMES: data['timestamp'].values,
    tf.contrib.timeseries.TrainEvalFeatures.VALUES: data['value'].values,
})

Finally, pass np_reader to RandomWindowInputFn:

train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
      np_reader, batch_size=32, window_size=16)
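
Since the model now sees integer times, any times it reports back would presumably be integers as well, so they need to be mapped back to datetimes for display. A minimal sketch, assuming the integers are Unix epoch seconds in UTC; the helper name epoch_to_datetimes and the sample values in predicted_times are hypothetical stand-ins, not real model output.

```python
from datetime import datetime, timezone


def epoch_to_datetimes(epoch_seconds):
    """Map integer Unix seconds back to timezone-aware UTC datetimes."""
    return [datetime.fromtimestamp(int(t), tz=timezone.utc) for t in epoch_seconds]


# Hypothetical integer times standing in for values returned by a model.
predicted_times = [1392388200, 1392388500, 1392388800]
labels = [d.strftime("%Y-%m-%d %H:%M:%S") for d in epoch_to_datetimes(predicted_times)]
print(labels)
```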

Hope this helps someone!