Out-of-sample forecasting/prediction for a univariate time series in Keras

Time: 2018-10-16 10:10:50

Tags: python neural-network keras forecasting

I am following a Kaggle guide for time series forecasting (sample data is attached below).

The code is as follows:

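import math

import numpy as np
from keras.layers import LSTM, Dense
from keras.models import Sequential
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler

def create_dataset(dataset, window_size=1):
    # Pair each window of `window_size` values with the value that follows it.
    data_X, data_Y = [], []
    for i in range(len(dataset) - window_size - 1):
        a = dataset[i:(i + window_size), 0]
        data_X.append(a)
        data_Y.append(dataset[i + window_size, 0])
    return np.array(data_X), np.array(data_Y)

def fit_model(train_X, train_Y, window_size=1):
    # Single LSTM layer with 4 units followed by a one-unit dense output.
    model = Sequential()
    model.add(LSTM(4, input_shape=(1, window_size)))
    model.add(Dense(1))
    model.compile(loss="mean_squared_error", optimizer="adam")
    model.fit(train_X, train_Y, epochs=100, batch_size=1, verbose=0)
    return model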

def predict_and_score(model, scaler, X, Y):
    # Make predictions on the original scale of the data. The scaler must be
    # the one already fitted on the series; a fresh MinMaxScaler cannot invert.
    pred = scaler.inverse_transform(model.predict(X))
    # Prepare Y data to also be on the original scale for interpretability.
    orig_data = scaler.inverse_transform(Y.reshape(-1, 1))
    # Calculate RMSE.
    score = math.sqrt(mean_squared_error(orig_data[:, 0], pred[:, 0]))
    return score, pred

All of this is used in the following function:

def nnet(time_series, cmi_split, window_size=1):
    # cmi_split is the fraction of the series used for training.
    cmi_total_raw = np.vstack(time_series.values.astype('float32'))
    scaler = MinMaxScaler(feature_range=(0, 1))
    cmi_total_scaled = scaler.fit_transform(cmi_total_raw)
    split_idx = int(cmi_split * len(cmi_total_scaled))
    cmi_train_sc = cmi_total_scaled[:split_idx]
    cmi_test_sc = cmi_total_scaled[split_idx:]

    # Create test and training sets for one-step-ahead regression.
    train_X, train_Y = create_dataset(cmi_train_sc, window_size)
    test_X, test_Y = create_dataset(cmi_test_sc, window_size)

    # Reshape the input data into appropriate form for Keras.
    train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
    test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))

    model = fit_model(train_X, train_Y, window_size)

    # Both scores are RMSE values on the original scale.
    rmse_train, train_predict = predict_and_score(model, scaler, train_X, train_Y)
    rmse_test, test_predict = predict_and_score(model, scaler, test_X, test_Y)
    return rmse_test, test_predict

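As I understand it, this builds a model from the training data, makes predictions on the in-sample test set, and finally computes the error.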
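To make that concrete, here is a small sketch of the windowing step on toy data (the values 0 to 5 are made up for illustration, and `make_windows` is just a local copy of the pairing logic in `create_dataset`). Every target it produces is a value that already exists in the series, which is why the scores above are computed on observed rows only. As written, the trailing `- 1` in the range also means the last value never appears as a target.

```python
import numpy as np

def make_windows(dataset, window_size=1):
    # Same pairing as create_dataset: a window of values -> the value after it.
    X, Y = [], []
    for i in range(len(dataset) - window_size - 1):
        X.append(dataset[i:i + window_size, 0])
        Y.append(dataset[i + window_size, 0])
    return np.array(X), np.array(Y)

# Toy series (illustrative only): 0, 1, 2, 3, 4, 5 as a single column.
series = np.arange(6, dtype="float32").reshape(-1, 1)

X, Y = make_windows(series, window_size=2)
# Keras LSTM input layout used in the question: (samples, 1, window_size).
X = X.reshape(X.shape[0], 1, X.shape[1])

print(X.shape)  # (3, 1, 2)
print(Y)        # targets are 2, 3, 4; the final value 5 is never a target
```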

The input data has 209 rows, and I want to forecast the next row (row 210).
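One possible way to frame "predict row 210" with a one-step-ahead model like the one above: the input would be the last `window_size` scaled observations, shaped `(1, 1, window_size)` to match `input_shape = (1, window_size)`. The sketch below only builds that input and maps a scaled output back to the original scale. It rests on several assumptions: the 209-row series is synthetic, the min-max scaling is done by hand to mirror `MinMaxScaler(feature_range = (0,1))`, and `naive_predict` is a hypothetical stand-in for `model.predict` so the snippet runs without a trained network.

```python
import numpy as np

def last_window(scaled_series, window_size=1):
    """Return the final window shaped (samples=1, timesteps=1, features=window_size)."""
    window = np.asarray(scaled_series, dtype="float32")[-window_size:, 0]
    return window.reshape(1, 1, window_size)

def naive_predict(x):
    # Hypothetical stand-in for model.predict: repeats the last value in the window.
    return x[:, 0, -1:].copy()

# Synthetic 209-row series standing in for the real data (an assumption).
raw = np.linspace(270.0, 700.0, 209, dtype="float32").reshape(-1, 1)

# Manual min-max scaling to [0, 1], the same mapping MinMaxScaler would learn.
lo, hi = raw.min(), raw.max()
scaled = (raw - lo) / (hi - lo)

x_next = last_window(scaled, window_size=1)   # input built from row 209 only
next_scaled = naive_predict(x_next)           # scaled forecast, shape (1, 1)
next_value = next_scaled * (hi - lo) + lo     # back on the original scale

print(x_next.shape, next_value.shape)
```

With a fitted network, `naive_predict(x_next)` would presumably be replaced by `model.predict(x_next)`, and the manual rescaling by the fitted scaler's `inverse_transform`.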

Here is what I have tried:

Since the same thing is done in Auto-ARIMA with the forecast(steps=n_steps) method, I looked for something similar in Keras.

From the Keras documentation:

predict(x, batch_size=None, verbose=0, steps=None)

Arguments:

x: The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs).

steps: Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None.

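I tried changing steps, and it predicted absurd values of around 100,000. Also, the length of test_predict is nowhere near the value I passed for steps. So I assume steps means something else here.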

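Questions:
- Can Keras even be used to forecast time series data (out of sample)?
- If yes, is there a forecast method similar to the predict method above?
- If no, can the existing predict method be used in some way to get out-of-sample predictions?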
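For reference, one common way to get several out-of-sample steps from a one-step-ahead model is recursive forecasting: predict one step, append the prediction to the window, and repeat. The sketch below is only an illustration of that idea, not a Keras API. `recursive_forecast` and `toy_predict` are hypothetical names, `toy_predict` stands in for a fitted `model.predict`, and the history is assumed to be already scaled.

```python
import numpy as np

def recursive_forecast(predict_fn, history, n_steps, window_size=1):
    """Forecast n_steps ahead by feeding each prediction back in as input.

    predict_fn: callable taking an array of shape (1, 1, window_size) and
                returning an array of shape (1, 1), e.g. a fitted model.predict.
    history:    1-D sequence of already-scaled observations.
    """
    window = [float(v) for v in history[-window_size:]]
    forecasts = []
    for _ in range(n_steps):
        x = np.array(window, dtype="float32").reshape(1, 1, window_size)
        y_hat = float(np.asarray(predict_fn(x)).reshape(-1)[0])
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # slide the window forward by one step
    return np.array(forecasts)

def toy_predict(x):
    # Hypothetical stand-in predictor: mean of the window plus a constant drift.
    return x.mean(axis=2) + 0.1

out = recursive_forecast(toy_predict, [0.2, 0.4, 0.6], n_steps=3, window_size=2)
print(out)  # approximately [0.6, 0.7, 0.75]
```

Because each step consumes earlier predictions, errors can compound as the horizon grows, so the later values in such a forecast are usually less reliable than the first.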

Sample data (cmi_total):

2014-05-25    272.459887
2014-06-01    272.446022
2014-06-08    330.301260
2014-06-15    656.838394
2014-06-22    670.575110

0 answers:

No answers