python LSTM数组

时间:2018-01-02 23:27:19

标签: python tensorflow neural-network anaconda lstm

此代码的目的是预测一般货币的外汇市场的未来价值。

首先,我建立了一个统一的数据框架,这个统一的数据框架是最广泛交易货币的11个外汇市场数据集和一组900个经济指标的组合。

在统一数据框中将这911个数据集合在一起,经过测试后没有出现任何问题,我开始使用LSTM神经网络,我只用一个数据集进行测试,效果很好。

当我将统一数据框与LSTM神经网络结合起来时,问题就开始了。

以下是代码:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.models import Sequential
import os

os.chdir("E:\Business\Stocks")
path = os.listdir("E:\Business\Stocks")
for file in path:
    name, ext = os.path.splitext(str(file))
    column_names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
    df1 = pd.read_csv(file, names=column_names, parse_dates={'DateTime': ['Date', 'Time']}, index_col=[0])
    df1 = df1.rename(columns={'Open': name + ' ' + 'Open', 'High': name + ' ' + 'High',
                              'Low': name + ' ' + 'Low', 'Close': name + ' ' + 'Close',
                              'Volume': name + ' ' + 'Volume'})


os.chdir("E:\Business\Economic Indicators")
path = os.listdir("E:\Business\Economic Indicators")
for file in path:
    df2 = pd.read_csv(file, index_col=[0], parse_dates=[0])
    name, ext1 = os.path.splitext(file)
    df2 = df2.rename(columns={'Actual': name + ' ' + 'Actual', 'Consensus': name + ' ' + 'Consensus',
                              'Previous': name + ' ' + 'Previous', 'Revised': name + ' ' + 'Revised'})


dfs = [df1 ,df2]
df = pd.concat(dfs, axis=1, join='inner').sort_index(ascending=False)
df.fillna(method='ffill', inplace=True)

sequence_length = 120
n_features = len(df.columns)
val_ratio = 0.1
n_epochs = 3000
batch_size = 500

data = df.as_matrix()
data_processed = []
for index in range(len(data) - sequence_length):
    data_processed.append(data[index: index + sequence_length])
data_processed = np.array(data_processed)

val_split = round((1 - val_ratio) * data_processed.shape[0])
train = data_processed[: int(val_split), :]
val = data_processed[int(val_split):, :]

print('Training data: {}'.format(train.shape))
print('Validation data: {}'.format(val.shape))

train_samples, train_nx, train_ny = train.shape
val_samples, val_nx, val_ny = val.shape

train = train.reshape((train_samples, train_nx * train_ny))
val = val.reshape((val_samples, val_nx * val_ny))

preprocessor = MinMaxScaler().fit(train)
train = preprocessor.transform(train)
val = preprocessor.transform(val)

train = train.reshape((train_samples, train_nx, train_ny))
val = val.reshape((val_samples, val_nx, val_ny))

X_train = train[:, : -1]
y_train = train[:, -1][:, -1]
X_val = val[:, : -1]
y_val = val[:, -1][:, -1]

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], n_features))
X_val = np.reshape(X_val, (X_val.shape[0], X_val.shape[1], n_features))

model = Sequential()
model.add(LSTM(input_shape=(X_train.shape[1:]), units=100, return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(100, return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(units=1))
model.add(Activation("relu"))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse', 'accuracy'])

history = model.fit(
    X_train,
    y_train,
    batch_size=batch_size,
    epochs=n_epochs,
    verbose=2)

preds_val = model.predict(X_val)
diff = []
for i in range(len(y_val)):
    pred = preds_val[i][0]
    diff.append(y_val[i] - pred)

real_min = preprocessor.data_min_[104]
real_max = preprocessor.data_max_[104]
print(preprocessor.data_min_[:1])
print(preprocessor.data_max_[:1])

preds_real = preds_val * (real_max - real_min) + real_min
y_val_real = y_val * (real_max - real_min) + real_min

plt.plot(preds_real, label='Predictions')
plt.plot(y_val_real, label='Actual values')
plt.xlabel('test')
plt.legend(loc=0)
plt.show()
print(model.summary())

这是错误:

使用TensorFlow后端。

追踪(最近一次呼叫最后一次):

文件“E:/Tutorial/new.py”,第47行,在train = data_processed [:int(val_split),:]

IndexError:数组索引太多

1 个答案:

答案 0 :(得分:0)

您没有正确切片data_processed。我认为您在int(val_split):之间有多余的空间。我不确定你的意图是什么,但this answer应涵盖可能的情景。