Question

I am working with time series data in Keras, using TensorFlow as the backend.

I have a problem with the input to my neural network:

X=pd.concat([X_prices,X_os,X_months,X_wd,X_stock],axis=1)

If I run:

print(X_prices.shape,X_os.shape,X_wd.shape,X_months.shape,X_stock.shape)
print(X.shape)

I get:

((729, 10), (729, 1), (729, 6), (729, 11), (729, 10))
((729,38))

Unfortunately, when I append the lagged time series:

X=pd.concat([X_prices,X_os,X_months,X_wd,X_stock,X_lag1],axis=1)
print(X_lag1.shape)
print(X.shape)

I get:

((729,10))
((1458,48))

Basically, the number of rows has doubled.

I don't know what I am missing.

Thanks for your help.

Answer 1

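It is hard to say for certain without a closer look at the data.

But if I had to guess, I would say the problem is the index of your DataFrames. Here is a small example of what I mean: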

import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_2 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_3 = pd.DataFrame(np.random.rand(5), index=np.arange(5)+5)

If we concatenate the first two (same index values), the rows line up and we still get 5 rows:

pd.concat([df_1,df_2],axis=1)
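As a quick check of this case, here is a minimal self-contained sketch. The names `left`, `right` and `aligned` are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Two frames with identical index labels 0..4
left = pd.DataFrame(np.random.rand(5), index=np.arange(5))
right = pd.DataFrame(np.random.rand(5), index=np.arange(5))

# Rows are matched by index label, so every row finds a partner
aligned = pd.concat([left, right], axis=1)
print(aligned.shape)               # (5, 2)
print(aligned.isna().sum().sum())  # 0, no missing values introduced
```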

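Now, if we concatenate the first with the last (different index values), the result contains the union of both indices, so the row count doubles and the gaps are filled with NaN: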

pd.concat([df_1,df_3],axis=1)
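The sketch below reproduces the doubling and shows one possible fix. It assumes the rows of both frames are already in the right order and only the index labels differ; if that is not true for your data, resetting the index would pair the wrong rows. The names `a`, `b`, `misaligned` and `fixed` are illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.random.rand(5), index=np.arange(5))      # labels 0..4
b = pd.DataFrame(np.random.rand(5), index=np.arange(5) + 5)  # labels 5..9

# Different labels: concat takes the union of both indices (outer join),
# so the row count doubles and unmatched cells become NaN.
misaligned = pd.concat([a, b], axis=1)
print(misaligned.shape)  # (10, 2)

# Quick diagnostic: do the two frames share the same index?
print(a.index.equals(b.index))  # False

# Possible fix (assumes row order is already correct):
# drop the old labels so rows are matched by position.
fixed = pd.concat(
    [a.reset_index(drop=True), b.reset_index(drop=True)], axis=1
)
print(fixed.shape)  # (5, 2)
```

Applied to the question, this would mean comparing `X_prices.index` with `X_lag1.index` before concatenating.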

Hope that helps!

pandas.concat with lagged input
