Pandas: aligning multiple dataframes with a TimeStamp index

Time: 2014-10-14 16:47:30

Tags: python pandas concatenation time-series

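This has been the bane of my life for the past few days. I have a number of Pandas DataFrames containing time series data with irregular frequencies, and I am trying to align them into a single dataframe.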

Below is some code with representative dataframes df1, df2 and df3 (I actually have n = 5, and would appreciate a solution that works for all n>2):

# df1, df2, df3 are given at the bottom
import pandas as pd
import datetime

# I can align df1 to df2 easily
df1aligned, df2aligned = df1.align(df2)
# And then concatenate into a single dataframe
combined_1_n_2 = pd.concat([df1aligned, df2aligned], axis=1)
# Since I don't know any better, I then try to align df3 to combined_1_n_2 manually:
combined_1_n_2.align(df3)
error: Reindexing only valid with uniquely valued Index objects

I think I know why this error occurs, so I get rid of the duplicate indices in combined_1_n_2 and try again:

combined_1_n_2 = combined_1_n_2.groupby(combined_1_n_2.index).first()
combined_1_n_2.align(df3) # But still get the same error
error: Reindexing only valid with uniquely valued Index objects

Why am I getting this error? Even if this worked, it is completely manual and ugly. How can I align >2 time series and combine them in a single dataframe?

Data:

1 answer:

Answer 0 (score: 6)

Your specific error is caused by duplicate column names in combined_1_n_2 (both columns are named 'price'). If you rename the columns, the second align will work.
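A minimal sketch of the rename-then-align route, assuming each frame has a unique index. The sample rows are borrowed from the join output shown in this answer rather than from the full original data, the names price_1, price_2, price_3 are only illustrative, and axis=0 is my own choice so that align touches the timestamps and leaves the columns alone:

```python
import pandas as pd

# Illustrative stand-ins for the asker's frames (a few rows only).
df1 = pd.DataFrame({"price": [62.125, 62.25]},
                   index=pd.to_datetime(["2008-06-01 06:03:59.614",
                                         "2008-06-01 06:03:59.692"]))
df2 = pd.DataFrame({"price": [241.0625, 241.5]},
                   index=pd.to_datetime(["2008-06-01 06:13:34.524",
                                         "2008-06-01 06:13:34.602"]))
df3 = pd.DataFrame({"price": [67.656, 67.875]},
                   index=pd.to_datetime(["2008-06-01 06:03:52.281",
                                         "2008-06-01 06:03:52.359"]))

# Give every frame its own column name so the combined frame never
# ends up with two columns that are both called 'price'.
df1 = df1.rename(columns={"price": "price_1"})
df2 = df2.rename(columns={"price": "price_2"})
df3 = df3.rename(columns={"price": "price_3"})

# axis=0 aligns on the row index (timestamps) only.
df1_aligned, df2_aligned = df1.align(df2, join="outer", axis=0)
combined_1_n_2 = pd.concat([df1_aligned, df2_aligned], axis=1)

# The second align now has unique column labels to work with.
combined_aligned, df3_aligned = combined_1_n_2.align(df3, join="outer", axis=0)
combined_all = pd.concat([combined_aligned, df3_aligned], axis=1)
print(combined_all)
```

This still repeats one align/concat pair per extra frame, so it is mostly useful for seeing why the rename matters.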

Another approach is to chain the join operator, which merges the frames on their index, as shown below.

In [23]: df1.join(df2, how='outer', rsuffix='_1').join(df3, how='outer', rsuffix='_2')
Out[23]: 
                              price   price_1  price_2
2008-06-01 06:03:52.281000      NaN       NaN  67.6560
2008-06-01 06:03:52.359000      NaN       NaN  67.8750
2008-06-01 06:03:59.614000  62.1250       NaN      NaN
2008-06-01 06:03:59.692000  62.2500       NaN      NaN
2008-06-01 06:13:34.524000      NaN  241.0625      NaN
2008-06-01 06:13:34.602000      NaN  241.5000      NaN
2008-06-01 06:13:34.848000      NaN       NaN  67.8125
2008-06-01 06:13:34.926000      NaN       NaN  67.7500
2008-06-01 06:15:05.321000      NaN       NaN  67.6875
2008-06-01 06:15:05.399000      NaN  241.3750      NaN
2008-06-01 06:15:05.399000      NaN  241.2500      NaN
2008-06-01 06:15:42.004000  62.2375       NaN      NaN
2008-06-01 06:15:42.082000      NaN  241.3750      NaN
2008-06-01 06:15:42.083000  61.9250       NaN      NaN
2008-06-01 06:17:01.654000  61.9125       NaN      NaN
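
For n > 2 frames the same chain can be written as a fold over a list. This is a rough sketch, assuming every frame has a single 'price' column; the helper name join_all and the price_0, price_1, ... naming scheme are hypothetical, and the sample rows are borrowed from the output above (including the repeated 06:15:05.399 timestamp) rather than from the full original data:

```python
from functools import reduce

import pandas as pd

# Illustrative stand-ins; the second frame repeats one timestamp on purpose.
frames = [
    pd.DataFrame({"price": [62.125, 62.25, 62.2375]},
                 index=pd.to_datetime(["2008-06-01 06:03:59.614",
                                       "2008-06-01 06:03:59.692",
                                       "2008-06-01 06:15:42.004"])),
    pd.DataFrame({"price": [241.0625, 241.375, 241.25]},
                 index=pd.to_datetime(["2008-06-01 06:13:34.524",
                                       "2008-06-01 06:15:05.399",
                                       "2008-06-01 06:15:05.399"])),
    pd.DataFrame({"price": [67.656, 67.875]},
                 index=pd.to_datetime(["2008-06-01 06:03:52.281",
                                       "2008-06-01 06:03:52.359"])),
]


def join_all(frames):
    """Outer-join any number of single-column frames on their index."""
    # Rename up front (price_0, price_1, ...) so no rsuffix is needed.
    renamed = [f.rename(columns={"price": f"price_{i}"})
               for i, f in enumerate(frames)]
    joined = reduce(lambda left, right: left.join(right, how="outer"), renamed)
    # Sort explicitly rather than relying on the join's output order.
    return joined.sort_index()


combined = join_all(frames)
print(combined)
```

Because the join is an outer join, rows with a repeated timestamp are kept rather than collapsed; drop or aggregate them beforehand if one row per timestamp is required.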