Pandas group by before outer join

Time: 2018-09-18 17:01:17

Tags: pandas join group-by

I have two tables in the following format:

Table1: key = Date, Index

    Date        Index  Value1
0   2015-01-01  A      -1.292040
1   2015-04-01  A      0.535893
2   2015-02-01  B      -1.779029
3   2015-06-01  B      1.129317

Table2: key = Date

    Date        Value2
0   2015-01-01  2.637761
1   2015-02-01  -0.496927
2   2015-03-01  0.226914
3   2015-04-01  -2.010917
4   2015-05-01  -1.095533
5   2015-06-01  0.651244
6   2015-07-01  0.036592
7   2015-08-01  0.509352
8   2015-09-01  -0.682297
9   2015-10-01  1.231889
10  2015-11-01  -1.557481
11  2015-12-01  0.332942

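Table2 has more rows, and I want to join Table1 to Table2 on Date so that I can work with the values. However, I also want to carry Index over and, for each Index, fill in all the dates it is missing, for example:

Result: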

    Date        Index  Value1     Value2
0   2015-01-01  A      -1.292040  2.637761
1   2015-02-01  A      NaN        -0.496927
2   2015-03-01  A      NaN        0.226914
3   2015-04-01  A      0.535893   -2.010917
4   2015-05-01  A      NaN        -1.095533
5   2015-06-01  A      NaN        0.651244
6   2015-07-01  A      NaN        0.036592
7   2015-08-01  A      NaN        0.509352
8   2015-09-01  A      NaN        -0.682297
9   2015-10-01  A      NaN        1.231889
10  2015-11-01  A      NaN        -1.557481
11  2015-12-01  A      NaN        0.332942
.... and so on with Index B 

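I suppose I could manually filter Table1 by each Index value and join each piece to Table2, but that would be tedious and awkward if I don't actually know all the Index values in advance. Essentially I want to do a "group Table1 by Index and right join to Table2 on Date" in one step, but I'm stuck on how to express this.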

Running the latest versions of Pandas and Jupyter.

EDIT: I have a routine that fills in the NaNs, so they are not a problem for now.

1 Answer:

Answer 0: (score: 0)

It seems you want to merge the 'Value1' column of df1 with df2 on 'Date', while assigning each Index value to every date. You can use pd.concat with a list comprehension:

import pandas as pd

# For each Index group in df1, tag every row of df2 with that Index,
# then left-merge the group so dates missing from the group get NaN.
pd.concat([df2.assign(Index=i).merge(gp, how='left') for i, gp in df1.groupby('Index')],
          ignore_index=True)

Output:

    Date        Value2     Index  Value1
0   2015-01-01  2.637761   A      -1.292040
1   2015-02-01  -0.496927  A      NaN
2   2015-03-01  0.226914   A      NaN
3   2015-04-01  -2.010917  A      0.535893
4   2015-05-01  -1.095533  A      NaN
5   2015-06-01  0.651244   A      NaN
6   2015-07-01  0.036592   A      NaN
7   2015-08-01  0.509352   A      NaN
8   2015-09-01  -0.682297  A      NaN
9   2015-10-01  1.231889   A      NaN
10  2015-11-01  -1.557481  A      NaN
11  2015-12-01  0.332942   A      NaN
12  2015-01-01  2.637761   B      NaN
13  2015-02-01  -0.496927  B      -1.779029
14  2015-03-01  0.226914   B      NaN
15  2015-04-01  -2.010917  B      NaN
16  2015-05-01  -1.095533  B      NaN
17  2015-06-01  0.651244   B      1.129317
18  2015-07-01  0.036592   B      NaN
19  2015-08-01  0.509352   B      NaN
20  2015-09-01  -0.682297  B      NaN
21  2015-10-01  1.231889   B      NaN
22  2015-11-01  -1.557481  B      NaN
23  2015-12-01  0.332942   B      NaN

Because no merge keys are specified, merge automatically uses the intersection of the two frames' columns, which for each group is ['Date', 'Index'].
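
For anyone who wants to try this end to end, here is a minimal self-contained sketch of a loop-free variant of the same idea: build every (Index, Date) pair first, then left-merge both tables onto it. This is an alternative to the per-group merge, not the approach given in the answer. The frame names df1 and df2 follow the answer; storing Date as plain strings and the intermediate name `pairs` are assumptions made only for illustration.

```python
import pandas as pd

# Sample data copied from the question (Date kept as strings: an assumption).
df1 = pd.DataFrame({
    "Date": ["2015-01-01", "2015-04-01", "2015-02-01", "2015-06-01"],
    "Index": ["A", "A", "B", "B"],
    "Value1": [-1.292040, 0.535893, -1.779029, 1.129317],
})
df2 = pd.DataFrame({
    "Date": [f"2015-{month:02d}-01" for month in range(1, 13)],
    "Value2": [2.637761, -0.496927, 0.226914, -2.010917, -1.095533, 0.651244,
               0.036592, 0.509352, -0.682297, 1.231889, -1.557481, 0.332942],
})

# Every combination of an Index value from df1 with a Date from df2.
pairs = pd.MultiIndex.from_product(
    [df1["Index"].unique(), df2["Date"]], names=["Index", "Date"]
).to_frame(index=False)

# Attach Value2 by Date, then Value1 by (Date, Index); unmatched rows get NaN.
result = (
    pairs
    .merge(df2, on="Date", how="left")
    .merge(df1, on=["Date", "Index"], how="left")
    [["Date", "Index", "Value1", "Value2"]]
)

print(result)
```

Spelling out the merge keys with `on=` makes the join explicit, which may be safer than relying on the column intersection if the two tables ever share another column name.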