Here is the original DataFrame:
     Week_No       item_Number  Inside__Outside
4     1.2014  3164018114707537           INSIDE
6     1.2014     50010EJ654990           INSIDE
19    1.2014    304400JE130142           INSIDE
29    1.2014  3164018114725810           INSIDE
31    1.2014  3164018114711298           INSIDE
35    1.2014  3164018114707546          OUTSIDE
36    1.2014  3164018114711299          OUTSIDE
41    1.2014  3164018114727381           INSIDE
54    1.2014     50010EJ655470          OUTSIDE
145   1.2014    304400TS135379           INSIDE
After that, I grouped it like this:
df = df.groupby(['Week_No', 'Inside__Outside']).agg(['count'])
The result is a grouped DataFrame:
                        item_Number
                              count
Week_No Inside__Outside
1.2014  INSIDE                   51
        OUTSIDE                   8
2.2014  INSIDE                   91
        OUTSIDE                  16
3.2014  INSIDE                   92
        OUTSIDE                   7
4.2014  INSIDE                   76
        OUTSIDE                   5
Now I have two DataFrames:
df1
                        item_Number
                              count
Week_No Inside__Outside
1.2015  INSIDE                   18
2.2015  INSIDE                   48
3.2015  INSIDE                   87
4.2015  INSIDE                   54
5.2015  INSIDE                   61
6.2015  INSIDE                   46
7.2015  INSIDE                   83
8.2015  INSIDE                   41
9.2015  INSIDE                   34
and
df2
                        item_Number
                              count
Week_No Inside__Outside
1.2015  OUTSIDE                   8
2.2015  OUTSIDE                   4
3.2015  OUTSIDE                   7
4.2015  OUTSIDE                   4
5.2015  OUTSIDE                   1
6.2015  OUTSIDE                   6
7.2015  OUTSIDE                   8
8.2015  OUTSIDE                   4
9.2015  OUTSIDE                   3
Now I want to sum the counts by week, i.e. combine the output of the two DataFrames.
I thought of first selecting the data and then adding it up manually, but that does not seem efficient. Also, because this is a MultiIndex, I cannot select data by Week_No. Please disregard the absolute numbers in the count column. My question is about operating on MultiIndex DataFrames.
Answer 0 (score: 0)
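You have to remove the Inside__Outside level from the index, because you are not using it to join the two tables.
Let's start with the two DataFrames you provided in your example: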
data_1_df
Out[35]:
                        item_Number count
Week_No Inside__Outside
1.2015  INSIDE                         18
2.2015  INSIDE                         48
3.2015  INSIDE                         87
4.2015  INSIDE                         54
5.2015  INSIDE                         61
6.2015  INSIDE                         46
7.2015  INSIDE                         83
8.2015  INSIDE                         41
9.2015  INSIDE                         34
and
data_2_df
Out[36]:
                        item_Number count
Week_No Inside__Outside
1.2015  OUTSIDE                         8
2.2015  OUTSIDE                         4
3.2015  OUTSIDE                         7
4.2015  OUTSIDE                         4
5.2015  OUTSIDE                         1
6.2015  OUTSIDE                         6
7.2015  OUTSIDE                         8
8.2015  OUTSIDE                         4
9.2015  OUTSIDE                         3
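If you want to follow along, the two frames can be rebuilt by hand. The sketch below is only an illustration under a few assumptions: Week_No is stored as a string, the single column is literally named 'item_Number count' (as the printed output suggests), and make_frame is a hypothetical helper. It also shows why the Inside__Outside level gets in the way: once that level is dropped, both frames share a plain Week_No index and can be added directly.

```python
import pandas as pd

# Assumption: week labels are strings such as '1.2015' ... '9.2015'.
weeks = [f"{n}.2015" for n in range(1, 10)]


def make_frame(label, counts):
    """Hypothetical helper: build a frame indexed by (Week_No, Inside__Outside)."""
    index = pd.MultiIndex.from_product(
        [weeks, [label]], names=["Week_No", "Inside__Outside"]
    )
    return pd.DataFrame({"item_Number count": counts}, index=index)


data_1_df = make_frame("INSIDE", [18, 48, 87, 54, 61, 46, 83, 41, 34])
data_2_df = make_frame("OUTSIDE", [8, 4, 7, 4, 1, 6, 8, 4, 3])

# Drop the level that is not part of the join key, then add the frames,
# which pandas aligns on the remaining Week_No index.
inside = data_1_df.droplevel("Inside__Outside")
outside = data_2_df.droplevel("Inside__Outside")

# fill_value=0 keeps weeks that appear in only one of the two frames.
total = inside.add(outside, fill_value=0).astype(int)
print(total)
```

Direct addition like this only makes sense when each frame has at most one row per week.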
You can stack them on top of each other, group by Week_No, and sum item_Number count:
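import pandas as pd

# Stack the two frames, turn the index levels into ordinary columns,
# then group by week and sum the counts.
data_3_df = (
    pd.concat([data_1_df, data_2_df])
    .reset_index()
    .groupby('Week_No')
    .agg({'item_Number count': 'sum'})
)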
This gives the combined total of INSIDE and OUTSIDE for each week:
data_3_df
Out[52]:
         item_Number count
Week_No
1.2015                  26
2.2015                  52
3.2015                  94
4.2015                  58
5.2015                  62
6.2015                  52
7.2015                  91
8.2015                  45
9.2015                  37
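If you would rather keep the INSIDE and OUTSIDE counts visible next to the total, unstacking the Inside__Outside level is one possible variation. This is a sketch rather than part of the approach above: the stacked frame is built by hand here to stand in for the concatenated data, Week_No is assumed to be a string, and the TOTAL column name is made up for illustration.

```python
import pandas as pd

# Assumption: week labels are strings such as '1.2015' ... '9.2015'.
weeks = [f"{n}.2015" for n in range(1, 10)]
inside_counts = [18, 48, 87, 54, 61, 46, 83, 41, 34]
outside_counts = [8, 4, 7, 4, 1, 6, 8, 4, 3]

# Long-form frame with the same shape as the two frames stacked together.
stacked = pd.DataFrame(
    {
        "Week_No": weeks * 2,
        "Inside__Outside": ["INSIDE"] * 9 + ["OUTSIDE"] * 9,
        "item_Number count": inside_counts + outside_counts,
    }
).set_index(["Week_No", "Inside__Outside"])

# Pivot the inner index level into columns: one row per week,
# one column each for INSIDE and OUTSIDE.
wide = stacked["item_Number count"].unstack("Inside__Outside")

# 'TOTAL' is a hypothetical column name holding the per-week sum.
wide["TOTAL"] = wide.sum(axis=1)
print(wide)
```

A week that is missing from one side would show up as NaN in that column after unstack; sum(axis=1) skips NaN by default, so the total is still computed.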
Answer 1 (score: 0)
Simply append them and group by the first index level:
In [118]: df1
Out[118]:
                        item_Number
                              count
Week_No Inside__Outside
1.2015  INSIDE                   18
2.2015  INSIDE                   48
3.2015  INSIDE                   87
4.2015  INSIDE                   54
5.2015  INSIDE                   61
6.2015  INSIDE                   46
7.2015  INSIDE                   83
8.2015  INSIDE                   41
9.2015  INSIDE                   34
In [119]: df2
Out[119]:
                        item_Number
                              count
Week_No Inside__Outside
1.2015  OUTSIDE                   8
2.2015  OUTSIDE                   4
3.2015  OUTSIDE                   7
4.2015  OUTSIDE                   4
5.2015  OUTSIDE                   1
6.2015  OUTSIDE                   6
7.2015  OUTSIDE                   8
8.2015  OUTSIDE                   4
9.2015  OUTSIDE                   3
In [120]: df1.append(df2).groupby(level=0).sum()
Out[120]:
        item_Number
              count
Week_No
1.2015           26
2.2015           52
3.2015           94
4.2015           58
5.2015           62
6.2015           52
7.2015           91
8.2015           45
9.2015           37
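Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above fails on current versions. A minimal equivalent sketch uses pd.concat instead. The frames are rebuilt by hand here so the snippet runs on its own; Week_No is assumed to be a string, and grouped is a hypothetical helper that mimics the two-level column header shown in the output.

```python
import pandas as pd

# Assumption: week labels are strings such as '1.2015' ... '9.2015'.
weeks = [f"{n}.2015" for n in range(1, 10)]


def grouped(label, counts):
    """Hypothetical helper: mimic the shape of the groupby(...).agg(['count']) output."""
    index = pd.MultiIndex.from_product(
        [weeks, [label]], names=["Week_No", "Inside__Outside"]
    )
    frame = pd.DataFrame({"count": counts}, index=index)
    # Two-level column header: ('item_Number', 'count').
    frame.columns = pd.MultiIndex.from_tuples([("item_Number", "count")])
    return frame


df1 = grouped("INSIDE", [18, 48, 87, 54, 61, 46, 83, 41, 34])
df2 = grouped("OUTSIDE", [8, 4, 7, 4, 1, 6, 8, 4, 3])

# pd.concat stacks the rows; grouping on index level 0 (Week_No)
# collapses the INSIDE and OUTSIDE rows of each week into one sum.
result = pd.concat([df1, df2]).groupby(level=0).sum()
print(result)
```

groupby(level='Week_No') does the same thing and is a little easier to read than the positional level=0.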