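I have two DataFrames that I created by running a groupby on each of two source DataFrames. I need to join these two summary outputs to check whether the hours claimed differ between them.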
import pandas as pd

df1 = pd.DataFrame({"Week": ["3/30/2018", "3/30/2018", "3/30/2018", "3/23/2018",
                             "3/23/2018", "3/16/2018", "3/16/2018", "3/9/2018",
                             "3/9/2018"],
                    "Empl": ["Sam", "John", "Mike", "Sam", "Mike", "Sam",
                             "John", "Mike", "Sam"],
                    "Hrs": [11, 12, 2, 13, 5, 14, 15, 16, 7]})
df2 = pd.DataFrame({"Week": ["3/30/2018", "3/30/2018", "3/30/2018", "3/23/2018",
                             "3/16/2018", "3/16/2018", "3/9/2018", "3/9/2018"],
                    "Empl": ["Sam", "John", "Mike", "Sam", "Mike", "Sam", "Mike", "Sam"],
                    "Hrs": [16, 12, 2, 13, 5, 15, 21, 7]})
# Total hours per (Week, Empl); each result is a Series with a MultiIndex.
gdF1 = df1.groupby(["Week", "Empl"])["Hrs"].sum()
gdF2 = df2.groupby(["Week", "Empl"])["Hrs"].sum()
# need to join gdF1 and gdF2 on Week and Empl for further comparison.
Answer 0 (score: 0)
Add as_index=False to the groupby so that Week and Empl stay as ordinary columns; you can then join the two results for further analysis:
gdF1 = df1.groupby(["Week","Empl"],as_index=False)["Hrs"].sum()
gdF2 = df2.groupby(["Week","Empl"],as_index=False)["Hrs"].sum()
or, equivalently, reset the index after aggregating:
gdF1 = df1.groupby(["Week","Empl"])["Hrs"].sum().reset_index()
gdF2 = df2.groupby(["Week","Empl"])["Hrs"].sum().reset_index()
In [6]: gdF1
Out[6]:
        Week  Empl  Hrs
0  3/16/2018  John   15
1  3/16/2018   Sam   14
2  3/23/2018  Mike    5
3  3/23/2018   Sam   13
4  3/30/2018  John   12
5  3/30/2018  Mike    2
6  3/30/2018   Sam   11
7   3/9/2018  Mike   16
8   3/9/2018   Sam    7
In [7]: gdF2
Out[7]:
        Week  Empl  Hrs
0  3/16/2018  Mike    5
1  3/16/2018   Sam   15
2  3/23/2018   Sam   13
3  3/30/2018  John   12
4  3/30/2018  Mike    2
5  3/30/2018   Sam   16
6   3/9/2018  Mike   21
7   3/9/2018   Sam    7
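The join itself is not shown above, so here is a minimal sketch of one way it could be done, assuming an outer merge on Week and Empl is what is wanted. The column suffixes `_1` and `_2`, the `Diff` column, and the `merged` and `mismatch` names are illustrative choices, not part of the original question.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Week": ["3/30/2018", "3/30/2018", "3/30/2018", "3/23/2018", "3/23/2018",
             "3/16/2018", "3/16/2018", "3/9/2018", "3/9/2018"],
    "Empl": ["Sam", "John", "Mike", "Sam", "Mike", "Sam", "John", "Mike", "Sam"],
    "Hrs": [11, 12, 2, 13, 5, 14, 15, 16, 7],
})
df2 = pd.DataFrame({
    "Week": ["3/30/2018", "3/30/2018", "3/30/2018", "3/23/2018",
             "3/16/2018", "3/16/2018", "3/9/2018", "3/9/2018"],
    "Empl": ["Sam", "John", "Mike", "Sam", "Mike", "Sam", "Mike", "Sam"],
    "Hrs": [16, 12, 2, 13, 5, 15, 21, 7],
})

# Keep the grouping keys as regular columns so they can be used as merge keys.
gdF1 = df1.groupby(["Week", "Empl"], as_index=False)["Hrs"].sum()
gdF2 = df2.groupby(["Week", "Empl"], as_index=False)["Hrs"].sum()

# An outer merge keeps (Week, Empl) pairs that appear in only one frame;
# indicator=True adds a "_merge" column saying which side each row came from.
merged = gdF1.merge(
    gdF2,
    on=["Week", "Empl"],
    how="outer",
    suffixes=("_1", "_2"),  # illustrative suffixes for the two Hrs columns
    indicator=True,
)

# Difference in claimed hours; NaN where a pair is missing on one side.
merged["Diff"] = merged["Hrs_2"] - merged["Hrs_1"]

# Rows that are missing on one side or whose hours disagree.
mismatch = merged[(merged["_merge"] != "both") | (merged["Diff"] != 0)]
print(mismatch)
```

With the sample data, `mismatch` should contain the pairs that exist in only one frame (for example John on 3/16/2018) as well as the pairs whose totals differ (for example Sam on 3/30/2018, 11 versus 16 hours). If only pairs present in both frames matter, `how="inner"` could be used instead.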