Python Pandas: merging 2 DataFrames

Time: 2018-08-16 18:11:55

Tags: python pandas dataframe merge

I am trying to merge 2 DataFrames that hold the same information but are broken down differently.

df1: # net totals at the team level

Team    Current Sales    Previous Sales    Team Total Diff
Blue    10               5                 5
Orange  20               8                 12
Yellow  40               11                29

df2: # net totals broken down by region

Team    Region    Curr Sales    Prev Sales    Net Diff
Blue    East      4             4             0
Blue    West      6             1             5
Orange  East      6             3             3
Orange  West      14            5             9
Yellow  East      15            3             12
Yellow  West      25            8             17
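
For anyone who wants to reproduce this, the two tables above can be rebuilt roughly as follows. This is only a sketch from the values shown in the question; the variable name regional_sum is mine, and the 'team_sort' column mentioned later is left out because its values are not shown.

```python
import pandas as pd

# Team-level totals (df1 in the question)
df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Current Sales": [10, 20, 40],
    "Previous Sales": [5, 8, 11],
    "Team Total Diff": [5, 12, 29],
})

# Region-level breakdown (df2 in the question)
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Curr Sales": [4, 6, 6, 14, 15, 25],
    "Prev Sales": [4, 1, 3, 5, 3, 8],
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# Sanity check: the regional Net Diff values add up to each team's total diff
regional_sum = df2.groupby("Team")["Net Diff"].sum()
print(regional_sum)
```

The per-team sums (5, 12, 29) match the Team Total Diff column in df1, which is why a lookup keyed on Team is enough to combine the two frames.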

Desired merged DataFrame:

Team    Region    Curr Sales    Prev Sales    Net Diff    Team Total Diff
Blue    East      4             4             0           5
Blue    West      6             1             5           5
Orange  East      6             3             3           12
Orange  West      14            5             9           12
Yellow  East      15            3             12          29
Yellow  West      25            8             17          29

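I am doing this so that I can run additional statistical functions on the new column, but I am not sure how to merge the two. If I add df1['Team Total Diff'] to df2, it fills in only the first 3 records and does not fill in the value for every team row.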
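
The "only the first 3 records" behavior is most likely index alignment: assigning a Series to a column matches rows by index label, not by Team. A minimal sketch, using trimmed-down versions of the frames and assuming both have the default 0..n-1 index (the name naive is just for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

naive = df2.copy()
# Values are aligned on the index labels 0, 1, 2 and NOT on the Team column
naive["Team Total Diff"] = df1["Team Total Diff"]
print(naive)

# Rows 3-5 have no matching label in df1, so they end up as NaN,
# and row 1 (Blue/West) wrongly receives Orange's total of 12.
```

So the direct assignment is both incomplete and wrong for some rows, which is why a key-based lookup on Team is needed.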

If I use the following merge call, I see no change:

df2.merge(df1[['team_sort', 'Team']], how='inner', on='Team')

'team_sort' is used as an index to keep the teams in ascending order of Net Team Diff.

Any help would be greatly appreciated.

4 Answers:

Answer 0 (score: 2)

You can use map in this scenario:

df2['Team Total Diff'] = df2['Team'].map(df1.set_index('Team')['Team Total Diff'])
df2

Output:

     Team Region  Curr Sales  Prev Sales  Net Diff  Team Total Diff
0    Blue   East           4           4         0                5
1    Blue   West           6           1         5                5
2  Orange   East           6           3         3               12
3  Orange   West          14           5         9               12
4  Yellow   East          15           3        12               29
5  Yellow   West          25           8        17               29

Answer 1 (score: 1)

merge is the right approach, but you are using it incorrectly. Try this:

merged_df = df2.merge(df1[['Team', 'Team Total Diff']], on=['Team'])

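This is because merge, like most DataFrame methods, actually produces a new DataFrame object rather than modifying self, so the result has to be assigned to a variable.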

Index handling can be a bit tricky, so I usually just reset the index before merging DataFrames.
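
As a rough sketch of the reset-then-merge pattern described here: the setup assumes df1 happens to be indexed by Team, which the question does not state, and the name lookup is made up for illustration. how="left" and validate="many_to_one" are optional extras that keep df2's row order and raise an error if a team appears more than once in the lookup table.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
}).set_index("Team")  # assumption: df1 is indexed by Team

df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# reset_index() turns the Team index back into an ordinary column
lookup = df1.reset_index()[["Team", "Team Total Diff"]]

# merge returns a new DataFrame; df2 itself is left untouched
merged_df = df2.merge(lookup, on="Team", how="left", validate="many_to_one")
print(merged_df)
```

After this runs, df2 still has only its original three columns, while merged_df carries the extra Team Total Diff column repeated for each region of a team.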

Answer 2 (score: -1)

I think this should do it:

merged_df = pd.merge(df1, df2, how="right", left_on="Team", right_on="Team")

Answer 3 (score: -1)

merged_df = pd.concat([df1,df2], join='inner')

The default value of join is outer, so try inner. If that does not work, try outer:

merged_df = pd.concat([df1,df2], join='outer')
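
For comparison, here is a small sketch of what these two concat calls produce on trimmed-down versions of the frames (the names stacked_inner and stacked_outer are mine). With the default axis=0, concat stacks rows, and join only controls which columns are kept; nothing is looked up by Team.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# join="inner": keep only the columns common to both frames (just Team)
stacked_inner = pd.concat([df1, df2], join="inner")

# join="outer": keep every column and fill the gaps with NaN
stacked_outer = pd.concat([df1, df2], join="outer")

print(stacked_inner.shape)  # (9, 1)
print(stacked_outer.shape)  # (9, 4)
```

Both results have 9 rows (3 from df1 plus 6 from df2), so neither matches the 6-row table the question asks for.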