How to concatenate 2 DataFrames horizontally (row by row)?

Asked: 2018-06-26 14:08:37

Tags: python pandas dataframe

I have 2 DataFrames:

df1

  Cols/Rows   A    B    C
0         A   50  150  200
1         B  200  250  300
2         C  350  400  450

df2

  Cols/Rows    A    B    C
0         A   50  150  200
1         B  200  300  300
2         C  370  400  450

My expected output:

  Cols/Rows    A   A2    B   B2    C   C2
0         A   50   50  150  150  200  200
1         B  200  200  250  300  300  300
2         C  350  370  400  400  450  450

I want to create a new DataFrame that combines the columns of both, matched row by row. I tried merge(), but it did not work:

print(df2.merge(df1, how='left'))

3 answers:

Answer 0 (score: 5)

Use merge with the suffixes parameter:

df1.merge(df2, on='Cols/Rows', suffixes=['', '2'], how='left')
Out[225]: 
  Cols/Rows    A    B    C   A2   B2   C2
0         A   50  150  200   50  150  200
1         B  200  250  300  200  300  300
2         C  350  400  450  370  400  450
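For reference, here is a self-contained sketch of this answer, rebuilt from the frames in the question. It also suggests why the original attempt seemed to do nothing: when `on` is omitted, merge uses every column the two frames share as the join key, so no new columns are produced and a left join simply returns the left frame. The variable names `attempt` and `merged` are only illustrative.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 350],
                    "B": [150, 250, 400],
                    "C": [200, 300, 450]})
df2 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 370],
                    "B": [150, 300, 400],
                    "C": [200, 300, 450]})

# Without `on`, all four shared columns become join keys,
# so the result has the same columns (and rows) as df2.
attempt = df2.merge(df1, how="left")

# Joining only on the label column keeps both sets of values;
# the suffixes rename the overlapping columns from df2.
merged = df1.merge(df2, on="Cols/Rows", suffixes=("", "2"), how="left")
print(merged)
```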

Answer 1 (score: 4)

First use merge with a left join and the suffixes parameter, then change the order of the column names:

df = df2.merge(df1, how='left', on='Cols/Rows', suffixes=['','2'])
print (df)
  Cols/Rows    A    B    C   A2   B2   C2
0         A   50  150  200   50  150  200
1         B  200  300  300  200  250  300
2         C  370  400  450  350  400  450

def mygen(lst):
    for item in lst:
        yield item
        yield item + '2'

#first column removed by indexing
cols = ['Cols/Rows'] + list(mygen(df1.columns[1:]))
df = df[cols]
print (df)
  Cols/Rows    A   A2    B   B2    C   C2
0         A   50   50  150  150  200  200
1         B  200  200  300  250  300  300
2         C  370  350  400  400  450  450
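As a possible alternative to the generator, the interleaved column order can be built with a nested list comprehension. This is only a sketch under the same assumptions as above (the first column is the label column and the suffix is '2'); `value_cols` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 350],
                    "B": [150, 250, 400],
                    "C": [200, 300, 450]})
df2 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 370],
                    "B": [150, 300, 400],
                    "C": [200, 300, 450]})

df = df2.merge(df1, how="left", on="Cols/Rows", suffixes=("", "2"))

# Every column except the label column, in original order.
value_cols = df1.columns[1:]

# Pair each column with its suffixed twin: A, A2, B, B2, C, C2.
cols = ["Cols/Rows"] + [name for col in value_cols for name in (col, col + "2")]
df = df[cols]
print(df)
```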

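If you also need the differences, it is best to adapt the index-based join solution and add new columns, because the subtraction has to align rows on the first column: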

df1 = df1.set_index('Cols/Rows')
df2 = df2.set_index('Cols/Rows')
df3 = df2.sub(df1)

df = df2.join(df1.add_suffix('2')).join(df3.add_suffix('3'))
print (df)
             A    B    C   A2   B2   C2  A3  B3  C3
Cols/Rows                                          
A           50  150  200   50  150  200   0   0   0
B          200  300  300  200  250  300   0  50   0
C          370  400  450  350  400  450  20   0   0

def mygen(lst):
    for item in lst:
        yield item
        yield item + '2'
        yield item + '3'

df = df[list(mygen(df1.columns))].reset_index()
print (df)
  Cols/Rows    A   A2  A3    B   B2  B3    C   C2  C3
0         A   50   50   0  150  150   0  200  200   0
1         B  200  200   0  300  250  50  300  300   0
2         C  370  350  20  400  400   0  450  450   0
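To illustrate why the label column is moved into the index before subtracting, the sketch below reorders the rows of df2 (a made-up scenario, not part of the question) and compares index-aligned subtraction with a purely positional one. The names `df2_shuffled`, `diff`, and `positional` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 350],
                    "B": [150, 250, 400],
                    "C": [200, 300, 450]})
df2 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 370],
                    "B": [150, 300, 400],
                    "C": [200, 300, 450]})

# Hypothetical case: df2 arrives with its rows in a different order (C, A, B).
df2_shuffled = df2.iloc[[2, 0, 1]]

# With the labels as index, sub() matches rows by label, not by position.
diff = df2_shuffled.set_index("Cols/Rows").sub(df1.set_index("Cols/Rows"))

# Positional subtraction on the raw values pairs row C with row A, and so on.
value_cols = ["A", "B", "C"]
positional = df2_shuffled[value_cols].to_numpy() - df1[value_cols].to_numpy()

print(diff.loc["C", "A"])   # difference for label C, column A
print(positional[0, 0])     # first rows subtracted regardless of label
```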

Answer 2 (score: 4)

You can align the indices and use pd.DataFrame.join:

res = df1.set_index('Cols/Rows')\
         .join(df2.set_index('Cols/Rows').add_suffix('2'))

print(res)

             A    B    C   A2   B2   C2
Cols/Rows                              
A           50  150  200   50  150  200
B          200  250  300  200  300  300
C          350  400  450  370  400  450

Use reset_index as a final step to promote the index back to a regular column.
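A minimal sketch of that final step, rebuilt from the frames in the question; after reset_index the label column is an ordinary column again and the frame has a default integer index.

```python
import pandas as pd

df1 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 350],
                    "B": [150, 250, 400],
                    "C": [200, 300, 450]})
df2 = pd.DataFrame({"Cols/Rows": ["A", "B", "C"],
                    "A": [50, 200, 370],
                    "B": [150, 300, 400],
                    "C": [200, 300, 450]})

res = (
    df1.set_index("Cols/Rows")
       .join(df2.set_index("Cols/Rows").add_suffix("2"))
       .reset_index()  # move 'Cols/Rows' from the index back to a column
)
print(res)
```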