I have 2 DataFrames:
df1
Cols/Rows A B C
0 A 50 150 200
1 B 200 250 300
2 C 350 400 450
df2
Cols/Rows A B C
0 A 50 150 200
1 B 200 300 300
2 C 370 400 450
My expected output:
Cols/Rows A A2 B B2 C C2
0 A 50 50 150 150 200 200
1 B 200 200 250 300 300 300
2 C 350 370 400 400 450 450
I want to create a new DataFrame that combines the two, matching them by column and by row. I tried merge(), but it did not work:
print(df2.merge(df1, how='left'))
Answer 0 (score: 5)
merge has a suffixes parameter:
df1.merge(df2,on='Cols/Rows',suffixes =['','2'],how='left')
Out[225]:
Cols/Rows A B C A2 B2 C2
0 A 50 150 200 50 150 200
1 B 200 250 300 200 300 300
2 C 350 400 450 370 400 450
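For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames from the question and compares the original attempt with the keyed merge. The variable names no_key and merged are only illustrative. When on is omitted, merge joins on every column the two frames share, which is why the attempt in the question adds no new columns.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 350],
    'B': [150, 250, 400],
    'C': [200, 300, 450],
})
df2 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 370],
    'B': [150, 300, 400],
    'C': [200, 300, 450],
})

# No `on`: all four shared columns become join keys, so nothing is added.
no_key = df2.merge(df1, how='left')
print(no_key.columns.tolist())

# Keyed on the label column: overlapping value columns get the suffixes.
merged = df1.merge(df2, on='Cols/Rows', suffixes=('', '2'), how='left')
print(merged)
```

The suffixes are passed as a tuple here, which is the form shown in the pandas documentation; the list used above behaves the same way.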
Answer 1 (score: 4)
First use merge with a left join and the suffixes parameter, then reorder the column names with a generator:
df = df2.merge(df1, how='left', on='Cols/Rows', suffixes=['','2'])
print (df)
Cols/Rows A B C A2 B2 C2
0 A 50 150 200 50 150 200
1 B 200 300 300 200 250 300
2 C 370 400 450 350 400 450
def mygen(lst):
    for item in lst:
        yield item
        yield item + '2'
#first column removed by indexing
cols = ['Cols/Rows'] + list(mygen(df1.columns[1:]))
df = df[cols]
print (df)
Cols/Rows A A2 B B2 C C2
0 A 50 50 150 150 200 200
1 B 200 200 300 250 300 300
2 C 370 350 400 400 450 450
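The same interleaving can also be written as a list comprehension. The sketch below is one possible variant, not the answer's own code: it puts df1 on the left of the merge so that the unsuffixed columns come from df1 and the "2" columns from df2, which is the order the question asks for. The names key, value_cols, ordered and result are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 350],
    'B': [150, 250, 400],
    'C': [200, 300, 450],
})
df2 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 370],
    'B': [150, 300, 400],
    'C': [200, 300, 450],
})

key = 'Cols/Rows'
merged = df1.merge(df2, on=key, how='left', suffixes=('', '2'))

# Pair every value column with its suffixed twin: A, A2, B, B2, ...
value_cols = [c for c in df1.columns if c != key]
ordered = [key] + [name for c in value_cols for name in (c, c + '2')]

result = merged[ordered]
print(result)
```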
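If you also need to add the differences, it is better to adapt the index-based join solution and append the new columns last, because the subtraction has to be aligned on the first column: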
df1 = df1.set_index('Cols/Rows')
df2 = df2.set_index('Cols/Rows')
df3 = df2.sub(df1)
df = df2.join(df1.add_suffix('2')).join(df3.add_suffix('3'))
print (df)
A B C A2 B2 C2 A3 B3 C3
Cols/Rows
A 50 150 200 50 150 200 0 0 0
B 200 300 300 200 250 300 0 50 0
C 370 400 450 350 400 450 20 0 0
def mygen(lst):
    for item in lst:
        yield item
        yield item + '2'
        yield item + '3'
df = df[list(mygen(df1.columns))].reset_index()
print (df)
Cols/Rows A A2 A3 B B2 B3 C C2 C3
0 A 50 50 0 150 150 0 200 200 0
1 B 200 200 0 300 250 50 300 300 0
2 C 370 350 20 400 400 0 450 450 0
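To see why the alignment on the first column matters, consider a hypothetical case where df2 holds the same rows in a different order. This sketch assumes that reordering purely for illustration; it is not part of the question. Subtracting on the default integer index pairs up unrelated rows, while setting Cols/Rows as the index makes sub match rows by label.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 350],
    'B': [150, 250, 400],
    'C': [200, 300, 450],
})
# Hypothetical: same data as df2 in the question, rows shuffled to C, A, B.
df2 = pd.DataFrame({
    'Cols/Rows': ['C', 'A', 'B'],
    'A': [370, 50, 200],
    'B': [400, 150, 300],
    'C': [450, 200, 300],
})

# Positional subtraction: row 0 of df2 (label C) minus row 0 of df1 (label A).
positional = df2[['A', 'B', 'C']] - df1[['A', 'B', 'C']]

# Label-aligned subtraction: each row is matched on its Cols/Rows value.
aligned = df2.set_index('Cols/Rows').sub(df1.set_index('Cols/Rows'))
print(aligned.loc[['A', 'B', 'C']])
```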
Answer 2 (score: 4)
You can align the indices and use pd.DataFrame.join:
res = df1.set_index('Cols/Rows')\
         .join(df2.set_index('Cols/Rows').add_suffix('2'))
print(res)
A B C A2 B2 C2
Cols/Rows
A 50 150 200 50 150 200
B 200 250 300 200 300 300
C 350 400 450 370 400 450
Use reset_index as a final step to promote the index back to a regular column.
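Put together, the join and the final reset_index might look like the sketch below. It rebuilds the frames from the question so it runs on its own, and it passes the suffix as the string '2', since add_suffix is documented as taking a string.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 350],
    'B': [150, 250, 400],
    'C': [200, 300, 450],
})
df2 = pd.DataFrame({
    'Cols/Rows': ['A', 'B', 'C'],
    'A': [50, 200, 370],
    'B': [150, 300, 400],
    'C': [200, 300, 450],
})

res = (
    df1.set_index('Cols/Rows')                               # labels become the index
       .join(df2.set_index('Cols/Rows').add_suffix('2'))     # A2, B2, C2 from df2
       .reset_index()                                        # Cols/Rows is a column again
)
print(res)
```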