I have two DataFrames that I want to join along the columns. The index is not unique:
df1 = pd.DataFrame({'A': ['0', '1', '2', '2'], 'B': ['B0', 'B1', 'B2', 'B3'], 'C': ['C0', 'C1', 'C2', 'C3']})
A B C
0 0 B0 C0
1 1 B1 C1
2 2 B2 C2
3 2 B3 C3
df2 = pd.DataFrame({'A': ['0', '2', '3'],'E': ['E0', 'E1', 'E2']},index=[0, 2, 3])
A E
0 0 E0
1 2 E1
2 3 E2
A should be my index. This is what I want:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E1
But pd.concat([df1, df2], 1)
gives me the error:
Reindexing only valid with uniquely valued Index objects
Answer 0 (score: 4)
Maybe you are looking for a left outer merge.
df1.merge(df2, how='left')
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E1
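For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same left merge on the question's data. Passing on='A' is my addition to make the key explicit (the call above relies on merge defaulting to the column the two frames share), and the name result is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Left merge: keep every row of df1 and look up E by the value in column A.
# Rows of df2 whose A value does not occur in df1 (here '3') are dropped.
result = df1.merge(df2, on='A', how='left')
print(result)
```

Because the match is on the values in A rather than on the index, the duplicate '2' in df1 is not a problem: both rows simply receive the same E value.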
Answer 1 (score: 1)
Use combine_first:
df1.combine_first(df2).dropna(subset=['A'],axis=0)
Out[320]:
A B C D E
0 A0 B0 C0 D0 E0
1 A1 B1 C1 NaN NaN
2 A2 B2 C2 D1 E1
2 A3 B3 C3 D1 E1
After the question was edited:
Use combine_first:
df1.combine_first(df2.set_index('A'))
Out[338]:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E2
Or:
pd.concat([df1,df2.set_index('A')],axis=1)
Out[339]:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E2
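Note that the output shown above ends with E2 in the last row, whereas the question asks for E1 there. If the rows should be matched on the values in column A, one possible sketch is DataFrame.join with on='A' against df2 indexed by A. This is a variation on the set_index idea, not the exact call from this answer, and the names lookup and joined are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Index df2 by A so that it can act as a lookup table.
lookup = df2.set_index('A')

# Match df1's column A against the index of lookup (a left join by default).
joined = df1.join(lookup, on='A')
print(joined)
```

This keeps df1's original row labels and leaves A as an ordinary column, which matches the layout requested in the question.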
Answer 2 (score: 0)
Use concat along the column axis:
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],'B': ['B0', 'B1', 'B2', 'B3'],'C': ['C0', 'C1', 'C2', 'C3']},index=[0, 1, 2, 2])
df2 = pd.DataFrame({'D': ['D0', 'D1'],'E': ['E0', 'E1']},index=[0, 2])
df = pd.concat([df1, df2], axis=1)
Output:
A B C D E
0 A0 B0 C0 D0 E0
1 A1 B1 C1 NaN NaN
2 A2 B2 C2 D1 E1
2 A3 B3 C3 D1 E1
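Since the error in the question complains about non-unique index values, it can help to check the indexes before combining. The sketch below does that for the frames in this answer and then uses DataFrame.join, which aligns on the index through a merge and so accepts repeated labels on the left side. Swapping join in for concat is my own suggestion, not part of the answer above, and the variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']}, index=[0, 1, 2, 2])
df2 = pd.DataFrame({'D': ['D0', 'D1'],
                    'E': ['E0', 'E1']}, index=[0, 2])

# Check which index carries repeated labels.
left_unique = df1.index.is_unique
right_unique = df2.index.is_unique
dup_labels = df1.index[df1.index.duplicated()].unique().tolist()
print(left_unique, right_unique, dup_labels)

# Left join on the index: every df1 row keeps its label and picks up
# the df2 row with the same label, if there is one.
joined = df1.join(df2)
print(joined)
```

Both rows labelled 2 in df1 receive the same D and E values, and the row labelled 1 has no match in df2, so it gets NaN.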