Question

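I have two DataFrames with the same index. I want to merge one column of DataFrame 'B' into DataFrame 'A'. The standard pd.merge(A, B) doesn't do what I want, because it merges all columns of B into A. pd.merge(A, B['my column']) doesn't work either, because it complains that the second argument is a Series that doesn't have an index.

The other way I can think of is A['my column'] = B['my column'], but that doesn't work either: this code is executed multiple times, and each run would overwrite the column already assigned in A.

Any help is appreciated.

Update (example)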

import numpy as np
import pandas as pd

A = pd.DataFrame({'a': np.arange(5)}, index=np.arange(5))
B = pd.DataFrame({'b': ['b', 'b'], 'c': np.random.randint(10, size=2)}, index=np.arange(2))
C = pd.DataFrame({'b': ['c', 'c'], 'c': np.random.randint(10, size=2)}, index=np.arange(2, 4))
print(A)
print(B)
print(C)
# merge only column 'b' of B into A, aligning on the index
A = pd.merge(A, B[['b']], left_index=True, right_index=True, how='left')
print(A)

# same again with C, whose rows cover a different part of the index
A = pd.merge(A, C[['b']], left_index=True, right_index=True, how='left')
# there should be only one column 'b' in A, not 'b_x' and 'b_y'
print(A)

Output:

   a
0  0
1  1
2  2
3  3
4  4

[5 rows x 1 columns]
   b  c
0  b  0
1  b  2

[2 rows x 2 columns]
   b  c
2  c  2
3  c  3

[2 rows x 2 columns]
   a    b
0  0    b
1  1    b
2  2  NaN
3  3  NaN
4  4  NaN

[5 rows x 2 columns]
   a  b_x  b_y
0  0    b  NaN
1  1    b  NaN
2  2  NaN    c
3  3  NaN    c
4  4  NaN  NaN

[5 rows x 3 columns]

Answer 1

Taking a guess at what you're after, maybe combine_first would work?

>>> A.combine_first(B[["b"]]).combine_first(C[["b"]])
   a    b
0  0    b
1  1    b
2  2    c
3  3    c
4  4  NaN

[5 rows x 2 columns]
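
For the case where the same step runs several times, here is a minimal sketch of how that could look as a loop. The helper name add_column is made up for illustration, and the 'c' values are fixed instead of random so the script is reproducible. combine_first only fills positions that are missing in the calling frame, so values already present in A are kept on later runs.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"a": np.arange(5)}, index=np.arange(5))
B = pd.DataFrame({"b": ["b", "b"], "c": [0, 2]}, index=np.arange(2))
C = pd.DataFrame({"b": ["c", "c"], "c": [2, 3]}, index=np.arange(2, 4))


def add_column(target, source, column):
    """Hypothetical helper: fill `column` in `target` from `source`.

    Rows are aligned on the index. Non-null values already in `target`
    win, so calling this repeatedly does not overwrite earlier results.
    """
    # Double brackets keep a one-column DataFrame rather than a Series.
    return target.combine_first(source[[column]])


# Apply the same step once per source frame.
for part in (B, C):
    A = add_column(A, part, "b")

print(A)
```

Index labels that appear in neither B nor C (label 4 here) stay NaN in column 'b'.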
