Question

I have two DataFrames, A and B. B was generated by shuffling the rows of A. For each row of B, I want to know the index of the identical row in A.

Example:

A=pd.DataFrame({"a":[1,2,3],"b":[1,2,3],"c":[1,2,3]})
B=pd.DataFrame({"a":[2,3,1],"b":[2,3,1],"c":[2,3,1]})

A
    a   b   c
0   1   1   1
1   2   2   2
2   3   3   3

B
    a   b   c
0   2   2   2
1   3   3   3
2   1   1   1

The answer should be [1, 2, 0], because B equals A.loc[[1, 2, 0]]. I'd like to know how to do this efficiently, since my A and B are large.

Answer 1

I came up with a possible solution using DataFrame.merge:

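import pandas as pd

A = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [1, 2, 3]})
B = pd.DataFrame({"a": [2, 3, 1], "b": [2, 3, 1], "c": [2, 3, 1]})

# Store each frame's original index in a regular column so it survives the merge
A['index_a'] = A.index
B['index_b'] = B.index

# Inner join on all value columns: pairs each row of A with the identical row of B
merge_df = pd.merge(A, B, on=['a', 'b', 'c'])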

where merge_df is

   a  b  c  index_a  index_b
0  1  1  1        0        2
1  2  2  2        1        0
2  3  3  3        2        1

Now you can cross-reference rows between the A and B DataFrames.

For example, you can see that the row at index 0 in A is the row at index 2 in B.
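To get the list the question asks for, one option is to sort merge_df by B's index and read off A's index. This is a minimal sketch, not part of the original answer: it assumes every row of A is unique, and it uses assign to build copies (A_idx and B_idx are names made up here) so that A and B themselves are left unchanged.

```python
import pandas as pd

A = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [1, 2, 3]})
B = pd.DataFrame({"a": [2, 3, 1], "b": [2, 3, 1], "c": [2, 3, 1]})

# Work on copies so the helper columns do not leak into A and B.
A_idx = A.assign(index_a=A.index)
B_idx = B.assign(index_b=B.index)

merge_df = pd.merge(A_idx, B_idx, on=["a", "b", "c"])

# Order the matches by B's row position, then read off A's index.
order = merge_df.sort_values("index_b")["index_a"].tolist()
print(order)  # [1, 2, 0]
```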

Note: rows that do not match in both DataFrames will not appear in merge_df.
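If you need to find out which rows failed to match instead of silently losing them, a left merge with indicator=True may help. The sketch below uses a hypothetical B whose last row has no counterpart in A. The names checked and unmatched are illustrative only.

```python
import pandas as pd

A = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [1, 2, 3]})
# Hypothetical B whose last row (9, 9, 9) has no counterpart in A.
B = pd.DataFrame({"a": [2, 3, 9], "b": [2, 3, 9], "c": [2, 3, 9]})

checked = pd.merge(
    B.assign(index_b=B.index),
    A.assign(index_a=A.index),
    on=["a", "b", "c"],
    how="left",       # keep every row of B, matched or not
    indicator=True,   # adds a "_merge" column: "both" or "left_only"
)

# Rows of B that were not found in A.
unmatched = checked.loc[checked["_merge"] == "left_only", "index_b"].tolist()
print(unmatched)  # [2]
```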

Answer 2

IIUC, use merge:

# For each row of B (in B's order), look up the index of the identical row in A
pd.merge(B.reset_index(), A.reset_index(),
         left_on=B.columns.tolist(),
         right_on=A.columns.tolist()).iloc[:, -1].values

array([1, 2, 0], dtype=int64)
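The same idea can be wrapped in a small helper. This is only a sketch under stated assumptions: shuffled_index is a hypothetical name, A and B are assumed to share the same columns, and validate="many_to_one" is added here so that pandas raises an error if A contains duplicate rows, which would otherwise make the mapping ambiguous.

```python
import pandas as pd

def shuffled_index(A, B):
    """Return, for each row of B, the index label of the identical row in A."""
    cols = A.columns.tolist()
    merged = pd.merge(
        B.reset_index(drop=True),
        A.rename_axis("index_a").reset_index(),  # A's index becomes column "index_a"
        on=cols,
        how="left",               # keep B's row order
        validate="many_to_one",   # raise MergeError if A has duplicate rows
    )
    return merged["index_a"].to_numpy()

A = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [1, 2, 3]})
B = pd.DataFrame({"a": [2, 3, 1], "b": [2, 3, 1], "c": [2, 3, 1]})

result = shuffled_index(A, B)
print(result.tolist())  # [1, 2, 0]
```

With how="left", a row of B that has no match in A yields NaN in the result instead of being dropped, so the output always has one entry per row of B.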

Index order of a shuffled DataFrame
