Extracting dataframe columns and

Time: 2018-04-05 08:51:06

Tags: python pandas

I have two dataframes, df1 and df2:

df1 = pd.DataFrame({'1': [0,1,2,3,4,5],
                    '2': [6,7,8,9,10,11],
                    'a': [3,6,9,12,15,18]})

df2 = pd.DataFrame({'1': [0,1,2,33,40,5],
                    '2': [6,7,8,99,10,11],
                    'b': [30,60,90,120,150,180],
                    'c': [-1,-2,-3,-4,-5,-6]})

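I want to extract the row indices of df1 where the values in columns 1 and 2 also appear in the corresponding columns of df2. In other words, I want the row indices of df1 that would be kept when performing an inner join between the two dataframes: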

pd.merge(df1,
         df2,
         on=['1','2'], 
         how='inner') 

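The obvious answer is to perform the inner join and extract the indices, but I would like to know whether there is a way to find the indices without performing the inner join.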
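
For reference, the merge-based approach described above might look like the sketch below. It assumes df1 has a default, unnamed index, so `reset_index` stores the original row labels in a column that pandas names `index`. The names `merged` and `idx_via_merge` are made up for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'1': [0, 1, 2, 3, 4, 5],
                    '2': [6, 7, 8, 9, 10, 11],
                    'a': [3, 6, 9, 12, 15, 18]})
df2 = pd.DataFrame({'1': [0, 1, 2, 33, 40, 5],
                    '2': [6, 7, 8, 99, 10, 11],
                    'b': [30, 60, 90, 120, 150, 180],
                    'c': [-1, -2, -3, -4, -5, -6]})

# Keep df1's row labels as a regular column so the merge does not discard them.
merged = pd.merge(df1.reset_index(), df2, on=['1', '2'], how='inner')

# Row labels of df1 that survive the inner join.
idx_via_merge = merged['index'].tolist()
print(idx_via_merge)  # [0, 1, 2, 5]
```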

1 Answer:

Answer 0: (score: 1)

Yes, this is possible: call set_index on both DataFrames to build a MultiIndex from the key columns of each, then take their Index.intersection:

mux = df1.set_index(['1','2']).index.intersection(df2.set_index(['1','2']).index)
print (mux)
MultiIndex(levels=[[0, 1, 2, 5], [6, 7, 8, 11]],
           labels=[[0, 1, 2, 3], [0, 1, 2, 3]],
           names=['1', '2'],
           sortorder=0)

If necessary, the result can be converted to a DataFrame with MultiIndex.to_frame:

df = mux.to_frame()
print (df)
      1   2
1 2        
0 6   0   6
1 7   1   7
2 8   2   8
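5 11  5  11

To select the matching rows of df1 instead, set the index on both DataFrames first and then index df1 with the intersection:
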
df11 = df1.set_index(['1','2'])
df22 = df2.set_index(['1','2'])
mux = df11.index.intersection(df22.index)
print (mux)
MultiIndex(levels=[[0, 1, 2, 5], [6, 7, 8, 11]],
           labels=[[0, 1, 2, 3], [0, 1, 2, 3]],
           names=['1', '2'],
           sortorder=0)

df = df11.loc[mux]
print (df)
       a
1 2     
0 6    3
1 7    6
2 8    9
5 11  18
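
If df1's original row labels are needed rather than the key pairs, one possible sketch is to test each of df1's key pairs for membership in df2's with `MultiIndex.isin`, then apply the resulting boolean mask to `df1.index`. This is not part of the answer above. The names `keys`, `mask`, and `row_idx` are illustrative, and the data is repeated from the question so the snippet runs on its own.

```python
import pandas as pd

df1 = pd.DataFrame({'1': [0, 1, 2, 3, 4, 5],
                    '2': [6, 7, 8, 9, 10, 11],
                    'a': [3, 6, 9, 12, 15, 18]})
df2 = pd.DataFrame({'1': [0, 1, 2, 33, 40, 5],
                    '2': [6, 7, 8, 99, 10, 11],
                    'b': [30, 60, 90, 120, 150, 180],
                    'c': [-1, -2, -3, -4, -5, -6]})

keys = ['1', '2']

# Boolean mask: True where df1's (1, 2) pair also occurs in df2.
mask = df1.set_index(keys).index.isin(df2.set_index(keys).index)

# Original row labels of df1 whose key pair is present in df2.
row_idx = df1.index[mask]
print(row_idx.tolist())  # [0, 1, 2, 5]
```

Note that this reports each matching row of df1 once. If df2 contained duplicate key pairs, an inner merge would repeat the corresponding df1 rows, so the two approaches agree only when the keys in df2 are unique, as they are here.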