pandas: vlookup two columns against each other and find values

Time: 2018-02-18 18:48:36

Tags: python pandas lookup

I have a dataframe like this:

+-------+--------+
|   A   |   B    |
+-------+--------+
| David | Frank  |
| Tim   | David  |
| Joe   | Sam    |
| Frank | Bob    |
| Cathy | Tarun  |
|       | Rachel |
|       | Tim    |
+-------+--------+

Now I want to vlookup each column against the other and find the missing values:

+-------+--------+-------------------+-------------------+
|   A   |   B    |         C         |         D         |
+-------+--------+-------------------+-------------------+
| David | Frank  | Available on both | Available on both |
| Tim   | David  | Available on both | Available on both |
| Joe   | Sam    | in A not in B     | in B not in A     |
| Frank | Bob    | Available on both | in B not in A     |
| Cathy | Tarun  | in A not in B     | in B not in A     |
|       | Rachel |                   | in B not in A     |
|       | Tim    |                   | Available on both |
+-------+--------+-------------------+-------------------+

1 answer:

Answer 0 (score: 3)

You can use numpy.select with conditions created by isin to check membership, and notnull to filter out missing values:

print(df)
       A       B
0  David   Frank
1    Tim   David
2    Joe     Sam
3  Frank     Bob
4  Cathy   Tarun
5    NaN  Rachel
6    NaN     Tim

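import numpy as np

# np.select uses the first condition that is True for each row:
# isin() checks membership in the other column, notnull() catches the
# remaining non-missing values; rows matching neither get the default.
df['C'] = np.select([df.A.isin(df.B), df.A.notnull()],
                    ['Available on both', 'in A not in B'], default=None)
df['D'] = np.select([df.B.isin(df.A), df.B.notnull()],
                    ['Available on both', 'in B not in A'], default=None)
print(df)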
       A       B                  C                  D
0  David   Frank  Available on both  Available on both
1    Tim   David  Available on both  Available on both
2    Joe     Sam      in A not in B      in B not in A
3  Frank     Bob  Available on both      in B not in A
4  Cathy   Tarun      in A not in B      in B not in A
5    NaN  Rachel               None      in B not in A
6    NaN     Tim               None  Available on both
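
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The dataframe is rebuilt from the question's table, with NaN standing in for the empty cells in column A. The helper name `label_membership` and its parameters are made up for illustration and are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; NaN marks the empty cells in A.
df = pd.DataFrame({
    "A": ["David", "Tim", "Joe", "Frank", "Cathy", np.nan, np.nan],
    "B": ["Frank", "David", "Sam", "Bob", "Tarun", "Rachel", "Tim"],
})


def label_membership(col, other, only_here_label):
    """Hypothetical helper: label each value of `col` by whether it occurs in `other`."""
    conditions = [col.isin(other), col.notnull()]
    choices = ["Available on both", only_here_label]
    # np.select takes the first True condition per row, so notnull() only
    # decides rows whose value was not found in `other`. Rows where both
    # conditions are False (missing values) fall through to the default.
    return np.select(conditions, choices, default=None)


df["C"] = label_membership(df["A"], df["B"], "in A not in B")
df["D"] = label_membership(df["B"], df["A"], "in B not in A")
print(df)
```

The order of the conditions matters: if notnull() came first, every non-missing value would get the "in A not in B" / "in B not in A" label and the isin() check would never be reached.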