I have a DataFrame like this:
+-------+--------+
| A     | B      |
+-------+--------+
| David | Frank  |
| Tim   | David  |
| Joe   | Sam    |
| Frank | Bob    |
| Cathy | Tarun  |
|       | Rachel |
|       | Tim    |
+-------+--------+
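Now I want to look up each column against the other (like a VLOOKUP) and flag the values that are missing from the other column: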
+-------+--------+-------------------+-------------------+
| A     | B      | C                 | D                 |
+-------+--------+-------------------+-------------------+
| David | Frank  | Available on both | Available on both |
| Tim   | David  | Available on both | Available on both |
| Joe   | Sam    | in A not in B     | in B not in A     |
| Frank | Bob    | Available on both | in B not in A     |
| Cathy | Tarun  | in A not in B     | in B not in A     |
|       | Rachel |                   | in B not in A     |
|       | Tim    |                   | Available on both |
+-------+--------+-------------------+-------------------+
Answer 0 (score: 3)
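You can use numpy.select with conditions built by isin to check membership in the other column, and notnull to filter out missing values. The conditions are evaluated in order, so a row that matches neither (a missing value) gets the default, None: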
print (df)
       A       B
0  David   Frank
1    Tim   David
2    Joe     Sam
3  Frank     Bob
4  Cathy   Tarun
5    NaN  Rachel
6    NaN     Tim
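import numpy as np

# The first matching condition wins: the value also occurs in the other column,
# otherwise any non-null value; missing values fall through to default=None.
df['C'] = np.select([df.A.isin(df.B), df.A.notnull()],
                    ['Available on both', 'in A not in B'], default=None)
df['D'] = np.select([df.B.isin(df.A), df.B.notnull()],
                    ['Available on both', 'in B not in A'], default=None)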
print (df)
       A       B                  C                  D
0  David   Frank  Available on both  Available on both
1    Tim   David  Available on both  Available on both
2    Joe     Sam      in A not in B      in B not in A
3  Frank     Bob  Available on both      in B not in A
4  Cathy   Tarun      in A not in B      in B not in A
5    NaN  Rachel               None      in B not in A
6    NaN     Tim               None  Available on both
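For reference, the same approach can be sketched as a self-contained script. The helper name flag_membership and its parameters are made up for this illustration and are not part of pandas or NumPy. The sketch also assumes you never want a missing value to count as a match, so it drops NaN from the lookup side and checks for non-null first. That only matters when both columns contain NaN, which is not the case in the sample data.

```python
import numpy as np
import pandas as pd


def flag_membership(col, other, label_missing):
    """Label each value of `col` by whether it also occurs in `other`.

    Hypothetical helper for illustration only.
    """
    present = col.notna()
    # Drop NaN from the lookup side so a missing value never counts as a match.
    in_other = col.isin(other.dropna())
    labels = np.select(
        [present & in_other, present],
        ["Available on both", label_missing],
        default=None,
    )
    # dtype=object keeps None for rows where `col` itself is missing.
    return pd.Series(labels, index=col.index, dtype=object)


df = pd.DataFrame({
    "A": ["David", "Tim", "Joe", "Frank", "Cathy", np.nan, np.nan],
    "B": ["Frank", "David", "Sam", "Bob", "Tarun", "Rachel", "Tim"],
})

df["C"] = flag_membership(df["A"], df["B"], "in A not in B")
df["D"] = flag_membership(df["B"], df["A"], "in B not in A")
print(df)
```

Wrapping the logic in a function avoids repeating the np.select call for each direction, and passing the label as an argument keeps the two output columns symmetric.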