Check whether a value is present in any of the specified columns of the same table

Time: 2018-11-24 13:49:27

Tags: python-3.x pandas

For each row, I want to check whether the value in one column is present in another column of the same row.

df:

   sno  id1 id2 id3 
    1   1,2 7   1,2,7,22
    2   2   8,9 2,8,9,15,17
    3   1,5 6   1,5,6,17,33
    4   4       4,12,18
    5       9   9,14

Output:

For any given row:

for i in sno:
    if id1 in id3:
        score = 50
    elif id2 in id3:
        score = 50

    if id1 in id3 and id2 in id3:
        score = 75

In the end I want the score that this logic produces for each row.
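One possible reading of the rule above for a single row, as a minimal plain-Python sketch. It assumes that "id1 in id3" means every comma-separated value of id1 appears in id3, and that an empty cell never counts as a match. The helper names `to_set` and `row_score` are made up for illustration.

```python
def to_set(cell):
    """Split a comma-separated cell into a set of tokens.

    Anything that is not a non-empty string yields an empty set.
    """
    if isinstance(cell, str) and cell.strip():
        return {part.strip() for part in cell.split(",")}
    return set()


def row_score(id1, id2, id3):
    """Return 75 if id1 and id2 are both contained in id3, 50 if only one is, else None."""
    target = to_set(id3)
    first, second = to_set(id1), to_set(id2)
    # Assumption: an empty cell is treated as "no match".
    hit1 = bool(first) and first <= target
    hit2 = bool(second) and second <= target
    if hit1 and hit2:
        return 75
    if hit1 or hit2:
        return 50
    return None


print(row_score("1,2", "7", "1,2,7,22"))  # 75
print(row_score("4", None, "4,12,18"))    # 50
```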

1 answer:

Answer 0: (score: 1)

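You can convert all values to sets with split and then compare them with issubset. The extra `and bool(a)` excludes empty sets (created from missing values), because an empty set is a subset of every set: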

print (df)
   sno  id1  id2          id3
0    1  1,2    7   1,20,70,22
1    2    2  8,9  2,8,9,15,17
2    3  1,5    6  1,5,6,17,33
3    4    4  NaN      4,12,18
4    5  NaN    9         9,14

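import numpy as np

def convert(x):
    # split comma-separated strings into a set; missing values (NaN) become an empty set
    return set(x.split(',')) if isinstance(x, str) else set()

cols = ['id1', 'id2', 'id3']
# element-wise conversion (pandas 2.1+ deprecates applymap in favour of DataFrame.map)
df1 = df[cols].applymap(convert)

# True where the column's set is non-empty and fully contained in id3
m1 = np.array([a.issubset(b) and bool(a) for a, b in zip(df1['id1'], df1['id3'])])
m2 = np.array([a.issubset(b) and bool(a) for a, b in zip(df1['id2'], df1['id3'])])

# 75 if both columns match, 50 if only one matches, NaN otherwise
df['new'] = np.select([m1 & m2, m1 | m2], [75, 50], np.nan)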
print (df)
   sno  id1  id2          id3   new
0    1  1,2    7   1,20,70,22   NaN
1    2    2  8,9  2,8,9,15,17  75.0
2    3  1,5    6  1,5,6,17,33  75.0
3    4    4  NaN      4,12,18  50.0
4    5  NaN    9         9,14  50.0
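
For reference, the approach above can be put together as a self-contained script. This is a sketch under a few assumptions: the DataFrame is rebuilt by hand from the printed sample (with `None` for the missing cells), and column-wise `Series.apply` is used in place of `applymap`, which is a substitution and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the printed frame above; None marks the missing cells.
df = pd.DataFrame({
    "sno": [1, 2, 3, 4, 5],
    "id1": ["1,2", "2", "1,5", "4", None],
    "id2": ["7", "8,9", "6", None, "9"],
    "id3": ["1,20,70,22", "2,8,9,15,17", "1,5,6,17,33", "4,12,18", "9,14"],
})


def convert(x):
    # Strings become sets of tokens; missing values become empty sets.
    return set(x.split(",")) if isinstance(x, str) else set()


# The empty set is a subset of every set, hence the extra bool(a) check below.
print(set().issubset({"9", "14"}))  # True

# Convert each column separately (swapped in for DataFrame.applymap).
sets = {col: df[col].apply(convert) for col in ["id1", "id2", "id3"]}

# True where the set is non-empty and fully contained in the id3 set of the same row.
m1 = np.array([bool(a) and a.issubset(b) for a, b in zip(sets["id1"], sets["id3"])])
m2 = np.array([bool(a) and a.issubset(b) for a, b in zip(sets["id2"], sets["id3"])])

# Conditions are checked in order: both match -> 75, at least one -> 50, else NaN.
df["new"] = np.select([m1 & m2, m1 | m2], [75, 50], default=np.nan)
print(df)
```

Because `np.select` takes the first condition that holds, `m1 & m2` has to come before `m1 | m2`; otherwise rows where both columns match would get 50.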