Drop rows whose values contain a given substring

Asked: 2018-10-23 08:55:56

Tags: python python-3.x python-2.7 pandas dataframe

I want to drop rows from a DataFrame when any of a list of substrings appears in the given columns of that row.

df:

Parent  Child   score
1stqw   Whoert  0.305125
tWowe   Tasert  0.308132
Worert  Picert  0.315145

substrings = ['Wor', 'Tas']

Drop every row in which Parent or Child contains one of the substrings.

Updated df:

Parent  Child   score
1stqw   Whoert  0.305125

Thanks!

2 Answers:

Answer 0 (score: 3)

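You can concatenate the two columns and then use pd.Series.str.contains, joining the substrings with | so they form a single regex alternation: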

L = ['Wor', 'Tas']

df = df[~(df['Parent'] + df['Child']).str.contains('|'.join(L))]

print(df)

  Parent   Child     score
0  1stqw  Whoert  0.305125

For efficiency/performance, see "Pandas filtering for multiple substrings in series".
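For anyone who wants to run the concatenation approach end to end, here is a minimal sketch built on the sample data from the question. One detail is an assumption that is not part of the answer above: the columns are joined with a newline separator through str.cat, so that a substring cannot accidentally match across the boundary between the Parent and Child values. The names combined and filtered are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Parent": ["1stqw", "tWowe", "Worert"],
    "Child": ["Whoert", "Tasert", "Picert"],
    "score": [0.305125, 0.308132, 0.315145],
})

L = ["Wor", "Tas"]
pattern = "|".join(L)  # regex alternation: 'Wor|Tas'

# Assumption (not in the answer): join with a separator so a match
# cannot straddle the end of Parent and the start of Child.
combined = df["Parent"].str.cat(df["Child"], sep="\n")

# Keep only the rows where no substring was found.
filtered = df[~combined.str.contains(pattern)]
print(filtered)
```

With plain `df['Parent'] + df['Child']`, a Parent ending in "W" followed by a Child starting with "or" would be treated as a match, which is probably not what the question intends.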

Answer 1 (score: 2)

Use str.contains with apply on a subset of the DataFrame's columns, then add any(axis=1) to test for at least one True per row:

cols = ['Parent', 'Child']
mask = df[cols].apply(lambda x: x.str.contains('|'.join(substrings))).any(axis=1)

Alternatively, chain the boolean masks together with | (bitwise OR):

mask = (df['Parent'].str.contains('|'.join(substrings)) |
        df['Child'].str.contains('|'.join(substrings)))

df = df[~mask]
print(df)
  Parent   Child     score
0  1stqw  Whoert  0.305125
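As a self-contained check, the sketch below builds both masks on the question's sample data and confirms they select the same rows. The use of re.escape is an assumption that goes beyond the answer: it is only needed if the substrings should be matched literally and might contain regex metacharacters such as . or +. The names mask_apply, mask_chain, and result are illustrative.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Parent": ["1stqw", "tWowe", "Worert"],
    "Child": ["Whoert", "Tasert", "Picert"],
    "score": [0.305125, 0.308132, 0.315145],
})

substrings = ["Wor", "Tas"]
# Assumption: treat each substring as literal text, not as a regex.
pattern = "|".join(re.escape(s) for s in substrings)

cols = ["Parent", "Child"]

# Variant 1: apply str.contains column by column, then reduce per row.
mask_apply = df[cols].apply(lambda x: x.str.contains(pattern)).any(axis=1)

# Variant 2: build one mask per column and combine them with bitwise OR.
mask_chain = df["Parent"].str.contains(pattern) | df["Child"].str.contains(pattern)

# Both variants should flag exactly the same rows.
same = bool((mask_apply == mask_chain).all())

result = df[~mask_apply]
print(same)
print(result)
```

The apply variant scales more conveniently when more columns are added to cols, while the chained variant keeps each column's condition explicit.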