I am trying to return the rows where a column contains a newline immediately followed by a specific word, i.e. '\nWord'.
Here is a minimal example:
import pandas as pd
testdf = pd.DataFrame([['test1', ' generates the final summary. \nRESULTS We evaluate the performance of '], ['test2', 'the cat and bat \n\n\nRESULTS\n teamed up to find some food'], ['test2', 'anthropology with RESULTS pharmacology and biology']])
testdf.columns = ['A', 'B']
testdf.head()
> A B
>0 test1 generates the final summary. \nRESULTS We evaluate the performance of
>1 test2 the cat and bat \n\n\nRESULTS\n teamed up to find some food
>2 test2 anthropology with RESULTS pharmacology and biology
listStrings = { '\nRESULTS\n'}
testdf.loc[testdf.B.apply(lambda x: len(listStrings.intersection(x.split())) >= 1)]
This returns nothing.
The result I want is the first two rows, because they contain '\nRESULTS', but not the last row, because it does not contain '\nRESULTS'.
So:
> A B
>0 test1 generates the final summary. \nRESULTS We evaluate the performance of
>1 test2 the cat and bat \n\n\nRESULTS\n teamed up to find some food
Answer 0 (score: 1)
We usually use str.contains with regex=False:
testdf[testdf.B.str.contains('\n',regex=False)]
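Note that matching on '\n' alone selects every row that contains a newline, which happens to give the desired rows for the sample data. The following is a minimal sketch, assuming the goal is a newline immediately followed by RESULTS, that passes the whole substring instead. It also shows why the set-intersection attempt in the question comes back empty: split() with no arguments splits on all whitespace, newlines included, so no token can contain '\n'. The names tokens, has_newline_token, mask and result are illustrative only.

```python
import pandas as pd

testdf = pd.DataFrame(
    [
        ["test1", " generates the final summary. \nRESULTS We evaluate the performance of "],
        ["test2", "the cat and bat \n\n\nRESULTS\n teamed up to find some food"],
        ["test2", "anthropology with RESULTS pharmacology and biology"],
    ],
    columns=["A", "B"],
)

# split() with no arguments splits on any whitespace, so newlines are
# discarded and no token can ever equal '\nRESULTS\n'.
tokens = testdf.B.iloc[1].split()
has_newline_token = any("\n" in t for t in tokens)

# Literal (non-regex) substring match: a newline directly followed by RESULTS.
mask = testdf.B.str.contains("\nRESULTS", regex=False)
result = testdf[mask]
print(result.index.tolist())
```

With regex=False the pattern is treated as a plain string, so no regex metacharacters need escaping.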
Answer 1 (score: 1)
Could you try the following:
import re
df1 = testdf[testdf['B'].str.contains('\nRESULTS', flags = re.IGNORECASE)]
df1
#output
A B
0 test1 generates the final summary. \nRESULTS We eva...
1 test2 the cat and bat \n\n\nRESULTS\n teamed up to f...
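Because re.IGNORECASE is set, the pattern above also matches '\nresults', and without a word boundary it matches longer words that start with RESULTS. A hedged sketch of a stricter variant follows, in case that matters for your data. The sample Series s and the names loose and strict are made up for illustration and are not part of the question.

```python
import re
import pandas as pd

# Hypothetical sample values chosen to show the difference between the patterns.
s = pd.Series(
    [
        "summary. \nRESULTS We evaluate",  # newline + RESULTS
        "intro \nresults follow",          # newline + lowercase word
        "with RESULTS pharmacology",       # word present, but no newline before it
        "line \nRESULTSET data",           # newline + longer word
    ]
)

# Case-insensitive, no word boundary (same pattern as in the answer).
loose = s.str.contains(r"\nRESULTS", flags=re.IGNORECASE)

# Case-sensitive, allows spaces or tabs after the newline, and requires
# RESULTS to end at a word boundary.
strict = s.str.contains(r"\n[ \t]*RESULTS\b")

print(loose.tolist())
print(strict.tolist())
```

Which variant is appropriate depends on how consistently the section headings are written in the column.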
Answer 2 (score: 1)
WeNYoBen's solution is better, but a solution using iloc and np.where is:
>>> testdf.iloc[np.where(testdf['B'].str.contains('\n', regex=False))]
A B
0 test1 generates the final summary. \nRESULTS We eva...
1 test2 the cat and bat \n\n\nRESULTS\n teamed up to f...
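For comparison, here is a small self-contained sketch of the positional approach next to plain boolean indexing. It assumes the same sample frame as the question and matches the full '\nRESULTS' substring. np.flatnonzero is used here as a stand-in for np.where(...)[0], and the names mask, positions, by_position and by_mask are illustrative.

```python
import numpy as np
import pandas as pd

testdf = pd.DataFrame(
    [
        ["test1", " generates the final summary. \nRESULTS We evaluate the performance of "],
        ["test2", "the cat and bat \n\n\nRESULTS\n teamed up to find some food"],
        ["test2", "anthropology with RESULTS pharmacology and biology"],
    ],
    columns=["A", "B"],
)

mask = testdf["B"].str.contains("\nRESULTS", regex=False)

# Convert the boolean mask to integer positions, then select by position.
positions = np.flatnonzero(mask.to_numpy())
by_position = testdf.iloc[positions]

# Plain boolean indexing selects the same rows without the extra step.
by_mask = testdf[mask]

same = by_position.equals(by_mask)
print(positions.tolist(), same)
```

Both selections return the same rows; the boolean mask is simply the shorter route.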