Pandas: how to return rows where a column's cell contains a newline / line feed (\n)?

Time: 2019-06-17 02:26:47

Tags: python pandas

I am trying to return the rows where a column contains a newline followed by a specific word, i.e. '\nWord'.

Here is a minimal example:

import pandas as pd

testdf = pd.DataFrame([['test1', ' generates the final summary. \nRESULTS We evaluate the performance of ', ], ['test2', 'the cat and bat \n\n\nRESULTS\n teamed up to find some food'], ['test2' , 'anthropology with RESULTS pharmacology and biology']])
testdf.columns = ['A', 'B']
testdf.head()

>   A   B
>0  test1   generates the final summary. \nRESULTS We evaluate the performance of
>1  test2   the cat and bat \n\n\nRESULTS\n teamed up to find some food
>2  test2   anthropology with RESULTS pharmacology and biology

listStrings = { '\nRESULTS\n'}
testdf.loc[testdf.B.apply(lambda x: len(listStrings.intersection(x.split())) >= 1)]

This returns nothing.

The result I want is the first two rows, because they contain '\nRESULTS', but not the last row, because it does not contain '\nRESULTS'.

That is:

>   A   B
>0  test1   generates the final summary. \nRESULTS We evaluate the performance of
>1  test2   the cat and bat \n\n\nRESULTS\n teamed up to find some food

3 answers:

Answer 0 (score: 1)

Usually we use str.contains together with regex=False:

testdf[testdf.B.str.contains('\n',regex=False)]
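
As a rough sketch of why the attempt in the question returns nothing, and of matching the newline together with the word instead of the newline alone: the snippet below rebuilds the question's testdf so it runs on its own, and the names tokens, mask and result are only illustrative.

```python
import pandas as pd

# Same data as in the question, rebuilt so the snippet is self-contained.
testdf = pd.DataFrame(
    [
        ["test1", " generates the final summary. \nRESULTS We evaluate the performance of "],
        ["test2", "the cat and bat \n\n\nRESULTS\n teamed up to find some food"],
        ["test2", "anthropology with RESULTS pharmacology and biology"],
    ],
    columns=["A", "B"],
)

# str.split() with no argument splits on any whitespace, newlines included,
# so no token can ever be equal to '\nRESULTS\n'.
tokens = testdf.B[1].split()
print("RESULTS" in tokens)       # True
print("\nRESULTS\n" in tokens)   # False

# Match the newline and the word together as one literal substring.
mask = testdf.B.str.contains("\nRESULTS", regex=False)
result = testdf[mask]
print(result.index.tolist())     # [0, 1]
```

Searching for '\n' alone happens to give the same rows on this data, but it would also keep any row that has a newline without the word after it.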

Answer 1 (score: 1)

Could you try the following:

import re
df1 = testdf[testdf['B'].str.contains('\nRESULTS', flags = re.IGNORECASE)]
df1
#output
A   B
0   test1   generates the final summary. \nRESULTS We eva...
1   test2   the cat and bat \n\n\nRESULTS\n teamed up to f...
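
A hedged variation on the same idea, assuming the column may also hold mixed casing or missing values (the sample frame df below is hypothetical, not the question's data). In str.contains, case=False has the same effect as passing re.IGNORECASE, and na=False treats missing cells as non-matches so the mask stays purely boolean.

```python
import pandas as pd

# Hypothetical data: mixed casing plus a missing value in column B.
df = pd.DataFrame(
    {
        "A": ["test1", "test2", "test3", "test4"],
        "B": [
            "summary. \nRESULTS We evaluate",
            "bat \n\n\nresults\n teamed up",
            "with RESULTS pharmacology",
            None,
        ],
    }
)

# The raw-string pattern is a regex: \n matches a newline character.
# case=False ignores case; na=False counts missing cells as "no match".
mask = df["B"].str.contains(r"\nresults", case=False, na=False)
matched = df[mask]
print(matched["A"].tolist())  # ['test1', 'test2']
```

Without na=False the mask would contain a missing value for the last row, and indexing with it would fail.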

Answer 2 (score: 1)

WeNYoBen's solution is better, but a solution using iloc and np.where is:

>>> testdf.iloc[np.where(testdf['B'].str.contains('\n', regex=False))]
       A                                                  B
0  test1   generates the final summary. \nRESULTS We eva...
1  test2  the cat and bat \n\n\nRESULTS\n teamed up to f...
>>>
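
For comparison, a small sketch showing that the positional route and the plain boolean mask select the same rows. It swaps in np.flatnonzero for np.where (it returns the positions as a single array rather than a tuple), and the names positions, by_position and by_mask are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question, rebuilt so the snippet is self-contained.
testdf = pd.DataFrame(
    [
        ["test1", " generates the final summary. \nRESULTS We evaluate the performance of "],
        ["test2", "the cat and bat \n\n\nRESULTS\n teamed up to find some food"],
        ["test2", "anthropology with RESULTS pharmacology and biology"],
    ],
    columns=["A", "B"],
)

mask = testdf["B"].str.contains("\n", regex=False)

# Positional route: convert the boolean mask to integer row positions for iloc.
positions = np.flatnonzero(mask.to_numpy())
by_position = testdf.iloc[positions]

# Direct route: index with the boolean mask itself.
by_mask = testdf[mask]

print(positions.tolist())            # [0, 1]
print(by_position.equals(by_mask))   # True
```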