Question

I'm cleaning up some text data, but I can't select the rows that contain a certain string. A simple membership check gives:

'<! [CDATA[! function( d,s, id){varjs, fjs=d. getElementsByTagName( s)[0],p= ^' in articles.loc[25111, 'content']

True

But when I select the rows containing the exact same string, I get an empty DataFrame:

articles[articles['content'].str.contains('<! [CDATA[! function( d,s, id){varjs, fjs=d. getElementsByTagName( s)[0],p= ^')]

id  title   author  date    content year    month   publication category    digital section url stems

Why does this happen?

Answer 1

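I think the problem is that str.contains treats its pattern as a regular expression by default, so characters in your string such as [, (, {, . and ^ are read as regex syntax rather than literal text. Passing regex=False to str.contains makes it match the string literally: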

s = '<! [CDATA[! function( d,s, id){varjs, fjs=d. getElementsByTagName( s)[0],p= ^'
articles[articles['content'].str.contains(s, regex=False)]
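
To see the difference in isolation, here is a minimal sketch. The two-row `articles` frame is made up for illustration and is not the asker's data. The regex interpretation is shown with Python's `re.search`, on the assumption that `str.contains` with its default `regex=True` reads the pattern the same way. Escaping the pattern with `re.escape` is shown as an alternative that keeps regex mode on.

```python
import re
import pandas as pd

s = '<! [CDATA[! function( d,s, id){varjs, fjs=d. getElementsByTagName( s)[0],p= ^'

# Hypothetical stand-in for the asker's DataFrame.
text = 'intro ' + s + ' outro'
articles = pd.DataFrame({'content': [text, 'unrelated text']})

# Read as a regex, the pattern does not match its own literal text:
# the brackets open a character class and '^' becomes an anchor.
regex_match = re.search(s, text)

# Literal substring matching, as suggested in the answer.
literal = articles['content'].str.contains(s, regex=False)

# Alternative: escape the metacharacters and leave regex matching enabled.
escaped = articles['content'].str.contains(re.escape(s))

print(regex_match)
print(literal.tolist())
print(escaped.tolist())
```

`regex=False` is the simpler choice when the whole pattern is literal text. `re.escape` is useful when the literal string has to be combined with real regex syntax, for example anchors or alternation.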
