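My question is an extension of one that was answered well at this link:
I have reproduced that answer below. It keeps only the rows whose string contains the word "ball":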
In [3]: df[df['ids'].str.contains("ball")]
Out[3]:
     ids  vals
0  aball     1
1  bball     2
3  fball     4
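Now my question is: what if my data contains long sentences and I want to identify the strings that contain both "ball" and "field"? Rows where only one of the two words appears should be dropped, and only rows whose string contains both words should be kept.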
Answer 0 (score: 5)
df[df['ids'].str.contains("ball")]
would become:
df[df['ids'].str.contains("ball") & df['ids'].str.contains("field")]
Or, if you prefer code that is easier to read:
contains_balls = df['ids'].str.contains("ball")
contains_fields = df['ids'].str.contains("field")
filtered_df = df[contains_balls & contains_fields]
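As a minimal, self-contained sketch of the two-mask approach in this answer: the sample sentences below are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "ids": [
        "the ball rolled across the field",
        "he kicked the ball",
        "an empty field",
        "nothing relevant here",
    ]
})

# Each mask is a boolean Series aligned with df's index.
contains_ball = df["ids"].str.contains("ball")
contains_field = df["ids"].str.contains("field")

# & is the element-wise AND: a row is kept only if both masks are True.
filtered_df = df[contains_ball & contains_field]
print(filtered_df)
```

Keep in mind that str.contains matches substrings, so "ball" would also match a string such as "football".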
Answer 1 (score: 2)
If you have more than two words, you can use this (note that it is not as fast as foxyblue's method):
l = ['ball', 'field']
df.ids.apply(lambda x: all(y in x for y in l))
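The expression above only produces a boolean mask. A short sketch of turning it into a filtered frame might look like the following; the helper name contains_all is hypothetical, and the sample rows are the ones used elsewhere in this thread.

```python
import pandas as pd

def contains_all(text, words):
    """Return True if every word in `words` occurs as a substring of `text`."""
    return all(word in text for word in words)

df = pd.DataFrame({"ids": ["ball is field", "ball is wa", "doll is field"]})
words = ["ball", "field"]

# apply() calls the helper once per row and returns a boolean Series.
mask = df["ids"].apply(lambda text: contains_all(text, words))

# Index the frame with the mask to keep only the matching rows.
filtered_df = df[mask]
print(filtered_df)
```

This sketch assumes every value in the column is a string; a missing value (NaN) would make the `in` test raise a TypeError.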
Answer 2 (score: 0)
You can use np.logical_and.reduce together with str.contains to handle multiple words.
df[np.logical_and.reduce([df['ids'].str.contains(w) for w in ['ball', 'field']])]
In [96]: df
Out[96]:
             ids
0  ball is field
1     ball is wa
2  doll is field
In [97]: df[np.logical_and.reduce([df['ids'].str.contains(w) for w in ['ball', 'field']])]
Out[97]:
             ids
0  ball is field
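The same idea can be wrapped in a small reusable function. This is only a sketch: the name filter_all_words and its parameters are assumptions made for illustration, and the last sample row is invented to show case-insensitive matching.

```python
import numpy as np
import pandas as pd

def filter_all_words(frame, column, words, case=True):
    """Keep rows of `frame` whose `column` contains every word in `words`.

    Assumes `words` is non-empty. Words are matched literally (regex=False),
    and missing values are treated as non-matches (na=False).
    """
    masks = [
        frame[column].str.contains(w, case=case, regex=False, na=False)
        for w in words
    ]
    # logical_and.reduce ANDs the masks together position by position.
    return frame[np.logical_and.reduce(masks)]

df = pd.DataFrame(
    {"ids": ["ball is field", "ball is wa", "doll is field", "FIELD with a BALL"]}
)

exact = filter_all_words(df, "ids", ["ball", "field"])
ignore_case = filter_all_words(df, "ids", ["ball", "field"], case=False)
print(exact)
print(ignore_case)
```

Passing regex=False is a design choice here: it keeps words that contain regex metacharacters, such as "a.b", from being interpreted as patterns.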
Answer 3 (score: 0)
Another approach uses a regular expression.
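One way to express "both words, in either order" as a single pattern is with lookahead assertions. The sketch below works under that assumption and is not necessarily the pattern this answer had in mind; the last sample row is invented to show that word order does not matter.

```python
import re
import pandas as pd

df = pd.DataFrame(
    {"ids": ["ball is field", "ball is wa", "doll is field", "field then ball"]}
)
words = ["ball", "field"]

# Build ^(?=.*ball)(?=.*field): each lookahead requires its word to appear
# somewhere after the start of the string, without consuming any characters.
pattern = "^" + "".join(f"(?=.*{re.escape(w)})" for w in words)

# str.contains treats the pattern as a regular expression by default.
mask = df["ids"].str.contains(pattern)
print(df[mask])
```

For exactly two words, an alternation such as "ball.*field|field.*ball" does the same job, but the lookahead form scales to longer word lists. Note that "." does not match a newline by default, so multi-line strings would need extra handling.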