Question

I want to count how many times a substring occurs in a string. This is what I did:

termCount = content.count(term)

But if I search for "Ford", it returns results like these:

"Ford Motors"         Result: 1 Correct
"cannot afford Ford"  Result: 2 Incorrect
"ford is good"        Result: 1 Correct

The search string can contain multiple terms, for example "Ford Motors" or "Ford Motor". For example, if I search for "Ford Motor":

"Ford Motors"               Result: 1 Correct
"cannot afford Ford Motor"  Result: 1 Correct
"Ford Motorway"             Result: 1 Incorrect

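What I want is a search that is case-insensitive and matches the term as a whole. That is, if I search for a substring, it should match only as a complete word (or a complete phrase, if there are multiple terms), not as part of a longer word. I also need to count the occurrences. How can I achieve this?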

Answer 1

You can use a regex. In this case, use re.findall and then take the length of the list of matches:

re.findall(r'\byour_term\b',s)

Demo

>>> s="Ford Motors cannot afford Ford Motor Ford Motorway Ford Motor."
>>> import re
>>> def counter(str,term):
...    return len(re.findall(r'\b{}\b'.format(term),str))
... 
>>> counter(s,'Ford Motor')
2
>>> counter(s,'Ford')
4
>>> counter(s,'Fords')
0
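
The question also asks for case-insensitive matching, which the demo above does not cover. A minimal sketch, assuming the search term should be treated as literal text: re.escape keeps regex metacharacters in the term from being interpreted, and re.IGNORECASE ignores case. The function name count_whole is hypothetical and not part of the original answer.

```python
import re

def count_whole(text, term):
    """Count case-insensitive, whole-word occurrences of term in text."""
    # re.escape makes the term literal, so '.' or '+' are not regex operators.
    pattern = r'\b{}\b'.format(re.escape(term))
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

print(count_whole("cannot afford Ford", "ford"))   # 1
print(count_whole("Ford Motorway", "ford motor"))  # 0
print(count_whole("ford is good", "Ford"))         # 1
```

Note that \b only behaves as intended when the term begins and ends with a word character (a letter, digit, or underscore).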

Answer 2

I would split the strings on whitespace so that we have individual words, and then count from there.

terms = ['Ford Motors', 'cannot afford Ford', 'ford is good']
splitWords = []

for term in terms:
    # Lowercase each string in the list, split it into words,
    # and add those words to the list splitWords.
    splitWords.extend(term.lower().split())

print(splitWords.count("ford"))  # prints 3
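
The split approach as written counts single words only. One possible way to extend it to multi-word phrases such as "Ford Motor" is to compare a sliding window of words against the words of the phrase. This is a sketch under that assumption, and the name count_phrase is hypothetical.

```python
def count_phrase(text, phrase):
    """Count occurrences of phrase in text by comparing whole lowercase words."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return 0
    # Slide a window of n words across the text and compare it with the phrase.
    return sum(words[i:i + n] == target for i in range(len(words) - n + 1))

print(count_phrase("cannot afford Ford Motor", "ford motor"))  # 1
print(count_phrase("Ford Motorway", "Ford Motor"))             # 0
```

Because the text is split on whitespace only, punctuation stays attached to words, so "Motor." does not match "Motor". The regex approach in Answer 1 handles that case.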

Python - Search for a substring as a whole word
