Question

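I'm trying to work out the best way to approach this problem.

I basically want to take some text and compare it against a keyword.

Of course, I could do this: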

keyword = 'python 3.5'
title = 'python 3.5 is a programming language'
if keyword in title:
    print('match')  # substring test: the words must appear together, in this order

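However, that only works when the words appear in order. If the title text happened to be:

title = 'my favourite version of python is 3.5!'

it would not match.

So I tried splitting the keyword with .split() and then checking whether both items of the resulting list are in the title variable, but I had no luck finding an efficient way to do it.

If anyone knows a good way to do this, I'd really appreciate it.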

Answer 1

This will do the job:

keyword = 'python 3.5'
title = 'python 3.5 is a programming language'
s = set(keyword.split(" "))
m = set(title.split(" "))
# all keyword words are in the title when the intersection is as large as the keyword set
if len(set.intersection(s, m)) == len(s):
    print(True)

This assumes you don't care about duplicates. That is, you consider

keyword = 'python 3.5 python'
title = 'python 3.5 is a programming language'

to be a pair in which all of the keywords are indeed in the title.
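
If duplicates do matter, one possible variation is to count the words instead of collecting them into sets. This is only a sketch under that assumption, not part of the answer above; the helper name contains_with_counts is made up for illustration.

```python
from collections import Counter

def contains_with_counts(title, keyword):
    """Return True if each keyword word occurs in title at least as often as in keyword."""
    need = Counter(keyword.split())
    have = Counter(title.split())
    # Counter subtraction keeps only positive counts, so an empty result
    # means nothing required by `need` is missing from `have`.
    return not (need - have)

title = 'python 3.5 is a programming language'
print(contains_with_counts(title, 'python 3.5'))         # True
print(contains_with_counts(title, 'python 3.5 python'))  # False: 'python' occurs only once
```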

Answer 2

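So, you need to find each word of the key phrase in the title, in order. Try this: search for each word in turn, and continue the search in the remainder of the title.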

key_phrase = 'python 3.5'
title_list = ['python 3.5 is a programming language',
              'my favourite version of python is 3.5!']

key_word = key_phrase.split()

for title in title_list:
    remain = title.split()
    found = True
    for word in key_word:
        if word in remain:
            # keep searching only in the words after this match
            pos = remain.index(word)
            remain = remain[pos+1:]
        else:
            found = False
            break  # one missing word is enough to fail

    print(title, "\tfound=", found)

Output:

python 3.5 is a programming language    found= True
my favourite version of python is 3.5!  found= False
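
The second title fails here because splitting on whitespace leaves the token '3.5!', which is not equal to '3.5'. If punctuation should be ignored, a rough sketch along the same lines could strip it from each title word first. The function name words_in_order is hypothetical, and stripping every character in string.punctuation from the word ends is an assumption about what counts as noise.

```python
import string

def words_in_order(title, key_phrase):
    """Return True if the key-phrase words occur in title in the same order."""
    # Strip leading/trailing punctuation from each title word, so '3.5!' becomes '3.5'.
    remaining = (w.strip(string.punctuation) for w in title.split())
    # `word in remaining` consumes the generator up to the first match, so each
    # later word is only searched for in the rest of the title.
    return all(word in remaining for word in key_phrase.split())

print(words_in_order('my favourite version of python is 3.5!', 'python 3.5'))  # True
print(words_in_order('3.5 is my favourite python', 'python 3.5'))              # False
```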

Answer 3

You can do it like this, if you want to customise how precise the match has to be.

keyword = 'python 3.5'
title = 'my favourite version of python is 3.5!'
precision = 100 # 100% precision (both python and 3.5 must exist in title)
if len([x for x in set(keyword.split(' ')) if x in title]) >= round(len(set(keyword.split(' ')))*(precision/100)):
    print('Yes')
else:
    print('No')

Output:

Yes

If you change title to:

title = 'my favourite version of python is 3.4!'

the output is 'No'. But with a small change to precision:

precision = 50

the output is 'Yes'.
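
For reuse, the same idea could be wrapped in a small function. The sketch below is an illustration rather than the answer's own code: the name matches is invented, and it compares against an exact threshold instead of calling round(), so it can differ from the one-liner when the required count is not a whole number.

```python
def matches(title, keyword, precision=100):
    """Return True if at least `precision` percent of the keyword words occur in title."""
    words = set(keyword.split())
    # Substring test against the whole title, as in the one-liner above.
    hits = sum(1 for w in words if w in title)
    # Integer arithmetic avoids floating-point rounding in the comparison.
    return hits * 100 >= len(words) * precision

print(matches('my favourite version of python is 3.5!', 'python 3.5'))       # True
print(matches('my favourite version of python is 3.4!', 'python 3.5'))       # False
print(matches('my favourite version of python is 3.4!', 'python 3.5', 50))   # True
```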

Answer 4

I think you need all()

title = 'my favourite version of python is 3.5!'

keyword = 'python 3.5'
print(all(n in title for n in keyword.split()))

keyword = 'hello 3.5'
print(all(n in title for n in keyword.split()))

keyword = 'hello world'
print(all(n in title for n in keyword.split()))

keyword = 'python 2.0'
print(all(n in title for n in keyword.split()))

Result:

True
False
False
False
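
One thing to keep in mind: `n in title` tests against the whole title string, so partial words such as 'py' also count as matches. If whole-word matching is wanted, a possible sketch is to tokenise the title first. The function name all_words_present and the token pattern (letters, digits, underscores and dots, lower-cased) are assumptions made for this example.

```python
import re

def all_words_present(title, keyword):
    """Return True if every keyword word appears in title as a whole word."""
    # Tokens are runs of word characters and dots; stray dots at the ends
    # (such as a sentence-ending full stop) are stripped afterwards.
    tokens = {t.strip('.') for t in re.findall(r'[\w.]+', title.lower())}
    return all(word in tokens for word in keyword.lower().split())

title = 'my favourite version of python is 3.5!'
print(all(n in title for n in 'py 3'.split()))  # True: substrings match
print(all_words_present(title, 'py 3'))         # False: no such whole words
print(all_words_present(title, 'Python 3.5'))   # True
```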

Answer 5

A short one-liner using the built-in all() and str.split() functions:

keyword = 'python 3.5'
title = 'my favourite version of python is 3.5!'

print(all(i in title for i in keyword.split()))

Output:

True

Answer 6

You don't want to compare lists (it's slow); you should compare sets. As a bonus, issubset is already defined:

title = 'python 3.5 is a programming language'

def contains_all_keywords(sentence, keywords):
  keywords = set(keywords.split())
  return(keywords.issubset(set(sentence.split())))

print(contains_all_keywords(title, 'python 3.5'))
# True
print(contains_all_keywords(title, '3.5 python'))
# True
print(contains_all_keywords(title, 'python 2.7'))
# False
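
When many titles are checked against the same keywords, the keyword set only needs to be built once. The following is a small sketch of that usage; filter_titles and the sample titles are invented for illustration, and words are compared exactly as split on whitespace, so a token like '3.5!' would not match '3.5'.

```python
def filter_titles(titles, keywords):
    """Return the titles that contain every whitespace-separated keyword."""
    wanted = set(keywords.split())  # built once, reused for every title
    # issubset accepts any iterable, so the split title can be passed directly.
    return [t for t in titles if wanted.issubset(t.split())]

titles = [
    'python 3.5 is a programming language',
    'version 3.5 of python is my favourite',
    'python 2.7 is older',
]
print(filter_titles(titles, '3.5 python'))
```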

Comparing a list with keywords
