I am trying to match a pattern between two strings. For example, I have
pattern_search = ['education four year']
string1 = 'It is mandatory to have at least of four years of professional education'
string2 = 'need to have education four years with professional degree'
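I am looking for a way to report a match when I compare pattern_search against string1 and string2.
The match / search / findall functions of the regex library do not help me here. string1 contains all the required words, but not in the same order; string2 has an extra word and uses a plural form.
Currently I split the preprocessed pattern_search and string1 & string2 into words and compare them word by word. Is there a way to find a match between the sentences?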
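Answer 0 (score: 2)
You should take a good look at the difflib library, in particular the get_close_matches function, which returns the words that are "close enough" to a given word, so it can handle words that do not match exactly. Be sure to adjust the threshold (cutoff=) to suit your data.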
from difflib import get_close_matches
from re import sub
pattern_search = 'education four year'
string1 = 'It is mandatory to have at least of four years of professional education'
string2 = 'need to have education four years with professional degree'
string3 = 'We have four years of military experience'
def match(string, pattern):
    pattern = pattern.lower().split()
    words = set(sub(r"[^a-z0-9 ]", "", string.lower()).split()) # Sanitize input
    return all(get_close_matches(word, words, cutoff=0.8) for word in pattern)
print(match(string1, pattern_search)) # True
print(match(string2, pattern_search)) # True
print(match(string3, pattern_search)) # False
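To see why 'year' is accepted as a match for 'years', and how cutoff changes the outcome, here is a minimal sketch. The word list and the stricter cutoff of 0.95 are chosen only for illustration; SequenceMatcher is the difflib class whose ratio get_close_matches compares against cutoff.

```python
from difflib import SequenceMatcher, get_close_matches

# Illustrative word list taken from string1
words = ["four", "years", "professional", "education"]

# Similarity ratio between the pattern word and the candidate word
ratio = SequenceMatcher(None, "year", "years").ratio()
print(round(ratio, 3))  # 0.889

# With cutoff=0.8 the plural form is close enough
print(get_close_matches("year", words, cutoff=0.8))   # ['years']

# With a stricter cutoff the same word is rejected
print(get_close_matches("year", words, cutoff=0.95))  # []
```

A lower cutoff tolerates more variation but also raises the chance of unrelated words being treated as matches, so it is worth testing on real data.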
If you want pattern_search to be a list of patterns, you should probably loop over it and call the match function for each pattern.
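One possible way to handle a list of patterns is sketched below. It repeats the match function from above so that it runs on its own; match_any and the second pattern, 'military experience', are hypothetical additions for illustration and are not part of the original answer.

```python
from difflib import get_close_matches
from re import sub

def match(string, pattern):
    # Same logic as the answer's match function
    pattern = pattern.lower().split()
    words = set(sub(r"[^a-z0-9 ]", "", string.lower()).split())
    return all(get_close_matches(word, words, cutoff=0.8) for word in pattern)

def match_any(string, patterns):
    # Hypothetical helper: True if at least one pattern matches the string
    return any(match(string, p) for p in patterns)

patterns = ['education four year', 'military experience']
string3 = 'We have four years of military experience'

print(match_any(string3, patterns))               # True
print({p: match(string3, p) for p in patterns})   # per-pattern results
```

Use all instead of any in match_any if every pattern must match.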
Answer 1 (score: -1)
Try this:
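def have_same_words(string1, string2):
    # True only when both strings contain exactly the same words, in any order
    return sorted(string1.split()) == sorted(string2.split())
# Prints False for the question's data: string1 has many more words than the pattern
print(have_same_words("It is mandatory to have at least of four years of professional education", "education four year"))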
If this helps, please accept the answer.
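Comparing sorted word lists only succeeds when both strings contain exactly the same words. A subset check is a looser variant of the same idea; the sketch below uses a hypothetical helper, contains_all_words, which is not part of this answer. It still requires exact word matches, so the singular 'year' does not match 'years'.

```python
def contains_all_words(text, pattern):
    # Hypothetical helper: True when every word of pattern occurs in text as an exact word
    return set(pattern.lower().split()) <= set(text.lower().split())

string1 = 'It is mandatory to have at least of four years of professional education'

print(contains_all_words(string1, 'education four years'))  # True
print(contains_all_words(string1, 'education four year'))   # False, 'year' is not 'years'
```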
Answer 2 (score: -2)
To check whether one string contains another in Python, you can try a few things:
Use the in operator
>>> pattern_search in string
True
Or use find
>>> string1.find(pattern_search)
[returns the index of the first occurrence (0 or greater) if found, or -1 if not found]
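Applied to the strings from the question, both checks behave as follows. This sketch assumes pattern_search is a plain string, as in Answer 0, rather than the one-element list from the question. Substring checks require the words to appear contiguously and in the same order, so they succeed for string2 but not for string1.

```python
pattern_search = 'education four year'
string1 = 'It is mandatory to have at least of four years of professional education'
string2 = 'need to have education four years with professional degree'

# "in" returns a bool
print(pattern_search in string2)     # True
print(pattern_search in string1)     # False

# find returns the start index of the first occurrence, or -1
print(string2.find(pattern_search))  # 13
print(string1.find(pattern_search))  # -1
```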