I need to check whether a string contains any element of a list. I am currently using this approach:
engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"
print("the english sentence is: " + engSentence)
engWords2 = []
isEnglish = 0
for w in engWords:
    if w in engSentence:
        isEnglish = 1
        engWords2.append(w)
if isEnglish == 1:
    print("The sentence is english and contains the words: ")
    print(engWords2)
The problem is that it gives this output:
the english sentence is: the dogs fur is black and white
The sentence is english and contains the words:
['the', 'a', 'and', 'it']
>>>
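As you can see, 'a' and 'it' should not be there. How can I search so that it only lists whole words, not parts of words? I am open to any ideas using plain Python code or regular expressions (although I am very new to both Python and regex, so please nothing too complicated). Thanks.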
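Answer 0 (score: 5)
It found those two words because they are substrings of "black" and "white" respectively. When you apply "in" to a string, it only looks for a substring of characters, not for whole words.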
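Try:
engSentenceWords = engSentence.split()
and later,
if w in engSentenceWords:
This splits the original sentence into a list of individual words and then checks each search word against whole-word values.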
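To make the difference concrete, here is a minimal sketch of the question's loop rewritten around the split list. The names `eng_words`, `sentence`, `sentence_words`, `substring_hit`, `whole_word_hit`, and `found` are illustrative and not part of the original answer.

```python
eng_words = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
sentence = "the dogs fur is black and white"

# `in` on a string is a substring test, so "a" is found inside "black".
substring_hit = "a" in sentence

# `in` on a list compares whole elements, so "a" must be a word of its own.
sentence_words = sentence.split()
whole_word_hit = "a" in sentence_words

# Keep only the search words that occur as complete words in the sentence.
found = [w for w in eng_words if w in sentence_words]
print(found)  # ['the', 'and']
```

Note that split() only breaks on whitespace, so a word with attached punctuation such as "white." would not match "white" with this approach.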
Answer 1 (score: 0)
words = set(engSentence.split()).intersection(set(engWords))
if words:
    print("The sentence is english and contains the words: ")
    print(words)
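This splits engSentence into a list of tokens, converts it to a set, converts engWords to a set, and finds the intersection (the common overlap). It then checks whether the result is non-empty and, if so, prints the words that were found.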
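The same set idea can be extended a little to cope with capital letters and punctuation. The following is only a sketch under that assumption; the helper name `find_known_words` and the sample sentence are made up for illustration.

```python
import string

def find_known_words(sentence, known_words):
    """Return the set of known words that appear as whole words in sentence."""
    # Lowercase each token and strip punctuation from its ends,
    # so that "The" and "white." are treated as "the" and "white".
    tokens = {token.strip(string.punctuation).lower() for token in sentence.split()}
    return tokens & set(known_words)

eng_words = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
result = find_known_words("The dog's fur is black, and white.", eng_words)
print(sorted(result))  # ['and', 'the']
```

Sets are unordered, so the result is sorted here to get a stable printout; if the order of the original word list matters, a list comprehension over engWords may be a better fit.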
Answer 2 (score: 0)
Even simpler, add spaces to your sentence and to your search words:
engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"
print("the english sentence is: " + engSentence)
engWords2 = []
isEnglish = 0
# Pad both ends so the first and last words are also surrounded by spaces.
engSentence = " " + engSentence + " "
for w in engWords:
    # A space on each side of w stops it from matching the end of a longer word.
    if " %s " % w in engSentence:
        isEnglish = 1
        engWords2.append(w)
if isEnglish == 1:
    print("The sentence is english and contains the words: ")
    print(engWords2)
The output is:
the english sentence is: the dogs fur is black and white
The sentence is english and contains the words:
['the', 'and']
Answer 3 (score: 0)
You may want to use regular expression matching. Try something like the following:
import re
match_list = ['foo', 'bar', 'eggs', 'lamp', 'owls']
match_str = 'owls are not what they seem'
# format() receives a single argument, so the placeholder must be {0}, not {1}.
match_regex = re.compile('^.*({0}).*$'.format('|'.join(match_list)))
if match_regex.match(match_str):
    print('We have a match.')
For details, see the re documentation on python.org.
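The pattern above still matches a search word anywhere in the string, including inside a longer word. For the whole-word requirement in the question, one possible variation is to wrap the alternatives in \b word-boundary anchors. This is a sketch rather than the answerer's code, and the names `eng_words`, `sentence`, `pattern`, and `found` are illustrative.

```python
import re

eng_words = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
sentence = "the dogs fur is black and white"

# re.escape protects against regex metacharacters in the search words;
# \b on both sides only allows matches that start and end at word boundaries.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(w) for w in eng_words) + r")\b")

# findall returns every whole-word match in the order it appears in the sentence.
found = pattern.findall(sentence)
print(found)  # ['the', 'and']
```

Unlike the split-based answers, this also handles words followed by punctuation, because a comma or full stop counts as a word boundary.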