Get sentences from a list of sentences that exactly match a word: Python

Time: 2018-09-28 11:07:00

Tags: python python-3.x

Suppose I have a list of sentences:

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]

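I want to return every sentence that contains the complete word "chocolate", i.e. ["Chocolate is loved by all.", "chocolate is made from cocoa."]. A sentence that does not contain the word "chocolate" should not be returned. A sentence containing only the word "chocolateyyy" should not be returned either.

How can I do this in Python?

4 answers:

Answer 0 (score: 5)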

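This ensures that the search word is matched as a complete word rather than as part of a longer word such as 'chocolateyyy'. It is also case-insensitive, so 'Chocolate' and 'chocolate' are treated as equal even though the first letter differs in case.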

sent = ["Chocolate is loved by all.", "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.","chocolate is made from cocoa.", "Chocolateyyy"]

search = "chocolate"

print([i for i in sent if search in i.lower().split()])

For clarity, here is an expanded version with explanations:

result = []
for i in sent: # Go through each string in sent
    lower = i.lower() # Make the string all lowercase
    split = lower.split(' ') # split the string on ' ', or spaces
                     # The default split() splits on whitespace anyway though
    if search in split: # if chocolate is an entire element in the split array
        result.append(i) # add it to results
print(result)
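One caveat worth checking, sketched here under assumptions: split() leaves punctuation attached to each token, so a sentence ending in "chocolate." yields the token 'chocolate.' and would be skipped by the comparison above. A possible workaround is to strip punctuation from each token first. The helper name contains_word and the second sentence below are hypothetical, added only for illustration.

```python
import string

def contains_word(sentence, word):
    """Return True if `word` is a whole whitespace-separated token of `sentence`,
    ignoring case and any leading/trailing punctuation on each token."""
    tokens = (t.strip(string.punctuation) for t in sentence.lower().split())
    return word.lower() in tokens

sent = ["Chocolate is loved by all.",
        "Everyone loves chocolate.",   # hypothetical sentence: the word is followed by a period
        "Chocolateyyy"]

search = "chocolate"

# Plain split(): the token 'chocolate.' does not equal 'chocolate'
plain = [s for s in sent if search in s.lower().split()]

# Punctuation stripped from each token before comparing
stripped = [s for s in sent if contains_word(s, search)]

print(plain)
print(stripped)
```

With this input, plain keeps only the first sentence, while stripped keeps the first two; 'Chocolateyyy' is rejected by both.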

I hope this helps :)

Answer 1 (score: 3)

You need:

filtered_sent = [i for i in sent if 'chocolate' in i.lower()]

Note that `in` applied to a string is a substring test, so this also returns sentences containing longer words such as 'chocolateyyy'.
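A quick check of this one-liner against the question's 'chocolateyyy' requirement, as a minimal sketch. The third sentence is a hypothetical test input that is not part of the original list.

```python
sent = ["Chocolate is loved by all.",
        "chocolate is made from cocoa.",
        "I like chocolateyyy bars."]   # hypothetical sentence with a longer word

# Substring test on the whole lowercased sentence
substring_hits = [s for s in sent if 'chocolate' in s.lower()]

# Membership test on the list of whitespace-separated tokens
token_hits = [s for s in sent if 'chocolate' in s.lower().split()]

print(substring_hits)
print(token_hits)
```

The substring test keeps all three sentences, while the token test keeps only the first two, which is what the question asks for.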

Answer 2 (score: 2)

As in this question, you want some of the methods in the re library. In particular:

\b matches the empty string, but only at the beginning or end of a word.

So you can search for "chocolate" with re.search(r'\bchocolate\b', your_sentence, re.IGNORECASE).

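The rest of the solution is just iterating over the list of sentences and returning the sublist of sentences that match the target string.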
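The iteration step could be sketched as follows. The function name sentences_with_word is hypothetical, and re.escape is an added precaution (not part of the answer) in case the target word contains regex metacharacters.

```python
import re

def sentences_with_word(sentences, word):
    """Return the sentences that contain `word` as a whole word, ignoring case."""
    # \b on both sides requires a word boundary before and after the target
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

sent = ["Chocolate is loved by all.",
        "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa.",
        "Chocolateyyy"]

matches = sentences_with_word(sent, "chocolate")
print(matches)
```

Because \b treats punctuation as a boundary, this approach also matches when the word is directly followed by a period or comma.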

Answer 3 (score: 1)

You can use the regular expression library in Python:

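import re

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]
match_string = "chocolate"
# Build the pattern from match_string instead of hard-coding the word;
# re.escape guards against regex metacharacters in the search term
pattern = r"\b" + re.escape(match_string) + r"\b"
# re.search returns None when nothing matches, so it works directly as a filter
matched_sent = [s for s in sent if re.search(pattern, s, re.IGNORECASE)]
print(matched_sent)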