Filter a list of text strings by a list of strings

Time: 2017-03-28 19:51:29

Tags: python string list if-statement

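I have a list of strings called happenings, where each string contains several words. A second list, called mostcom20, contains strings that each hold exactly one word.

My goal is to filter out every string in the first list that does not contain at least one word from the second list.

I have: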

happenings=["i have a dog","i want a dog","i like cats","i m hungry"]
mostcom20=["dog","cat"]

As a result I would like a list like this:

newlist=["i have a dog","i want a dog","i like cats"]

This is my code:

newlist=[]
for s in happenings:
    for n in s.split():
        if n in mostcom20:
            newlist.append(s)
newlist

It does not give an error message. It returns an empty list. Does anyone know why? Thanks for your help!

4 answers:

Answer 0: (score: 0)

Try this:

newlist=[]
for s in happenings:
    for n in s.split():
        if n in mostcom20:
            newlist.append(s)
print(newlist)

This is the output:

['i have a dog', 'i want a dog']

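If you want "i like cats" in the list, you have to add cats to the list of common words, because the loop compares whole words and "cats" is not equal to "cat".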
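As an alternative to adding cats to the word list, a substring test would also pick up "i like cats". This is only a minimal sketch, assuming substring matches are acceptable (it would equally match a word such as "category"):

```python
happenings = ["i have a dog", "i want a dog", "i like cats", "i m hungry"]
mostcom20 = ["dog", "cat"]

# Keep a sentence if any keyword occurs anywhere inside it.
# This is a substring match, not a whole-word match, so "cat" also matches "cats".
newlist = [s for s in happenings if any(word in s for word in mostcom20)]
print(newlist)
# ['i have a dog', 'i want a dog', 'i like cats']
```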

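Answer 1: (score: 0)

This is your program, but written as a list comprehension. I also renamed the variables so that you do not use the same variable in both loops, and changed cat to cats so that you get those too.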

newlist = [happening for happening in happenings for word in happening.split() if word in mostcom20]
print(newlist)
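Note that a comprehension with two for clauses adds a sentence once for every matching word, so a sentence containing two keywords would appear twice. A sketch of a variant that adds each sentence at most once, using any(); the sample data below is illustrative and includes "cats" in the word list as this answer describes:

```python
happenings = ["i have a dog and a cat", "i want a dog", "i like cats", "i m hungry"]
mostcom20 = ["dog", "cat", "cats"]

# any() stops at the first matching word, so each sentence is kept at most once
newlist = [
    happening
    for happening in happenings
    if any(word in mostcom20 for word in happening.split())
]
print(newlist)
# ['i have a dog and a cat', 'i want a dog', 'i like cats']
```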

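Answer 2: (score: 0)

Your problem seems to be that you are not printing. If you want output when running a script, e.g. python my_script.py, you have to call print explicitly. In an interactive interpreter session, the result of each expression is printed automatically (because it is a read-evaluate-print loop, a REPL).

However, there is also a logic error in your loop. Once you find a word in the sentence, you need to break out of the inner loop. Otherwise, if several words from your mostcom20 list occur in the sentence, or a single word occurs more than once, the sentence will be added multiple times. Consider: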

In [30]: happenings=["i have a dog and a cat","i want a dog","i like cats","i m hungry"]
    ...:
    ...: mostcom20=["dog","cat"]
    ...:

In [31]: newlist=[]
    ...: for s in happenings:
    ...:     for n in s.split():
    ...:         if n in mostcom20:
    ...:             newlist.append(s)
    ...:

In [32]: newlist
Out[32]: ['i have a dog and a cat', 'i have a dog and a cat', 'i want a dog']

Remember to break:

In [33]: newlist=[]
    ...: for s in happenings:
    ...:     for n in s.split():
    ...:         if n in mostcom20:
    ...:             newlist.append(s)
    ...:             break
    ...:

In [34]: newlist
Out[34]: ['i have a dog and a cat', 'i want a dog']

You can also use any with a generator expression:

In [35]: newlist=[]
    ...: for s in happenings:
    ...:     if any(w in mostcom20 for w in s.split()):
    ...:         newlist.append(s)
    ...:

In [36]: newlist
Out[36]: ['i have a dog and a cat', 'i want a dog']
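If mostcom20 grows large, converting it to a set makes each membership test cheaper on average. A sketch under the same data as above; the name keywords is introduced here purely for illustration:

```python
happenings = ["i have a dog and a cat", "i want a dog", "i like cats", "i m hungry"]
mostcom20 = ["dog", "cat"]

keywords = set(mostcom20)  # hypothetical helper name, not from the question

# isdisjoint() returns False when the sentence shares at least one word with keywords
newlist = [s for s in happenings if not keywords.isdisjoint(s.split())]
print(newlist)
# ['i have a dog and a cat', 'i want a dog']
```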

Answer 3: (score: 0)

You may want to use a short regular expression combined with filter():

import re

happenings = ["i have a dog cat dog","i want a dog","i like cats","i m hungry"]
mostcom20 = ["dog","cat"]

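# Build one alternation pattern ("dog|cat"); re.escape guards against regex metacharacters
rx = re.compile('|'.join(map(re.escape, mostcom20)))
# Keep only the strings in which the pattern matches somewhere (substring match)
filtered = list(filter(rx.search, happenings))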
print(filtered)
# ['i have a dog cat dog', 'i want a dog', 'i like cats']
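The pattern above matches substrings, which is why "i like cats" is kept. If whole-word matching is wanted instead, a sketch using \b word boundaries together with re.escape might look like this (an assumption about the desired behavior, not part of the original answer):

```python
import re

happenings = ["i have a dog cat dog", "i want a dog", "i like cats", "i m hungry"]
mostcom20 = ["dog", "cat"]

# \b(?:dog|cat)\b matches only when a keyword stands alone as a word
rx = re.compile(r"\b(?:{})\b".format("|".join(map(re.escape, mostcom20))))
filtered = [s for s in happenings if rx.search(s)]
print(filtered)
# ['i have a dog cat dog', 'i want a dog']
```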