Question

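I wrote some code in Python, but I'm not sure why it never finds a match when it should. Could you check what is wrong with my code?

My assumption is that my select query creates a list of tuples, "('bad',), ('worse',)", but I don't know how to remove the ",".

I am using Visual Studio and Python 3.5: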

if len(list(set(words) & set(badlst))) > 0:
    index += 1

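words: (data detail) ['term', 'saving', 'insurance', 'cost', '?', 'upgrade', 'plan', 'always', 'preferred', 'client', ',', 'bad', 'agent', 'explained', ...]

badlst: (data detail) [('bad',), ('worse',), ('incorrigible',), ('poor',), ('dreadful',), ('atrocious',), ('cheap',), ('unacceptable',), ('sad',), ('lousy',), ('crummy',), ('awful',), ('rough',), ('synthetic',), ...]

I generate badlst as follows: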

def readInsCmp(column, tablename, whereCondition):
    command = "select "+ column+" from "+ tablename +" where "+ whereCondition 
    c.execute(command)
    namelist = c.fetchall()
    return namelist

badlst = readInsCmp("distinct(word)","wordVals","value=-1")

The words variable is built by parsing some input from an Excel file:

sentenceArr = sent_tokenize(content)
for sentence in sentenceArr:
    words = [word for word in nltk.word_tokenize(sentence) if word not in stopwords.words('english')]

Answer 1

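If I have interpreted the question correctly, you have two lists:

>>> words = ['term', 'saving', 'insurance', 'cost', '?', 'upgrade', 'plan', 'always', 'preferred', 'client', ',', 'bad', 'agent', 'explained']
>>> badlst = [('bad',), ('worse',), ('incorrigible',), ('poor',), ('dreadful',), ('atrocious',), ('cheap',), ('unacceptable',), ('sad',), ('lousy',), ('crummy',), ('awful',), ('rough',), ('synthetic',)]

And you want to find out whether the two lists have any words in common. If so, the first thing to do is convert badlst from a list of tuples to a list of words:

>>> badlst2 = [x[0] for x in badlst]

With that done, it is easy to find the words in common:

>>> set(words).intersection(badlst2)
{'bad'}

We can use that intersection in an if statement as follows:

>>> index = 0
>>> if set(words).intersection(badlst2):
...     index += 1
...
>>> index
1

The if statement works because, by Python convention, the boolean value of a set is False if it is empty and True otherwise.

As an alternative to the if statement, we can add the boolean value of the set to index:

>>> index = 0
>>> index += bool(set(words).intersection(badlst2))
>>> index
1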
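The flattening step could also be done once, at the point where the rows are read from the database. The following is only a sketch under stated assumptions: the question does not say which database driver is used, so an in-memory SQLite database via the standard sqlite3 module stands in for it, and read_bad_words, the sample rows, and the sample sentences are hypothetical names and data made up for illustration. Only the table name wordVals and its columns word and value come from the question.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the asker's real one.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE wordVals (word TEXT, value INTEGER)")
c.executemany(
    "INSERT INTO wordVals VALUES (?, ?)",
    [("bad", -1), ("worse", -1), ("poor", -1), ("good", 1)],  # sample rows
)


def read_bad_words(cursor):
    """Hypothetical helper: return the negative words as a set of strings."""
    cursor.execute("SELECT DISTINCT word FROM wordVals WHERE value = -1")
    # fetchall() returns one tuple per row, e.g. ('bad',); keep only column 0.
    return {row[0] for row in cursor.fetchall()}


bad_words = read_bad_words(c)

# Sample tokenized sentences (illustrative, not the asker's real data).
sentences = [
    ["client", "bad", "agent", "explained"],
    ["upgrade", "plan", "always", "preferred"],
]

# Count the sentences that contain at least one bad word.
index = sum(1 for words in sentences if bad_words.intersection(words))
print(index)  # 1
```

Building the set once and reusing it for every sentence avoids converting badlst again on each pass through the loop.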

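Aside

In Python, & is bitwise-and for integers, but for sets it is overloaded to mean intersection, so set(words) & set(badlst) is a valid intersection. The comparison still never matches because badlst holds one-element tuples such as ('bad',), and a tuple is never equal to the string 'bad'.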
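To see how & behaves on different types, here is a short sketch. The shortened lists and the names raw_overlap and flat_overlap are made up for illustration; the point is that on sets the operator computes an intersection, and the empty result comes from comparing strings with tuples.

```python
words = ['client', ',', 'bad', 'agent']
badlst = [('bad',), ('worse',)]

# On integers, & is bitwise AND: 0b110 & 0b011 == 0b010.
print(6 & 3)  # 2

# On sets, & is intersection, the same as calling .intersection().
# Nothing matches here because 'bad' != ('bad',).
raw_overlap = set(words) & set(badlst)
print(raw_overlap)  # set()

# After unpacking each one-element tuple, the overlap appears.
flat_overlap = set(words) & {row[0] for row in badlst}
print(flat_overlap)  # {'bad'}
```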

Comparing lists of strings in Python
