Question

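I have a dictionary whose keys are IDs and whose values are strings. I also have two separate lists of keywords. I need to filter out all the keys in the dictionary whose value contains at least one keyword from list 1 and at least one keyword from list 2. I'm confused about how to approach this. Please help.

This is what I have so far: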

import csv

# Load all data from al.csv into a dictionary where the key is column 1 (the tweet ID) and the value is the whole row, including the tweet ID.
reader = csv.reader(open('al.csv', 'r'))
overallDict = {}
for rows in reader:
    k = rows[0]
    v = rows[0] + ',' + rows[1] + ',' + rows[2] + ',' + rows[3] + ',' + rows[4] + ',' + rows[5] + ',' + rows[6] + ',' + rows[7] + ',' + rows[8] + ',' + rows[9]
    overallDict[k] = v

# The following lines load the keywords list
with open('slangNames.txt') as f:
    slangs = f.readlines()

# To strip new-line and prepare data into finished keywords list
strippedSlangs = []
for elements in slangs:
    elements = elements.strip()
    strippedSlangs.append(elements)

# The following lines load the risks list
with open('riskNames.txt') as f:
    risks = f.readlines()

# To strip new-line and prepare data into finished risks list
strippedRisks = []
for things in risks:
    things = things.strip()
    strippedRisks.append(things)

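Say List1 = [opium, christmas, weed] and List2 = [drug, harmful, bad], and dictionary = {213432: 'opium is harmful to health', 321234: 'christmas is good', 543678: 'weed is bad'}

The desired output is a list: Output: [213432, 543678], because those two tweets contain at least one value from list1 and one value from list2.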

Answer 1

First, I had to rewrite your code to make it easier to figure out what it is doing:

import csv

strippedRisks = set()
strippedSlangs = set()
overallDict = {}

with open('al.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        overallDict[row[0]] = ",".join(row[1:])

with open('slangNames.txt') as f:
    for line in f:
        elements = line.strip()
        strippedSlangs.add(elements)

with open('riskNames.txt') as f:
    for line in f:
        things = line.strip()
        strippedRisks.add(things)

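OK. You want to know which keys in the dictionary have a value in each list? In other words, you want to know which of the dictionary's values contain a disallowed word.

You can do that like this:

for key, value in overallDict.items():
    fields = set(value.split(','))
    if fields.intersection(strippedSlangs):
        # some words appear in strippedSlangs
        pass
    elif fields.intersection(strippedRisks):
        # some words appear in strippedRisks
        pass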
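To make the `if` tests above easier to follow: `set.intersection` returns a new set, and an empty set is falsy, so the result can be used directly as a condition. Here is a minimal sketch with made-up keyword sets and a made-up row value (none of these values come from the real files):

```python
# Hypothetical keyword sets and CSV row value, for illustration only.
slangs = {"opium", "weed"}
risks = {"harmful", "bad"}
value = "opium,harmful,2019"

fields = set(value.split(","))

slang_hits = fields.intersection(slangs)         # non-empty set, so truthy
risk_hits = fields.intersection(risks)           # non-empty set, so truthy
other_hits = fields.intersection({"christmas"})  # empty set, so falsy

print(slang_hits, risk_hits, bool(other_hits))
```

Combining two such tests with `and` is true only when both intersections are non-empty.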

But now that I've seen what you're trying to do, I would just use sets from the start and build the disallowed words first:

import csv

strippedRisks = set()
strippedSlangs = set()
overallDict = {}

with open('slangNames.txt') as f:
    for line in f:
        elements = line.strip()
        strippedSlangs.add(elements)

with open('riskNames.txt') as f:
    for line in f:
        things = line.strip()
        strippedRisks.add(things)

with open('al.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        values = set(row[1:])
        if strippedRisks.intersection(values) and strippedSlangs.intersection(values):
            # Words in both bad-word lists. Do we skip these or save them?
            pass
        else:
            overallDict[row[0]] = values

I believe that is what you are trying to accomplish, but I'm not entirely sure.
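One caveat: the set intersection compares whole CSV fields, so a field such as 'opium is harmful to health' would not match the keyword 'opium'. If the keywords are meant to be found inside the tweet text, a substring test may be closer to what the question describes. The following is only a sketch under that assumption, using the sample data from the question; the helper name `keys_matching_both` is made up for illustration, and the comparison is case-insensitive by choice.

```python
def keys_matching_both(texts, first_keywords, second_keywords):
    """Return the keys whose text contains at least one keyword from each list."""
    matches = []
    for key, text in texts.items():
        lowered = text.lower()
        # Substring test: "weed" would also match inside "weeds".
        has_first = any(word.lower() in lowered for word in first_keywords)
        has_second = any(word.lower() in lowered for word in second_keywords)
        if has_first and has_second:
            matches.append(key)
    return matches


# Sample data taken from the question.
list1 = ["opium", "christmas", "weed"]
list2 = ["drug", "harmful", "bad"]
tweets = {
    213432: "opium is harmful to health",
    321234: "christmas is good",
    543678: "weed is bad",
}

result = keys_matching_both(tweets, list1, list2)
print(result)  # [213432, 543678]
```

Note that an empty keyword (for example from a blank line in a keyword file) is a substring of every text, so blank entries should be dropped before calling the helper.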
