Question

I have the following dictionary and list:

correction =  {u'drug.ind': u'Necrosis', "date": "exp"}
drugs =  [[u'drug.aus', u'Necrosis'], [u'drug.nz', u'Necrosis'], [u'drug.uk', u'Necrosis'], [u'drug.ind', u'Necrosis'], [u'cheapest', u'drug.ind'], [u'date', u'']]

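Basically, I go through the values of the correction dictionary, and whenever a value matches the second element of a sublist in the drugs list, I remove that sublist.

This is what I'm doing: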

if correction and drugs:
    for i,x in correction.items():
        for j,k in enumerate(drugs):
            if len(i.split(".")) > 1:  # need to do the operation only for drugs which is always given in this format
                if x == k[1]:
                    drugs.pop(j)

Ideally, the drugs list should now look like this:

drugs = [['cheapest', 'drug.ind'], ['date', '']]

But for some reason it looks like this:

[['drug.nz', 'Necrosis'], ['drug.ind', 'Necrosis'], ['cheapest', 'drug.ind'], ['date', '']]

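I expected every entry whose value is Necrosis to be removed. Instead, only every other one is removed.

Why am I seeing this behavior? What am I doing wrong?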

Answer 1

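You are iterating over a list (drugs) and, inside the loop, removing elements from that same list.

When a for loop runs over a list, Python keeps incrementing an internal "index" variable that tracks the current position in the list.

Suppose that, inside the loop, you delete the item at index 3. The rest of the list (the items not yet iterated over) now shifts one position to the left: the item that used to be at index 4 now sits at index 3, the slot vacated by the deleted item. To process this shifted item on the next iteration, the internal "index" variable would have to take the value 3 again. But Python increments the index from 3 to 4 on the next iteration, just as it normally does from one iteration to the next. As a result, the item immediately following the deleted one is never examined by the body of the for loop (because the index is 4 rather than 3), so it is not deleted even if it meets the deletion criterion.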
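The skip described above can be made visible with a small trace. This is only an illustrative sketch: the list items and the visited log are made-up names, not part of the question's data.

```python
# Hypothetical data, chosen only to make the skipped position visible.
items = ['a', 'b', 'c', 'd', 'e', 'f']
visited = []

for index, value in enumerate(items):
    visited.append((index, value))  # record what the loop body actually sees
    if value == 'd':
        items.pop(index)  # 'e' slides into index 3, but the loop moves on to 4

print(visited)
print(items)
```

The log never contains 'e': it moved into index 3 right after that index had been processed, so the loop went straight on to 'f' at index 4.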

Some solutions

Several approaches to "safe" deletion are suggested in this thread.

I picked my favorite of those and implemented it in the code below:

correction =  {u'drug.ind': u'Necrosis', "date": "exp"}
drugs =  [[u'drug.aus', u'Necrosis'], [u'drug.nz', u'Necrosis'], [u'drug.uk', u'Necrosis'],
          [u'drug.ind', u'Necrosis'], [u'cheapest', u'drug.ind'], [u'date', u'']]

if correction and drugs:
    for i,x in correction.items():
        for j in range(len(drugs)-1, -1, -1):
            if len(i.split(".")) > 1:  # need to do the operation only for drugs which is always given in this format
                if x == drugs[j][1]:
                    drugs.pop(j)
print(drugs)

This outputs:

[['cheapest', 'drug.ind'], ['date', '']]

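The key aspect of this solution is the line for j in range(len(drugs)-1, -1, -1). We now iterate over indices rather than over the items at those indices, and we iterate over the indices in reverse order (which effectively means we indirectly process the list in reverse order). Popping the item at index j only shifts the items at indices greater than j, and those have already been visited, so no item is skipped.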
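The same reverse-index idea can be wrapped in a small function. This is a minimal sketch, not part of the original answer: remove_matching_in_place is a hypothetical helper name, and it covers only the "second element equals a given value" case from the question.

```python
def remove_matching_in_place(rows, bad_value):
    """Hypothetical helper: delete every row whose second element equals bad_value."""
    for j in range(len(rows) - 1, -1, -1):
        if rows[j][1] == bad_value:
            rows.pop(j)  # only indices greater than j shift, and those were already visited
    return rows


drugs = [['drug.aus', 'Necrosis'], ['drug.nz', 'Necrosis'], ['drug.uk', 'Necrosis'],
         ['drug.ind', 'Necrosis'], ['cheapest', 'drug.ind'], ['date', '']]
remove_matching_in_place(drugs, 'Necrosis')
print(drugs)
```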

Answer 2

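Because when you pop an item from the list, the next item shifts to an index that is "behind" the iterator.

In the example below, print() effectively runs only for every other item in the list. Although we ostensibly iterate over the whole list and remove every member, only half of them end up removed.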

example = ['apple','banana','carrot','donut','edam','fromage','ghee','honey']

for index,food in enumerate(example):
    print(food)
    example.pop(index)

print(example)

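This is because the for loop (basically) increments an integer i on each pass and fetches example[i]. When you pop an element from example, the elements after it shift position, so what example[i] refers to changes.

This code demonstrates that fact: as you can see, after "popping" an element, the next element changes before our eyes.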

example = ['apple','banana','carrot','donut','edam','fromage','ghee','honey']

# range() is evaluated once, so stop as soon as i runs past the shrinking list
for i in range(len(example)):
    if i >= len(example):
        break
    print("The value of example[", i, "] is:", example[i])
    example.pop(i)
    if i < len(example):
        print("After popping, the value of example[", i, "] is:", example[i])

print(example)

Answer 3

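As others have mentioned, you should not modify a list (or any other iterable) while iterating over it. If you want to remove some elements, first build a list of the items to remove, and then remove them afterwards: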

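for i, x in correction.items():  # i and x come from the question's outer loop
    bad = []
    for k in drugs:
        if len(i.split(".")) > 1 and x == k[1]:
            bad.append(k)  # collect first...
    for item in bad:
        drugs.remove(item)  # ...and delete once the iteration is finished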

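As fountainhead mentioned, this solution may fail if drugs contains equal elements, some of which should be removed and others not, for example when the index itself is part of the condition. A more general solution could be: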

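import itertools

for i, x in correction.items():  # i and x come from the question's outer loop
    # compress() keeps the items whose selector is True, so build a "keep" mask
    keep = []
    for j, k in enumerate(drugs):
        if len(i.split(".")) > 1 and x == k[1]:
            keep.append(False)  # matches the correction value: drop it
        else:
            keep.append(True)
    drugs = list(itertools.compress(drugs, keep))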
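To illustrate the caveat about equal elements, here is a small sketch comparing the two approaches. The data and the should_drop condition are hypothetical, chosen so that the index is part of the condition.

```python
import itertools

# Hypothetical data: the same value appears at index 0 and at index 2.
values = ['x', 'y', 'x', 'y']


def should_drop(index, value):
    # Hypothetical condition that depends on the index as well as the value.
    return value == 'x' and index >= 2


# Collect-then-remove: list.remove() deletes the FIRST equal item, i.e. index 0.
by_value = list(values)
for item in [v for j, v in enumerate(values) if should_drop(j, v)]:
    by_value.remove(item)

# Boolean mask: each position is decided independently of its value's duplicates.
keep = [not should_drop(j, v) for j, v in enumerate(values)]
by_mask = list(itertools.compress(values, keep))

print(by_value)  # ['y', 'x', 'y']
print(by_mask)   # ['x', 'y', 'y']
```

Only the mask version drops the 'x' at index 2 and keeps the one at index 0, which is what the condition asks for.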

Answer 4

You can build a set from the values of the correction dictionary (for fast lookups) and use the filter() function to filter the list:

corr = set(correction.values())

list(filter(lambda x: x[1] not in corr, drugs))
# [['cheapest', 'drug.ind'], ['date', '']]
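A possible variant, sketched here as an assumption rather than as part of the original answer, uses a list comprehension and also keeps the question's rule that only keys in the "drug.xxx" format count. The name bad_values is made up for this example.

```python
correction = {'drug.ind': 'Necrosis', 'date': 'exp'}
drugs = [['drug.aus', 'Necrosis'], ['drug.nz', 'Necrosis'], ['drug.uk', 'Necrosis'],
         ['drug.ind', 'Necrosis'], ['cheapest', 'drug.ind'], ['date', '']]

# Only values whose key contains a dot are considered, as in the question's loop.
bad_values = {value for key, value in correction.items() if '.' in key}

# Slice assignment replaces the contents in place, so other references
# to the same list object see the filtered result too.
drugs[:] = [row for row in drugs if row[1] not in bad_values]
print(drugs)
```

Because a new list is built before anything is assigned back, no element is removed while the iteration is still running.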
