Question

I wrote some code in Python to remove the unique numbers from a list, so given the input:

[1,2,3,2,1]

it should return

[1,2,2,1]

but my program returns

[1,2,1]

My code is:

for i in data:
    if data.count(i) == 1:
        data.pop(i)

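I found that the error occurs at if data.count(i) == 1:. It says data.count(2) == 1 when there are clearly two occurrences of the number 2 in the list. I don't understand why this gives the wrong answer.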

Answer 1

If the list is long, you should put all the numbers into a Counter(iterable), which is a dictionary.

from collections import Counter
data = [1,2,3,2,1]
c = Counter(data)

cleaned = [x for x in data if c[x] > 1]

print(cleaned)

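This counts all occurrences in a single pass over the list (O(n)), and looking up how often a value occurs in the resulting dictionary is O(1). Together, this is much faster than using a list comprehension like this one: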

result = [x for x in data if data.count(x) > 1]

For a list of 100 values, it goes through your 100 values 100 times to count each of them, which is O(n^2): not good.

Output:

[1,2,2,1]
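For readers who have not used Counter before, here is a small sketch of what it holds for the example list. It is only an illustration of the answer above; the key 99 is an arbitrary value I picked to show a lookup of something that was never counted. Counter is a dict subclass, so c[x] is an ordinary dictionary lookup, and a missing key gives 0 instead of raising KeyError.

```python
from collections import Counter

data = [1, 2, 3, 2, 1]
c = Counter(data)            # built in one pass over data

print(c[1], c[2], c[3])      # 2 2 1
print(c[99])                 # 0: a value that never occurred counts as zero

# Keep only the values that occur more than once
cleaned = [x for x in data if c[x] > 1]
print(cleaned)               # [1, 2, 2, 1]
```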

Answer 2

Try appending to a new list instead of changing the old one:

res = []
data = [1,2,3,2,1]

for i in data:
    if data.count(i) > 1:
        res.append(i)

Changing the size of a list while iterating over it is bad practice, and pop does exactly that. This gives res = [1, 2, 2, 1].
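To see why resizing during iteration is a problem, here is a minimal sketch with a made-up list (not the one from the question). The for loop walks the list by position, so when an element is removed, the next one slides into the slot that was just visited and is never looked at.

```python
data = [1, 2, 3, 4]
visited = []

for x in data:
    visited.append(x)
    if x == 2:
        data.remove(x)   # shrinks the list while the loop is still running

print(visited)   # [1, 2, 4]: 3 moved into the slot just visited and was skipped
print(data)      # [1, 3, 4]
```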

Answer 3

This is a recurring problem. You have misunderstood list.pop(): it takes an index, not a particular element, so you are not removing what you expect.

What you want to do here is use enumerate,

data = [1,2,3,2,1]
# You could use dup_list = data[:] for Python 3.2 and below
dup_list = data.copy()
# Walk the indices backwards so that a pop never shifts an index still to be visited
for index, item in reversed(list(enumerate(dup_list))):
    if dup_list.count(item) == 1:
        data.pop(index)

That way you pop the item at the right index.

Edit

I edited this thanks to @wim's comment. I now iterate over a copy (dup_list) of the original list, so as not to iterate over and mutate the original list at the same time.

Also, I created the copy explicitly for the sake of explanation, but you could use a shorter version of the code

data = [1,2,3,2,1]
# data[:] makes a copy of data
for index, item in reversed(list(enumerate(data[:]))):
    if data.count(item) == 1:
        data.pop(index)

Note that I added a comment because this syntax can be confusing to some people.
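The difference between removing by index and removing by value is easy to check on the list from the question. This is only a sketch; the variable names are mine.

```python
by_index = [1, 2, 3, 2, 1]
by_index.pop(3)              # removes the element AT INDEX 3 (the second 2), not the value 3
print(by_index)              # [1, 2, 3, 1]

by_value = [1, 2, 3, 2, 1]
by_value.remove(3)           # removes the first element EQUAL TO 3
print(by_value)              # [1, 2, 2, 1]

lookup_first = [1, 2, 3, 2, 1]
removed = lookup_first.pop(lookup_first.index(3))   # find the index of 3, then pop it
print(removed, lookup_first)                        # 3 [1, 2, 2, 1]
```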

Answer 4

Solution using a list comprehension

I believe a more Pythonic answer is to use a list comprehension:

result = [x for x in data if data.count(x) > 1]

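Timing comparison of the solutions on the example list

I moved C.Nivis's and Patrick Artner's answers into functions to make it easier to run timeit on them.

To account for the time needed to call the function, I also wrapped the list comprehension in a function call.

Setup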

def remove_singletons(data):
    """Return a list with no singletons, using a for loop."""
    res = []
    for i in data:
        if data.count(i) > 1:
            res.append(i)
    return res

def remove_singletons_lc(data):
    """Return a list with no singletons, using a list comprehension."""
    return [x for x in data if data.count(x) > 1]

from collections import Counter

def remove_singletons_counter(data):
    """Return a list with no singletons, using a Counter."""
    c = Counter(data)
    return [x for x in data if c[x] > 1]

import numpy as np

def remove_singletons_numpy(data):
    """Return an array with no singletons, using numpy.unique."""
    a = np.array(data)
    _, ids, counts = np.unique(a, return_counts=True, return_inverse=True)
    return a[counts[ids] != 1]

l = [1,2,3,2,1]

Solution with a loop

%timeit remove_singletons(l)
>>> 1.42 µs ± 46.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Solution with a list comprehension

%timeit remove_singletons_lc(l)
>>> 1.2 µs ± 17.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Solution with `Counter`

%timeit remove_singletons_counter(l)
>>> 6.55 µs ± 143 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Solution with `numpy.unique`

%timeit remove_singletons_numpy(l)
>>> 53.8 µs ± 3.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

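Conclusion

The list comprehension seems to be slightly, but consistently, faster than the loop, and faster than Counter for a small list. For a small list, numpy is the slowest.

Timing comparison of the solutions on a large list

Suppose we have a large list of n random elements drawn from [0, n]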

import random
n = 10000
l = [random.randint(0, n) for i in range(n)]

Solution with a loop

%timeit remove_singletons(l)
>>> 1.5 s ± 64.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Solution with a list comprehension

%timeit remove_singletons_lc(l)
>>> 1.51 s ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Solution with `Counter`

%timeit remove_singletons_counter(l)
>>> 2.65 ms ± 228 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Solution with `numpy.unique`

%timeit remove_singletons_numpy(l)
>>> 1.75 ms ± 38.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

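Conclusion for the large list

For a large list, the undisputed winner is numpy.unique, closely followed by Counter.

Final conclusion

For a small list, the list comprehension seems to do the job, but for larger lists the numpy.unique approach works best.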
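The %timeit lines above are IPython magics and will not run in a plain Python script. For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The seed, the smaller n and the number of repetitions are my own choices to keep it quick, so the figures it prints will not match the ones quoted above.

```python
import random
import timeit
from collections import Counter

def remove_singletons_lc(data):
    """List comprehension with data.count: O(n^2) overall."""
    return [x for x in data if data.count(x) > 1]

def remove_singletons_counter(data):
    """Count once with a Counter, then filter: O(n) overall."""
    c = Counter(data)
    return [x for x in data if c[x] > 1]

random.seed(0)        # arbitrary seed, only so that runs are repeatable
n = 1000              # smaller than the 10000 used above, to keep the run short
l = [random.randint(0, n) for _ in range(n)]

# Both versions must agree before their timings mean anything
assert remove_singletons_lc(l) == remove_singletons_counter(l)

runs = 5
t_lc = timeit.timeit(lambda: remove_singletons_lc(l), number=runs)
t_counter = timeit.timeit(lambda: remove_singletons_counter(l), number=runs)
print(f"list comprehension: {t_lc / runs:.6f} s per call")
print(f"Counter:            {t_counter / runs:.6f} s per call")
```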

Answer 5

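Do not modify a list while iterating over it. The behaviour will most likely not be what you want.

`numpy.unique` with `return_counts=True`

Another option is to use numpy

import numpy as np

a = np.array([1,2,2,3,2,1])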
_, ids, counts = np.unique(a, return_counts=True, return_inverse=True)
a[counts[ids] != 1]
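In case the indexing in the last line is hard to follow, here is a pure-Python imitation of the same steps, with plain lists standing in for the arrays. It only illustrates what the three values returned by the np.unique call above contain; it is not how numpy computes them, and the names uniques, per_element and result are mine.

```python
a = [1, 2, 2, 3, 2, 1]

uniques = sorted(set(a))                  # [1, 2, 3]           sorted unique values
ids = [uniques.index(x) for x in a]       # [0, 1, 1, 2, 1, 0]  position of each element in uniques (return_inverse)
counts = [a.count(u) for u in uniques]    # [2, 3, 1]           occurrences of each unique value (return_counts)

# counts[ids] in numpy: look up the count of every element of a
per_element = [counts[i] for i in ids]    # [2, 3, 3, 1, 3, 2]

# a[counts[ids] != 1] in numpy: keep elements whose count is not 1
result = [x for x, k in zip(a, per_element) if k != 1]
print(result)                             # [1, 2, 2, 2, 1]
```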

For large arrays, this is faster than both the list comprehension and Counter.

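from collections import Counter

a = np.array([1,2,2,3,2,1]*1000) #numpy array
b = list(a) # list
c = Counter(b) # Counter used in the last timing below (assumed definition)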

Then

%timeit _, ids, c = np.unique(a, return_counts=True, return_inverse=True);a[c[ids] != 1]
225 µs ± 11.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit [x for x in a if b.count(x) > 1]
885 ms ± 23.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit [x for x in a if c[x] > 1]
1.53 ms ± 58.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Answer 6

Use del instead of pop

data = [1,2,3,2,1]

for i in data[:]:  # iterate over a copy so that deleting does not skip elements
    if data.count(i) == 1:
        index = data.index(i)
        del data[index]

print(data)

which produces,

[1, 2, 2, 1]


Answer 7

You can use a list comprehension to build a new list:

[ x for x in data if data.count(x)>1 ]

Also, the pop() method takes the index of the element as its argument, not the value.

list.count() says there is one of an item when there are two in the list
