Question

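Say I have a list, list=[1, 1, 2, 3, 3, 3, 4, 5], and I want to iterate over the duplicates and only the duplicates (in this case 1, 1, 3, 3, 3). What is the most efficient way to do this?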

Answer 1

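Probably the most efficient and most readable way is to create a Counter from the collections module and yield only the items whose count is at least 2. To preserve the original order, you can still build the Counter but iterate over the list itself and check each item's count: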

import collections

def iterate_duplicates_1(l):
    cnt = collections.Counter(l)
    for item in l:
        if cnt[item] > 1:
            yield item

>>> list(iterate_duplicates_1([1, 1, 2, 3, 3, 3, 4, 5]))
[1, 1, 3, 3, 3]

If order does not matter, you can iterate over the Counter directly (which may be slightly faster):

import collections

def iterate_duplicates_2(l):
    cnt = collections.Counter(l)
    for item, count in cnt.items():
        if count > 1:
            for _ in range(count):
                yield item

>>> list(iterate_duplicates_2([1, 1, 2, 3, 3, 3, 4, 5]))
[1, 1, 3, 3, 3]
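The grouped variant can also be written with Counter.elements(), which repeats each key as many times as its count. This is only a sketch of the same idea: the function name iterate_duplicates_elements is made up for illustration, and the grouping order relies on Counter keeping first-seen order (Python 3.7+).

```python
import collections

def iterate_duplicates_elements(l):
    # Hypothetical name; intended to behave like iterate_duplicates_2.
    cnt = collections.Counter(l)
    # Keep only the items that occur more than once.
    repeated = collections.Counter(
        {item: count for item, count in cnt.items() if count > 1}
    )
    # elements() yields each remaining item `count` times, grouped by item.
    return repeated.elements()

print(list(iterate_duplicates_elements([1, 1, 2, 3, 3, 3, 4, 5])))  # [1, 1, 3, 3, 3]
```

Like iterate_duplicates_2, this groups equal items together rather than preserving their original positions.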

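The difference between the two functions becomes apparent with a list whose duplicates are not adjacent: iterate_duplicates_1 keeps the original order, while iterate_duplicates_2 groups equal items together: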

>>> l = [1, 3, 1, 3, 2]
>>> list(iterate_duplicates_1(l))
[1, 3, 1, 3]

>>> list(iterate_duplicates_2(l))
[1, 1, 3, 3]
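Whether iterating over the Counter really is "slightly faster" depends on the data, so it may be worth measuring. Below is a rough timing sketch using timeit; the helper name time_variants, the random test data, and the repetition count are all assumptions chosen for illustration, not part of the original answer.

```python
import collections
import random
import timeit

def iterate_duplicates_1(l):
    # Order-preserving variant from the answer.
    cnt = collections.Counter(l)
    for item in l:
        if cnt[item] > 1:
            yield item

def iterate_duplicates_2(l):
    # Grouped variant from the answer.
    cnt = collections.Counter(l)
    for item, count in cnt.items():
        if count > 1:
            for _ in range(count):
                yield item

def time_variants(data, number=100):
    # Hypothetical helper: total seconds for `number` full passes of each variant.
    return {
        func.__name__: timeit.timeit(lambda: list(func(data)), number=number)
        for func in (iterate_duplicates_1, iterate_duplicates_2)
    }

if __name__ == "__main__":
    random.seed(0)  # arbitrary seed so runs are repeatable
    data = [random.randrange(500) for _ in range(10_000)]  # made-up test data
    for name, seconds in time_variants(data).items():
        print(f"{name}: {seconds:.4f} s")
```

The numbers will vary by machine and input, so treat them as a relative comparison only.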

Answer 2

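If you are willing to use pandas, this is a one-liner. Passing keep=False to duplicated() marks every occurrence of a repeated value, not only the later ones:

import pandas as pd
l=[1, 1, 2, 3, 3, 3, 4, 5]
pd.Series(l).loc[lambda x : x.duplicated(keep=False)].tolist()

Output:

[1, 1, 3, 3, 3]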
