Get the largest duplicate in a list

Date: 2012-06-29 01:34:34

Tags: python arrays list

I have this list:

mylist = [20, 30, 25, 20, 30]

After getting the indices of the duplicated values using

[i for i, x in enumerate(mylist) if mylist.count(x) > 1]

the result is:

`[0, 1, 3, 4]` 

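There are two pairs of duplicated values. How can I get only the highest duplicated value? In this list that is 30, or any of its indices (1 or 4), rather than the whole list of duplicates.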

Regards

6 answers:

Answer 0 (score: 6)

This is O(n):

>>> from collections import Counter
>>> mylist = [20, 30, 25, 20, 30]
>>> max(k for k,v in Counter(mylist).items() if v>1)
30
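
Since the question also asks for the positions of that value, here is a minimal sketch that returns both the largest duplicate and its indices. The function name `largest_duplicate_with_indices` is made up for illustration, and returning `None` when nothing repeats is an assumption about the desired behaviour.

```python
from collections import Counter


def largest_duplicate_with_indices(values):
    """Return (value, indices) for the largest repeated value, or None."""
    counts = Counter(values)  # one pass to count occurrences
    repeated = [value for value, count in counts.items() if count > 1]
    if not repeated:
        return None  # assumption: no duplicate means no result
    largest = max(repeated)
    # Second pass to collect every position of that value.
    indices = [i for i, value in enumerate(values) if value == largest]
    return largest, indices


print(largest_duplicate_with_indices([20, 30, 25, 20, 30]))  # (30, [1, 4])
```

Both passes are linear, so the whole thing stays O(n).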

Answer 1 (score: 5)

To get the largest duplicated value:

max(x for x in mylist if mylist.count(x) > 1)

Unfortunately, the repeated count() calls give this O(n ** 2) performance. Here is a more efficient way to do the same thing in O(n), which matters if the list is long:

seen = set()
dups = set()
for x in mylist:
    if x in seen:
        dups.add(x)
    seen.add(x)
max_dups = max(dups)
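
Both versions raise `ValueError` when the list contains no duplicates, because `max()` is then called on an empty sequence. Below is a small sketch of the set-based version wrapped in a function. It assumes Python 3.4 or later (for the `default` argument of `max`) and assumes that `None` is an acceptable "no duplicates" result; the function name `max_duplicate` is hypothetical.

```python
def max_duplicate(values):
    """Return the largest value that occurs more than once, or None."""
    seen = set()  # every value encountered so far
    dups = set()  # values encountered at least twice
    for x in values:
        if x in seen:
            dups.add(x)
        seen.add(x)
    # default=None avoids ValueError when dups is empty.
    return max(dups, default=None)


print(max_duplicate([20, 30, 25, 20, 30]))  # 30
print(max_duplicate([1, 2, 3]))             # None
```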

Answer 2 (score: 1)

Another O(n) way, just because...

>>> from collections import defaultdict
>>> 
>>> mylist = [20,30,25,20,30]
>>> dd = defaultdict(int)
>>> for i in mylist:
...    dd[i] += 1
...
>>> max(i for i in dd if dd[i] > 1)
30

You can also do it with a plain old dictionary:

>>> d = dict.fromkeys(mylist, 0)
>>> for i in mylist:
...   d[i] += 1
... 
>>> max(i for i in d if d[i] > 1)
30

Answer 3 (score: 0)

$ cat /tmp/1.py
from itertools import groupby

def find_max_repeated(a):
    a = sorted(a, reverse = True)
    for k,g in groupby(a):
        gl = list(g)
        if len(gl) > 1:
            return gl[0]

a = [1,1,2,3,3,4,5,4,6]
print(find_max_repeated(a))

$ python /tmp/1.py
4

Answer 4 (score: 0)

Just to consider some relative timings:

from collections import Counter
from collections import defaultdict

mylist = [20, 30, 25, 20, 30]

def f1():
    seen = set()
    dups = set()
    for x in mylist:
        if x in seen:
            dups.add(x)
        seen.add(x)
    max_dups = max(dups)

def f2():
    max(x for x in mylist if mylist.count(x) > 1)

def f3():
    max(k for k,v in Counter(mylist).items() if v>1)

def f4():
    dd = defaultdict(int)
    for i in mylist:
        dd[i] += 1

    max(i for i in dd if dd[i] > 1)

def f5():
    d = dict.fromkeys(mylist, 0)
    for i in mylist:
        d[i] += 1

    max(i for i in d if d[i] > 1)

cmpthese([f1,f2,f3,f4,f5])    

Prints:

   rate/sec     f3     f4     f5     f2     f1
f3   93,653     -- -63.3% -73.0% -79.2% -83.6%
f4  255,137 172.4%     -- -26.3% -43.3% -55.3%
f5  346,238 269.7%  35.7%     -- -23.1% -39.3%
f2  450,356 380.9%  76.5%  30.1%     -- -21.0%
f1  570,419 509.1% 123.6%  64.7%  26.7%     --

So choose wisely.
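
`cmpthese` is not part of the standard library and its definition is not shown in this answer. A rough way to produce a comparable measurement with the standard `timeit` module is sketched below. The helper name `compare`, the function names, the larger test list, and the repeat counts are all assumptions for illustration, and absolute numbers will vary by machine. On a five-element list fixed overhead dominates, so it is worth timing on realistic sizes as well.

```python
import timeit
from collections import Counter


def with_sets(values):
    seen, dups = set(), set()
    for x in values:
        if x in seen:
            dups.add(x)
        seen.add(x)
    return max(dups)


def with_count(values):
    # Quadratic: list.count() rescans the list for every element.
    return max(x for x in values if values.count(x) > 1)


def with_counter(values):
    return max(k for k, v in Counter(values).items() if v > 1)


def compare(funcs, values, number=20):
    """Return {function name: best seconds per call}, fastest first."""
    results = {}
    for func in funcs:
        runs = timeit.repeat(lambda f=func: f(values), number=number, repeat=3)
        results[func.__name__] = min(runs) / number
    return dict(sorted(results.items(), key=lambda item: item[1]))


# Hypothetical test data: 200 items, of which the values 0..49 repeat.
data = [i % 150 for i in range(200)]
timings = compare([with_sets, with_count, with_counter], data)
for name, seconds in timings.items():
    print(f"{name:>12}: {seconds * 1e6:.1f} us per call")
```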

Answer 5 (score: 0)

mylist = [20, 30, 25, 20, 30]
result = max((mylist.count(x), x) for x in set(mylist))
print(result)
# prints (2, 30)

Here is how it works:

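  • set(mylist) keeps only the unique values from mylist (20, 30, 25)
  • a generator expression then builds tuples whose first item is the number of times the value occurs ((1, 25), (2, 20), (2, 30))
  • since tuples are compared item by item, max returns the largest tuple in the sequence, in this case (2, 30), because it is greater than (2, 20)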
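
One caveat: because the count is compared first, this returns the most frequent value and uses the value itself only to break ties. With the sample list that happens to coincide with the largest duplicate, but the two can differ, as the sketch below shows. The list `sample` is made up for illustration.

```python
sample = [20, 20, 20, 30, 30, 25]

# Ranks by count first, so the more frequent 20 wins over 30.
most_frequent = max((sample.count(x), x) for x in set(sample))
print(most_frequent)  # (3, 20)

# Filtering on count > 1 and comparing values only gives the largest duplicate.
largest_dup = max(x for x in set(sample) if sample.count(x) > 1)
print(largest_dup)  # 30
```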