Fast multiset containment of lists in Python

Time: 2014-07-14 11:19:33

Tags: python multiset containment

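I have an iterator of ints coming from a table lookup and need to check whether their multiset is contained in a given fixed "multiset" ms. Currently, I sort ms once at the beginning, then sort the ints from each iterator and check multiset containment (on the sorted lists) as follows: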

vals = sorted(my_vals)  # sort the fixed multiset once
for it in ...:
    test_vals = sorted(it)
    if is_sublist(test_vals, vals):
        # do stuff

where

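def is_sublist(L1, L2):
    """Return True if sorted list L1 is contained, as a multiset, in sorted list L2."""
    m = len(L1)
    n = len(L2)
    i = j = 0
    while j <= n:
        if i == m:
            # every element of L1 has been matched
            return True
        elif j == n:
            # L2 is exhausted but L1 still has unmatched elements
            return False
        a, b = L1[i], L2[j]
        if a == b:
            i += 1
            j += 1
        elif a > b:
            # skip an element of L2 that L1 does not need
            j += 1
        else:
            # a < b: a cannot occur later in the sorted L2
            return False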
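  • Usually, my lists are short (1-20 elements).
  • I tried using Counter, but the time lost initializing it outweighs the time gained in the containment test.
  • I do this ~10^6 times, so I should probably do this in Cython.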
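
For reference, the Counter-based test mentioned in the second bullet might look roughly like the sketch below. The helper name `contained_counter` and the sample values are made up for illustration. The per-candidate `Counter(test_vals)` construction is the initialization cost that the bullet refers to.

```python
from collections import Counter

def contained_counter(test_vals, ms_counter):
    """Return True if the multiset of test_vals is contained in ms_counter."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means no value occurs more often in test_vals than in ms_counter.
    return not (Counter(test_vals) - ms_counter)

# Hypothetical fixed multiset, converted to a Counter only once.
ms_counter = Counter([1, 2, 2, 5])

print(contained_counter([2, 5, 2], ms_counter))
print(contained_counter([2, 2, 2], ms_counter))
```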

Any ideas or pointers would be great - thanks! (Sorry for hitting the post button too early at first...)

1 Answer:

Answer 0 (score: 0)

# edit: second attempt in response to Bakuriu's comment
from collections import Counter
from itertools import groupby

multiset = Counter(vals)  # only create one Counter object, for the fixed multiset
for it in ...:
    # group equal values of the sorted candidate and compare each
    # group's size with the count available in the fixed multiset
    grouped = groupby(sorted(it))
    if all(len(list(g)) <= multiset[k] for k, g in grouped):
        # do stuff
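
A variant of the same counting idea that skips sorting each candidate is sketched below. It tallies values in a plain dict and stops as soon as one value occurs more often than the fixed multiset allows. The names `is_contained` and `limits` and the sample data are hypothetical, and whether this beats the sorted approach for lists of 1-20 elements would have to be measured.

```python
from collections import Counter

def is_contained(candidate, limits):
    """Return True if the multiset of candidate is contained in limits.

    limits maps each value to the number of times it occurs in the
    fixed multiset.
    """
    seen = {}
    for x in candidate:
        count = seen.get(x, 0) + 1
        if count > limits.get(x, 0):
            # x occurs more often than the fixed multiset allows
            return False
        seen[x] = count
    return True

# Hypothetical fixed multiset, counted once up front.
limits = dict(Counter([1, 2, 2, 5]))

print(is_contained(iter([2, 1, 2]), limits))
print(is_contained(iter([2, 2, 2]), limits))
```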



from operator import eq
from itertools import starmap
try:
    # if you are using Python 2, use the lazy zip
    from itertools import izip as zip
except ImportError:
    pass  # Python 3: the built-in zip is already lazy

vals = sorted(my_vals)
for it in ...:
    test_vals = sorted(it)
    zipped = zip(test_vals, vals)
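    # compare the sorted lists pairwise and bail out as soon as
    # non-matching values are found. Note: zip stops at the shorter
    # list, so this only succeeds when one sorted list is a prefix
    # of the other; it is not a general multiset containment test.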
    if all(starmap(eq, zipped)):
        # do stuff
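
Because both lists are sorted, multiset containment amounts to asking whether `test_vals` is a subsequence of `vals`. A minimal sketch of that check using a single shared iterator follows. The function name `is_sorted_sublist` and the sample values are illustrative only, and unlike a hand-written merge loop it does not stop early when a candidate value is smaller than the current element of `vals`.

```python
def is_sorted_sublist(test_vals, vals):
    """Return True if sorted test_vals is a sub-multiset of sorted vals."""
    remaining = iter(vals)
    # `x in remaining` advances the iterator until it finds x, so each
    # element of vals can be matched at most once.
    return all(x in remaining for x in test_vals)

vals = sorted([5, 2, 1, 2])  # hypothetical fixed multiset

print(is_sorted_sublist(sorted([2, 5, 2]), vals))
print(is_sorted_sublist(sorted([2, 2, 2]), vals))
```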