How to compare lists of items (containing N items) for values repeated only across different lists

Time: 2012-05-11 19:14:30

Tags: python

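After many years away I am getting back into programming, and I am having a hard time with this problem. The user defines the number of classification sets, but the RasterVal list has length N.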

Example:
    ClassificationSet = (1,2,3,4,5,6)  # defined by user
    RasterVal = ()  # length = N

I use ClassificationSet as the index when storing items into RasterVal:
    RasterVal(ClassificationSet).add(imported value)

RasterVal(1) = 22,23,24,25,23,23,22
RasterVal(2) = 22,30,31,32,30,30
RasterVal(3) = 31

That is: RasterVal([],[22,23,24,25,23,23,22],[22,30,31,32,30,30],[31])
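The storage scheme described above is pseudocode, so here is one possible runnable sketch of it. It assumes the imported data arrives as (classification, value) pairs; the names `imported` and `raster_val` are made up for illustration and are not from the original post.

```python
from collections import defaultdict

# Hypothetical input: (classification, value) pairs standing in for the imported data.
imported = [
    (1, 22), (1, 23), (1, 24), (1, 25), (1, 23), (1, 23), (1, 22),
    (2, 22), (2, 30), (2, 31), (2, 32), (2, 30), (2, 30),
    (3, 31),
]

# Map each classification to the list of values stored under it.
raster_val = defaultdict(list)
for classification, value in imported:
    raster_val[classification].append(value)

print(dict(raster_val))
```

A dict keyed by classification avoids the empty placeholder at index 0 that a plain list indexed from 1 would need.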

I want to list the values that are repeated, but only those repeated across different sets, not those repeated within the same set.

The output should be:
    RepeatSet = 22,31

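Any help is much appreciated. I have been able to compare the sets, but my comparison lists repeated values even when they only occur within the same set's list.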
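The asker's own comparison code is not shown. As a guess at the symptom, counting every occurrence across all lists would behave this way, because repeats inside one list are counted together with repeats across lists:

```python
from collections import Counter

raster_val = [[22, 23, 24, 25, 23, 23, 22], [22, 30, 31, 32, 30, 30], [31]]

# Counting every occurrence mixes within-list repeats with cross-list repeats.
counts = Counter(value for values in raster_val for value in values)
naive = sorted(value for value, n in counts.items() if n > 1)

# 23 and 30 show up here although each repeats only inside a single list.
print(naive)  # [22, 23, 30, 31]
```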

3 answers:

Answer 0: (score: 4)

@lukecampbell is right:

>>> lsts = [[22,23,24,25,23,23,22],[22,30,31,32,30,30],[31]]
>>> from collections import Counter
>>> c = Counter(x for lst in lsts for x in set(lst))
>>> [x for x,y in c.items() if y > 1]
[22, 31]

The running time of this algorithm is linear, rather than quadratic, in the number of sets.
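For reuse, the same idea can be wrapped in a small function. This is only a sketch; the name `repeated_across_lists` is made up here, and the result is sorted just to make the output order predictable.

```python
from collections import Counter

def repeated_across_lists(lists):
    """Return the values that occur in at least two different lists."""
    # set(lst) collapses repeats inside one list, so each list
    # contributes at most 1 to a value's count.
    counts = Counter(value for lst in lists for value in set(lst))
    return sorted(value for value, n in counts.items() if n > 1)

lsts = [[22, 23, 24, 25, 23, 23, 22], [22, 30, 31, 32, 30, 30], [31]]
print(repeated_across_lists(lsts))  # [22, 31]
```

Each list is scanned once to build its set and once more to update the counter, so no pair of lists is ever compared directly.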

Answer 1: (score: 2)

RasterVal = [[22,23,24,25,23,23,22],[22,30,31,32,30,30],[31]]

from itertools import combinations
out = set()
for a,b in combinations(RasterVal,2):
    out = out|(set(a)&set(b))

print(out)
#output (Python 3):
{22, 31}

Answer 2: (score: 0)

#-- Your initial data...
rasterVals = [
    (22,23,24,25,23,23,22),
    (22,30,31,32,30,30),
    (31,)
]

#-- This will be a set, which holds only distinct values.
repeated = set( )

#-- We set up a classic inner-outer loop to iterate over all 
#-- pairs of lists in your given data.  (Note, there are better ways
#-- of doing this in Python, but this is the easiest to read and understand)
for a in rasterVals:
    #-- First, we'll make a set from the first element.
    #-- Since sets only hold distinct values, any repeated numbers
    #-- in the sequence will be ignored.
    s = set(a)

    for b in rasterVals:
        #-- Test that we are not comparing a sequence to itself.
        if (a is not b):
            #-- Make a set for the inner loop. (again, only distinct numbers...)
            t = set(b)

            #-- Get the intersection of s and t, answering with only those elements
            #-- that are in both s and t.
            intersecting = (s & t)

            #-- We update our set of (inter-sequence) repeated elements.
            repeated.update(intersecting)

#-- Show the results.
print(repeated)
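Two things are worth knowing about the loop above: it visits every pair twice (a with b, then b with a), and `a is not b` compares object identity, so if the same object sat at two positions in the list that pair would be skipped. A possible variant, offered only as a sketch with the made-up names `raster_vals` and `sets`, builds each set once and compares by position instead:

```python
raster_vals = [
    (22, 23, 24, 25, 23, 23, 22),
    (22, 30, 31, 32, 30, 30),
    (31,),
]

# Build each set once instead of rebuilding it inside the inner loop.
sets = [set(values) for values in raster_vals]

repeated = set()
for i, s in enumerate(sets):
    # Only look at later positions, so each pair is compared once
    # and a sequence is never compared with itself.
    for t in sets[i + 1:]:
        repeated |= s & t

print(sorted(repeated))  # [22, 31]
```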