I have four sets of data:
A=range(10,20)
B=range(5,17)
C=range(15,25)
D=range(18,30)
sets = [A, B, C, D]
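What I want is the contents of every possible intersection, which can be seen as getting all the parts of a Venn diagram (this is the complete case):
Using the example above, the partitions are populated as follows: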
() ----> set()
('A',) ----> set()
('B',) ----> {8, 9, 5, 6, 7}
('C',) ----> set()
('D',) ----> {25, 26, 27, 28, 29}
('A', 'B') ----> {10, 11, 12, 13, 14}
('A', 'C') ----> {17}
('A', 'D') ----> set()
('B', 'C') ----> set()
('B', 'D') ----> set()
('C', 'D') ----> {24, 20, 21, 22, 23}
('A', 'B', 'C') ----> {16, 15}
('A', 'B', 'D') ----> set()
('A', 'C', 'D') ----> {18, 19}
('B', 'C', 'D') ----> set()
('A', 'B', 'C', 'D') ----> set()
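These are the expected answers.
I'm stuck with the code below, which can only find the intersection that must be present in all of the given sets: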
# only gives ACD members
test = [tuple([A[0],A[-1]]), tuple([C[0],C[-1]]), tuple([D[0],D[-1]])]
starts, ends = zip(*test)
result = range(max(starts), min(ends) + 1)
# Gives 18,19
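What is the way to do this? Note that I'm not interested in drawing the diagram. What I care about is getting the members of each partition.
Answer 0 (score: 1)
I wrote a blog post with a solution to this kind of problem here: http://paddy3118.blogspot.de/2013/07/set-divisionspartitions.html
You would need to expand the x..y syntax into sets of integers, but if that form of output is useful to you, you may want to feed the output into a function like this one: http://rosettacode.org/wiki/Range_extraction
P.S. That is a nice Venn diagram.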
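The expansion step mentioned in this answer could look roughly like the sketch below. It assumes that a string such as "10..14" denotes an inclusive span and that a bare number such as "17" denotes a single value; the helper name expand_span is hypothetical and not taken from the blog post or from Rosetta Code.

```python
def expand_span(text):
    """Expand a string like '10..14' into {10, 11, 12, 13, 14}.

    Assumption: both endpoints are inclusive, and a bare number
    such as '17' stands for a one-element set.
    """
    if ".." in text:
        low, high = text.split("..")
        return set(range(int(low), int(high) + 1))
    return {int(text)}


print(sorted(expand_span("10..14")))  # [10, 11, 12, 13, 14]
print(sorted(expand_span("17")))      # [17]
```

If the actual output uses a different separator or exclusive endpoints, only the split and the "+ 1" would need to change.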
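Answer 1 (score: 1)
It is better to use a sweep-line algorithm, whose complexity is linear in the number of sets (well, plus sorting the endpoints and the length of the output), rather than exponential.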
import string

A = range(10, 20)
B = range(5, 17)
C = range(15, 25)
D = range(18, 30)
sets = [A, B, C, D]

# One event per endpoint: (position, is_start, label).
events = []
for letter, set_ in zip(string.ascii_uppercase, sets):
    events.append((set_.start, True, letter))
    events.append((set_.stop, False, letter))
events.sort()

intersection = set()   # labels of the ranges that are currently open
intersections = []
last_t = None
for t, insert, letter in events:
    # Everything between last_t and t belongs to exactly the open ranges.
    if t != last_t and intersection:
        intersections.append((''.join(sorted(intersection)), range(last_t, t)))
    last_t = t
    if insert:
        intersection.add(letter)
    else:
        intersection.remove(letter)
print(intersections)
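To get the sweep result into the shape the question asks for (a tuple of labels mapped to a set of members), the same idea can be wrapped in a function. This is a minimal sketch, assuming every input is a non-empty range with step 1 and that there are at most 26 of them; the function name sweep_partitions is made up for illustration.

```python
import string


def sweep_partitions(ranges):
    """Map each tuple of labels to the integers covered by exactly those ranges."""
    events = []
    for letter, r in zip(string.ascii_uppercase, ranges):
        events.append((r.start, True, letter))   # range opens at r.start
        events.append((r.stop, False, letter))   # range closes at r.stop (exclusive)
    # At equal positions, closings (False) sort before openings (True).
    events.sort()

    active = set()      # labels of the ranges that are currently open
    partitions = {}
    last_t = None
    for t, opening, letter in events:
        if t != last_t and active:
            key = tuple(sorted(active))
            # The same combination can occur in several spans, so merge them.
            partitions.setdefault(key, set()).update(range(last_t, t))
        last_t = t
        if opening:
            active.add(letter)
        else:
            active.remove(letter)
    return partitions


sets = [range(10, 20), range(5, 17), range(15, 25), range(18, 30)]
for key, members in sweep_partitions(sets).items():
    print(key, "---->", sorted(members))
```

Only non-empty partitions appear in the result; a combination such as ('A', 'D') is simply absent from the dictionary.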
Answer 2 (score: 1)
import itertools

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))

A = set(range(10,20))
B = set(range(5,17))
C = set(range(15,25))
D = set(range(18,30))
titles = powerset(['A', 'B', 'C', 'D'])
source = powerset([A, B, C, D])
for elt in zip(titles, source):
    res = set()  # the empty combination has nothing to intersect
    try:
        res = elt[1][0]
        for el in elt[1]:
            # intersection() returns a new set, so the result must be kept
            res = res.intersection(el)
    except IndexError:
        pass
    print(elt[0], '=', sorted(res))
Output = the intersection of the sets in each combination
() = []
('A',) = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
('B',) = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
('C',) = [15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
('D',) = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
('A', 'B') = [10, 11, 12, 13, 14, 15, 16]
('A', 'C') = [15, 16, 17, 18, 19]
('A', 'D') = [18, 19]
('B', 'C') = [15, 16]
('B', 'D') = []
('C', 'D') = [18, 19, 20, 21, 22, 23, 24]
('A', 'B', 'C') = [15, 16]
('A', 'B', 'D') = []
('A', 'C', 'D') = [18, 19]
('B', 'C', 'D') = []
('A', 'B', 'C', 'D') = []
Answer 3 (score: 1)
Here it is: the output is the set of elements that belong to each partition. It works for any number of sets.
import itertools

def intersect(d):
    """
    d is an iterable collection of sets or frozensets
    returns the intersection of the sets in d
    """
    res = set()
    try:
        res = set(d[0])
    except IndexError:
        pass
    for elt in d:
        elt = set(elt)
        res = res.intersection(elt)
    return res

A = frozenset(range(10,20))
B = frozenset(range(5,17))
C = frozenset(range(15,25))
D = frozenset(range(18,30))
titles = ('A','B','C','D')
data = (A, B, C, D)
dataset = set(data)
titles_comb, data_comb = [], []
for n in range(len(data)+1):
    titles_comb.append(list(itertools.combinations(titles, n)))
    data_comb.append(list(itertools.combinations(data, n)))

for title, dat in zip(titles_comb, data_comb):
    for t, d in zip(title, dat):
        # intersect(d) = elements in the intersection of the sets (what we want, but has overlap)
        # complement = sets from data that were not used in intersect(d) (the overlap we want to discard)
        result = intersect(d)
        complement = dataset.difference(set(d))
        comp = set()
        for elt in complement:
            for e in elt:
                comp.add(e)
        print(t, "\t---->", result.difference(comp))
Output = the contents of each partition (excluding all other partitions)
() ----> set()
('A',) ----> set()
('B',) ----> {8, 9, 5, 6, 7}
('C',) ----> set()
('D',) ----> {25, 26, 27, 28, 29}
('A', 'B') ----> {10, 11, 12, 13, 14}
('A', 'C') ----> {17}
('A', 'D') ----> set()
('B', 'C') ----> set()
('B', 'D') ----> set()
('C', 'D') ----> {24, 20, 21, 22, 23}
('A', 'B', 'C') ----> {16, 15}
('A', 'B', 'D') ----> set()
('A', 'C', 'D') ----> {18, 19}
('B', 'C', 'D') ----> set()
('A', 'B', 'C', 'D') ----> set()
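A different way to reach the same partitions, not the method used in this answer, is to look at each element once and group it by the exact set of names that contain it. The sketch below assumes the inputs are given as a dictionary from name to a collection of hashable values; venn_partitions is a hypothetical helper name.

```python
from collections import defaultdict


def venn_partitions(named_sets):
    """Group every element by the exact tuple of set names that contain it."""
    partitions = defaultdict(set)
    for element in set().union(*named_sets.values()):
        # The signature lists, in dictionary order, every set holding the element.
        signature = tuple(
            name for name, members in named_sets.items() if element in members
        )
        partitions[signature].add(element)
    return dict(partitions)


named = {
    "A": range(10, 20),
    "B": range(5, 17),
    "C": range(15, 25),
    "D": range(18, 30),
}
result = venn_partitions(named)
for key in sorted(result, key=lambda k: (len(k), k)):
    print(key, "---->", sorted(result[key]))
```

This avoids enumerating all 2^n combinations, at the price of reporting only the partitions that are non-empty; empty ones such as ('A', 'D') can be filled in afterwards with an empty set if the full table is needed.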
Answer 4 (score: -1)
Have you tried using Python sets?