Question

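I'm looking for an algorithm that maps all relationships between the elements of the sublists of a list of length n.

More concretely, suppose a, b, c, d, e and f are workers' names, and each sublist represents a "shift" that took place yesterday. For each worker, I want to know who they worked with yesterday.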

shifts_yesterday = [[a, b, c, d], [b, c, e, f]]

Goal:

a: b, c, d
b: a, c, d, e, f
c: a, b, d, e, f
d: a, b, c
e: b, c, f
f: b, c, e

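Above, I can see that a worked with b, c, d yesterday; b worked with a, c, d, e, f yesterday; and so on.

Time complexity is a concern, because I have a large list to process. Intuitively, though, I suspect the lower bound here is fairly high...

Note: I could obviously just write ~~a linear search~~ the straightforward approach with for loops, but that is (a) not very clever and (b) very slow.

Edit:

Here is a (messy) attempt: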

shifts = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
workers = [i for s in shifts for i in s]

import collections
d = collections.defaultdict(list)

for w in workers:
    for s in shifts:
        for i in s:
            if i != w and w in s:
                if w in d.keys():
                    if i not in d[w]:
                        d[w].append(i)
                else:
                    d[w].append(i)

Test:

for k, v in collections.OrderedDict(sorted(d.items())).items():
    print(k, v)

Edit 2:

Timings:

Mine: %%timeit -r 10 -> 10000 loops, best of 10: 19 µs per loop
Padraic Cunningham: %%timeit -r 10 -> 100000 loops, best of 10: 4.89 µs per loop
zvone: %%timeit -r 10 -> 100000 loops, best of 10: 3.88 µs per loop
pneumatics: %%timeit -r 10 -> 10000 loops, best of 10: 33.5 µs per loop

Answer 1

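from collections import defaultdict

result = defaultdict(set)

for shift in shifts:
    for worker in shift:
        result[worker].update(shift)

# now result['a'] contains a, b, c, d, so remove each worker from their own set

for k, v in result.items():  # .items() works on Python 2 and 3; iteritems() is Python 2 only
    v.remove(k)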

Answer 2

A simplified and more efficient version of your own code, using sets to store the values and itertools.combinations to pair up the workers:

shifts = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]


from itertools import combinations
import collections

d = collections.defaultdict(set)
for sub in shifts:
    for a, b in combinations(sub, 2):
        d[a].add(b)
        d[b].add(a)

for k, v in sorted(d.items()):
    print(k, v)

Which gives you:

('a', set(['c', 'b', 'd']))
('b', set(['a', 'c', 'e', 'd', 'f']))
('c', set(['a', 'b', 'e', 'd', 'f']))
('d', set(['a', 'c', 'b']))
('e', set(['c', 'b', 'f']))
('f', set(['c', 'b', 'e']))

Timings on your small sample input:

In [1]: import collections

In [2]: %%timeit
   ...: shifts = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
   ...: workers = [i for s in shifts for i in s]
   ...: d = collections.defaultdict(list)
   ...: for w in workers:
   ...:     for s in shifts:
   ...:         for i in s:
   ...:             if i != w and w in s:
   ...:                 if w in d.keys():
   ...:                     if i not in d[w]:
   ...:                         d[w].append(i)
   ...:                 else:
   ...:                     d[w].append(i)
   ...: 
10000 loops, best of 3: 21.6 µs per loop

In [3]: from itertools import combinations

In [4]: %%timeit
   ...: shifts = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
   ...: d = collections.defaultdict(set)
   ...: for sub in shifts:
   ...:     for a, b in combinations(sub, 2):
   ...:         d[a].add(b)
   ...:         d[b].add(a)
   ...: 
100000 loops, best of 3: 4.55 µs per loop
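
The two-shift sample is too small to say much about scaling. A rough way to compare approaches on larger input is sketched below. The generator `make_shifts`, its parameters, and the sizes chosen are all assumptions made for illustration, and the numbers will vary by machine. `with_update` is the set-update approach from Answer 1.

```python
import random
import timeit
from collections import defaultdict
from itertools import combinations


def make_shifts(n_shifts, shift_size, n_workers, seed=0):
    """Hypothetical helper: build random shifts of distinct worker ids."""
    rng = random.Random(seed)
    workers = range(n_workers)
    return [rng.sample(workers, shift_size) for _ in range(n_shifts)]


def with_combinations(shifts):
    # Pair up the workers of each shift and record both directions.
    d = defaultdict(set)
    for sub in shifts:
        for a, b in combinations(sub, 2):
            d[a].add(b)
            d[b].add(a)
    return d


def with_update(shifts):
    # Add the whole shift to each member's set, then drop the self-entry.
    d = defaultdict(set)
    for sub in shifts:
        for worker in sub:
            d[worker].update(sub)
    for worker, others in d.items():
        others.discard(worker)
    return d


shifts = make_shifts(n_shifts=200, shift_size=10, n_workers=500)
# Both approaches should agree before their timings are compared.
assert with_combinations(shifts) == with_update(shifts)

for func in (with_combinations, with_update):
    seconds = timeit.timeit(lambda: func(shifts), number=20)
    print("{}: {:.4f} s for 20 runs".format(func.__name__, seconds))
```

Varying `shift_size` is the interesting knob here, since the per-shift work grows with the square of the shift length in both versions.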

Answer 3

Pseudocode algorithm:

declare two-dimensional array workers
for each shift in shifts_yesterday
    for each element x in shift
        add x to workers[x]
        for each element y != x in shift
            add y to workers[x]

for each list xs in workers
    print xs[0] + ": "
    for each element w in xs except the first
        print w + ", "

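The time complexity is O(n*m^2 + w*m), where n is the number of shifts, m is the maximum number of workers in any shift, and w is the total number of workers. If you can settle for seeing each pairing only once (not showing both a: b and b: a), you can cut out one m. This is a quadratic algorithm, and I believe it's the best you can do.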
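
A possible Python rendering of the pseudocode, as a sketch only. It swaps the two-dimensional array for a dict of sets (an assumption made here because worker names are not integer indices, and because sets remove duplicates for free), and it keeps the nested loop over each shift so the m^2 term stays visible. The function name `coworkers_by_pairs` is made up for illustration.

```python
def coworkers_by_pairs(shifts):
    """Map each worker to the set of workers who shared a shift with them."""
    workers = {}
    for shift in shifts:                  # n shifts
        for x in shift:                   # up to m workers per shift
            others = workers.setdefault(x, set())
            for y in shift:               # up to m again: the m^2 term
                if y != x:
                    others.add(y)
    return workers


shifts_yesterday = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
# Final pass over every worker and their co-workers to print the mapping.
for name, others in sorted(coworkers_by_pairs(shifts_yesterday).items()):
    print("{}: {}".format(name, ", ".join(sorted(others))))
```

On the sample input this prints the mapping listed under the goal in the question, one worker per line.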

Answer 4

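More constraints should be specified. For example, if the total size of the "shifts_yesterday" array is limited to 64, you can use a long to store a bitmask of shifts for each worker. Then you can answer the question with a single operation: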

a = 00000001
b = 00000011
d = 00000001
f = 00000010

Did b work with d?

((b & d) != 0) : true

Did a work with f?

((a & f) != 0) : false
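
A minimal Python sketch of this bitmask idea. Python integers are arbitrary precision, so the 64-shift limit of a fixed-width long does not apply here. The helper names `shift_masks` and `worked_together` are made up for illustration.

```python
def shift_masks(shifts):
    """Give each worker an integer whose bit i is set if they were in shift i."""
    masks = {}
    for i, shift in enumerate(shifts):
        for worker in shift:
            masks[worker] = masks.get(worker, 0) | (1 << i)
    return masks


def worked_together(masks, x, y):
    """True if x and y share at least one shift (a single bitwise AND)."""
    return (masks.get(x, 0) & masks.get(y, 0)) != 0


shifts_yesterday = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
masks = shift_masks(shifts_yesterday)
print(worked_together(masks, 'b', 'd'))  # True
print(worked_together(masks, 'a', 'f'))  # False
```

This makes a single pairwise query cheap, but producing the full per-worker listing from the question still means testing each worker against every other worker.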

Answer 5

I think you're looking for a set membership relation. Let's call it coworkers:

shifts_yesterday = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]

def coworkers(worker, shifts):
    # union of every shift that contains the worker
    together = set()
    together.update(*[shift for shift in shifts if worker in shift])
    return together

For each worker, you build the union of all the shifts that contain that worker.

everybody = set()
everybody.update( *shifts_yesterday )

for worker in everybody:
    print("{}: {}".format(worker, coworkers(worker, shifts_yesterday)))

Output

a: set(['a', 'c', 'b', 'd'])
c: set(['a', 'c', 'b', 'e', 'd', 'f'])
b: set(['a', 'c', 'b', 'e', 'd', 'f'])
e: set(['c', 'b', 'e', 'f'])
d: set(['a', 'c', 'b', 'd'])
f: set(['c', 'b', 'e', 'f'])
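
Note that this output lists each worker among their own coworkers, unlike the goal in the question. A small variation that drops the worker is sketched below; the name `coworkers_excluding_self` is an assumption made for illustration.

```python
def coworkers_excluding_self(worker, shifts):
    """Union of every shift containing `worker`, minus the worker."""
    together = set()
    for shift in shifts:
        if worker in shift:
            together.update(shift)
    together.discard(worker)  # no error if the worker appears in no shift
    return together


shifts_yesterday = [['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'f']]
everybody = set().union(*shifts_yesterday)

for worker in sorted(everybody):
    others = sorted(coworkers_excluding_self(worker, shifts_yesterday))
    print("{}: {}".format(worker, ", ".join(others)))
```

Each call scans every shift, so looping over all workers this way costs roughly one pass over the whole input per worker. That is fine for small data but is likely to be slower than building all the sets in a single pass.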

Mapping all relationships between elements in a list of lists
