Question

l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r']]

l2 = [1, 1, 1, 2, 0, 0, 0]

I have two lists as shown above. l1 is a list of lists, and l2 is another list that holds a score for each entry of l1.

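Problem: among all the lists in l1 with a score of 0 (from l2), find the lists that are either completely different from the others or the shortest in length.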

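For example: if I have the lists [1, 2, 3], [2, 3], [5, 7], all with a score of 0, I would choose [5, 7] because its elements do not occur in any other list, and [2, 3] because it intersects [1, 2, 3] but is shorter.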
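The rule in this example can be checked with a brute-force sketch. The function name `keep_shortest_or_disjoint` is hypothetical, and the sketch assumes that a list is kept exactly when no other list sharing an element with it is strictly shorter, which is only one reading of the description above.

```python
def keep_shortest_or_disjoint(lists):
    """Keep each list unless a list sharing an element with it is strictly shorter."""
    kept = []
    for i, current in enumerate(lists):
        current_set = set(current)
        # a list is "beaten" when an intersecting list has fewer elements
        beaten = any(
            j != i and current_set & set(other) and len(other) < len(current)
            for j, other in enumerate(lists)
        )
        if not beaten:
            kept.append(current)
    return kept


print(keep_shortest_or_disjoint([[1, 2, 3], [2, 3], [5, 7]]))
# prints [[2, 3], [5, 7]]
```

This compares every pair of lists, so it is quadratic in the number of lists; it is meant to pin down the rule, not to be fast.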

How I do this now:

import itertools

# zero-score lists are the candidates; scored entries are kept unchanged
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))

un_usable = []
usable = []
# first pass: for every intersecting pair, keep the shorter list
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)

# second pass: add lists from disjoint pairs that are not classified yet
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)

final = lx + [(x, 0) for x in usable]

And final gives me:

[(['a', 'b', 'c'], 1),
 (['a', 'd', 'c'], 1),
 (['a', 'e'], 1),
 (['a', 'd', 'c'], 2),
 (['a', 'e'], 0),
 (['p', 'q', 'r'], 0)]

This is the required result.

EDIT: to handle equal lengths:

import itertools

l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r'],
      ['a', 'k']]

l2 = [1, 1, 1, 2, 0, 0, 0, 0]

l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
# first pass: for every intersecting pair, keep the shorter list,
# or both lists when their lengths are equal
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        elif len(i) == len(j):
            usable.append(i)
            usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)

# remove duplicates (note: going through a set does not preserve order)
usable = [list(x) for x in set(tuple(x) for x in usable)]
un_usable = [list(x) for x in set(tuple(x) for x in un_usable)]

# second pass: add lists from disjoint pairs that are not classified yet
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)

final = lx + [(x, 0) for x in usable]

Is there a better, faster, and more Pythonic way of achieving the same result?

Answer 1

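Assuming I understood correctly, here is an O(N) two-pass algorithm, where N is the total number of elements in the zero-score lists.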

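Steps:

Select the zero-score lists.
For each element of each zero-score list, find the length of the shortest zero-score list in which that element occurs. Call this the length score of the element.
For each list, find the minimum length score over all of its elements. If the result is less than the length of the list, discard the list.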

def select_lsts(lsts, scores):
    # pick out zero score lists
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]

    # keep track of the shortest length of any list in which an element occurs
    len_shortest = dict()
    for lst in z_lsts:
        ln = len(lst)
        for c in lst:
            len_shortest[c] = min(ln, len_shortest.get(c, float('inf')))

    # yield only the lists that are of minimum length for every one of their elements
    for lst in z_lsts:
        len_lst = len(lst)
        if all(len_shortest[c] >= len_lst for c in lst):
            yield lst
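As a usage sketch, the generator can be combined with the non-zero entries to rebuild the `final` list from the question. The function body is repeated here only so that the snippet runs on its own; the name `scored` is introduced for illustration and does not appear in the original posts.

```python
def select_lsts(lsts, scores):
    # same logic as the function above, repeated so this snippet is self-contained
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    len_shortest = {}
    for lst in z_lsts:
        for c in lst:
            len_shortest[c] = min(len(lst), len_shortest.get(c, float('inf')))
    for lst in z_lsts:
        if all(len_shortest[c] >= len(lst) for c in lst):
            yield lst


l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r'],
      ['a', 'k']]

l2 = [1, 1, 1, 2, 0, 0, 0, 0]

# entries with a positive score are kept unchanged
scored = [(x, y) for x, y in zip(l1, l2) if y > 0]
final = scored + [(x, 0) for x in select_lsts(l1, l2)]

for item in final:
    print(item)
# (['a', 'b', 'c'], 1)
# (['a', 'd', 'c'], 1)
# (['a', 'e'], 1)
# (['a', 'd', 'c'], 2)
# (['a', 'e'], 0)
# (['p', 'q', 'r'], 0)
# (['a', 'k'], 0)
```

The selected zero-score lists come out in the order in which they appear in l1, whereas the set-based de-duplication in the question does not guarantee any order.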
