l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r']]
l2 = [1, 1, 1, 2, 0, 0, 0]
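I have the two lists shown above. l1 is a list of lists, and l2 is another list that holds a score for each of them.
Problem: among all the lists in l1 whose score (from l2) is 0, find the ones that are either completely distinct from the others or the shortest in length.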
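For example: if the lists [1, 2, 3], [2, 3] and [5, 7] all have a score of 0, I would select [5, 7], because its elements do not appear in any other list, and [2, 3], because it intersects with [1, 2, 3] but is shorter.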
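The rule in this example can be written down directly as a brute-force check. This is only a sketch of how I read the rule: the function name pick_distinct_or_shortest is made up for illustration, and it assumes that lists of equal length which overlap are both kept.

```python
def pick_distinct_or_shortest(lists):
    """Keep each list unless a strictly shorter list shares an element with it."""
    kept = []
    for i, current in enumerate(lists):
        # a list is rejected only by an overlapping list that is strictly shorter
        dominated = any(
            i != k and set(current) & set(other) and len(other) < len(current)
            for k, other in enumerate(lists)
        )
        if not dominated:
            kept.append(current)
    return kept


print(pick_distinct_or_shortest([[1, 2, 3], [2, 3], [5, 7]]))
# [[2, 3], [5, 7]]
```

It compares every pair of lists, so it is quadratic in the number of lists; it is meant as a readable statement of the rule rather than a fast implementation.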
How I currently do this:
import itertools

# zero-score lists are the candidates; scored lists are kept as they are
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
# first pass: for every overlapping pair, keep the shorter list
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
# second pass: add lists from disjoint pairs that were never rejected
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
and final gives me:
[(['a', 'b', 'c'], 1),
 (['a', 'd', 'c'], 1),
 (['a', 'e'], 1),
 (['a', 'd', 'c'], 2),
 (['a', 'e'], 0),
 (['p', 'q', 'r'], 0)]
which is the required result.
Edit: handling lists of equal length:
import itertools

l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r'],
      ['a', 'k']]
l2 = [1, 1, 1, 2, 0, 0, 0, 0]
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        elif len(i) == len(j):
            # overlapping lists of equal length are both kept
            usable.append(i)
            usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
# remove duplicates (going through a set does not preserve order)
usable = [list(x) for x in set(tuple(x) for x in usable)]
un_usable = [list(x) for x in set(tuple(x) for x in un_usable)]
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
Is there a better, faster, and more Pythonic way to achieve the same thing?
Answer 0 (score: 1)
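Assuming I have understood correctly, here is a two-pass algorithm that runs in O(N), where N is the total number of elements in the zero-score lists.
Steps: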
def select_lsts(lsts, scores):
    # pick out zero score lists
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    # keep track of the shortest length of any list in which an element occurs
    len_shortest = dict()
    for lst in z_lsts:
        ln = len(lst)
        for c in lst:
            len_shortest[c] = min(ln, len_shortest.get(c, float('inf')))
    # check if the list is of minimum length for each of its chars
    for lst in z_lsts:
        len_lst = len(lst)
        if any(len_shortest[c] < len_lst for c in lst):
            continue
        yield lst
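To show how this could be plugged into the question's code, here is a usage sketch that rebuilds final from the data in the question's edit. The function is repeated in a slightly condensed form so that the snippet runs on its own; the way final is assembled is taken from the question, not something the function does itself.

```python
def select_lsts(lsts, scores):
    # same logic as the function above, condensed
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    len_shortest = {}
    for lst in z_lsts:
        for c in lst:
            len_shortest[c] = min(len(lst), len_shortest.get(c, float('inf')))
    for lst in z_lsts:
        # keep the list only if no element of it also occurs in a shorter list
        if all(len_shortest[c] >= len(lst) for c in lst):
            yield lst


l1 = [['a', 'b', 'c'],
      ['a', 'd', 'c'],
      ['a', 'e'],
      ['a', 'd', 'c'],
      ['a', 'f', 'c'],
      ['a', 'e'],
      ['p', 'q', 'r'],
      ['a', 'k']]
l2 = [1, 1, 1, 2, 0, 0, 0, 0]

# lists with a positive score are kept unchanged, as in the question
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
final = lx + [(x, 0) for x in select_lsts(l1, l2)]

for item in final:
    print(item)
# (['a', 'b', 'c'], 1)
# (['a', 'd', 'c'], 1)
# (['a', 'e'], 1)
# (['a', 'd', 'c'], 2)
# (['a', 'e'], 0)
# (['p', 'q', 'r'], 0)
# (['a', 'k'], 0)
```

Unlike the set-based deduplication in the question, the selected lists come out in their original order, and overlapping lists of equal length (['a', 'e'] and ['a', 'k']) are both kept.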