How can I remove lists from a list of lists based on a given list of tuples?

Time: 2018-12-07 08:14:01

Tags: python python-3.x pyspark

I have two lists like the ones below.

l=[['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'],['B','C']]
x=[('A', 'B'), ('A', 'C')]

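I want to remove from l every list that does not contain all the elements of at least one tuple in x.

For example, l contains the list ['B', 'C'], but x has no tuple combining B and C, and no single tuple in x is fully contained in that list, so ['B', 'C'] must be removed. My expected output is:

[['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B']]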

4 answers:

Answer 0: (score: 3)

You can use a regular list comprehension together with all and any:

>>> [l_ for l_ in l if any(all(e in l_ for e in x_) for x_ in x)]
[['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B']]
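
The condition `all(e in l_ for e in x_)` is a subset test, so the same filter can also be sketched with sets. This variant is not part of the answer above; it assumes the elements are hashable, and the names `required` and `result` are made up for illustration.

```python
l = [['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'], ['B', 'C']]
x = [('A', 'B'), ('A', 'C')]

# Convert each tuple to a set once, so it is not rebuilt for every sublist.
required = [set(x_) for x_ in x]

# Keep a sublist if at least one tuple is entirely contained in it.
result = [l_ for l_ in l if any(r.issubset(l_) for r in required)]
print(result)
```

Order and duplicates in l are preserved, because only the tuples are converted to sets.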

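Answer 1: (score: 2)

Although all of the solutions above get the job done, one of them is considerably faster. Below are the measured times (in seconds) for 100,000 runs of each solution.

  • schwobaseggl: 0.80720812
  • long_for_loop: 0.7031608010000001
  • shlomiLan: 0.24393211999999997
  • Miklos_Horvath: 0.683809444 (not yet converted back to a list)

As shown above, shlomiLan's solution is almost 3 times faster than any of the others. The code used to obtain these results can be seen here: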

import timeit
setup = """
l = [['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'],['B','C']]
x = [('A', 'B'), ('A', 'C')]"""

schwobaseggl = "[l_ for l_ in l if any(all(e in l_ for e in x_) for x_ in x)]"

long_for_loop = '''
c = []
for l_ in l:
    d = []
    for x_ in x:
        a = []
        for e in x_:
            if e in l_:
                a.append(True)
            else:
                a.append(False)
                break
        if all(a):
            d.append(True)
    if any(d):
        c.append(l_)
'''

shlomiLan = """new_list = []
for i in l:
    # Must use a flag because l contains the same item twice (['A', 'B', 'C']),
    # so an "if i not in new_list" check before append cannot be used
    is_i_added = False
    for z in x:
        if is_i_added:
            continue

        j_not_in_i = False
        for j in z:
            if j not in i:
                j_not_in_i = True

        if not j_not_in_i:
            new_list.append(i)
            is_i_added = True"""

miklos_Horvath = "l = list(filter(lambda m: any(all(e in m for e in y) for y in x), l))"

a = timeit.timeit(setup=setup, stmt=schwobaseggl, number=100000)
b = timeit.timeit(setup=setup, stmt=long_for_loop, number=100000)
c = timeit.timeit(setup=setup, stmt=shlomiLan, number=100000)
d = timeit.timeit(setup=setup, stmt=miklos_Horvath, number=100000)

# Show the total time in seconds for each solution
print(a, b, c, d)
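
Timings like these depend on the machine and the Python version, so it can help to rerun them locally. The following is only a sketch of one way to do that, comparing two of the solutions as functions. The function names and the reduced run count are my own choices. As far as I can tell, the miklos_Horvath statement above rebinds l inside the timed loop, so later iterations see the already filtered list; returning a new list avoids that.

```python
import timeit

l = [['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'], ['B', 'C']]
x = [('A', 'B'), ('A', 'C')]

def comprehension():
    return [l_ for l_ in l if any(all(e in l_ for e in x_) for x_ in x)]

def with_filter():
    # Returns a new list instead of rebinding l, so every run sees the same input.
    return list(filter(lambda m: any(all(e in m for e in y) for y in x), l))

# Check that both candidates agree before comparing their speed.
assert comprehension() == with_filter()

for func in (comprehension, with_filter):
    # repeat() returns one total per round; the minimum is the least noisy value.
    best = min(timeit.repeat(func, number=10000, repeat=3))
    print(f"{func.__name__}: {best:.4f} s per 10,000 runs")
```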

Answer 2: (score: 1)

To me, this is too complex for a one-liner.

new_list = []
for i in l:
    # Must use a flag because l contains the same item twice (['A', 'B', 'C']),
    # so an "if i not in new_list" check before append cannot be used
    is_i_added = False
    for z in x:
        if is_i_added:
            continue

        j_not_in_i = False
        for j in z:
            if j not in i:
                j_not_in_i = True

        if not j_not_in_i:
            new_list.append(i)
            is_i_added = True

print(new_list)
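
The flag in the loop above only prevents the same sublist from being appended twice. A possible variation, not the answerer's code, gets the same effect by leaving the inner loop as soon as one tuple matches. The function name `filter_by_tuples` and its parameter names are hypothetical.

```python
def filter_by_tuples(lists, tuples):
    """Return the sublists that contain every element of at least one tuple."""
    kept = []
    for sub in lists:
        for tup in tuples:
            if all(item in sub for item in tup):
                kept.append(sub)
                break  # one matching tuple is enough; also avoids adding sub twice
    return kept

l = [['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'], ['B', 'C']]
x = [('A', 'B'), ('A', 'C')]
print(filter_by_tuples(l, x))
```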

Answer 3: (score: 0)

Use filter like this:

l=[['A', 'B', 'C'], ['A', 'C'], ['A', 'B', 'C'], ['A', 'B'],['B','C']]
x=[('A', 'B'), ('A', 'C')]

l = list(filter(lambda m: any(all(e in m for e in y) for y in x), l))

print(l)