如何在Python中使用集来查找列表成员资格?

时间:2013-01-15 07:02:41

标签: python set

假设:

A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

我想知道A中是否存在['Yes', X, 'No'],其中X是我不在乎的任何内容。

我试过了:

valid = False
for n in A:
    if n[0] == 'Yes' and n[2] == 'No':
        valid = True

我知道set()在这种情况下非常有用。但是怎么做呢?这可能吗?或者我最好坚持使用原始代码?

4 个答案:

答案 0 :(得分:8)

如果您想要检查是否存在,您可以['Yes', 'No'] in A

In [1]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [2]: ['Yes', 'No'] in A
Out[2]: True

下一个案例试试:

In [3]: A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

In [4]: any(i[0]=='Yes' and i[2] == 'No' for i in A)
Out[4]: True

或者您可以定义一个小函数:

In [5]: def want_to_know(l,item):
   ...:     for i in l:
   ...:         if i[0] == item[0] and i[2] == item[2]:
   ...:             return True
   ...:     return False

In [6]: want_to_know(A,['Yes', 'xxx', 'No'])
Out[6]: True

any(i[0]=='Yes' and i[2] == 'No' for i in A*10000)实际上似乎比转换本身快10倍。

In [8]: %timeit any({(x[0],x[-1]) == ('Yes','No') for x in A*10000})
100 loops, best of 3: 14 ms per loop

In [9]: % timeit {tuple([x[0],x[-1]]) for x in A*10000}
10 loops, best of 3: 33.4 ms per loop

In [10]: %timeit any(i[0]=='Yes' and i[2] == 'No' for i in A*10000)
1000 loops, best of 3: 334 us per loop

答案 1 :(得分:3)

首先将list转换为set,因为这会缩短从O(n)O(1)的查找时间:

In [27]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [28]: s=set(tuple(map(tuple,A)))

In [29]: s
Out[29]: set([('Yes', 'No'), ('No', 'Idontknow'), ('Yes', 'Idontknow'), ('No', 'Yes')])

In [30]: ('Yes', 'No') in s
Out[30]: True

timeit比较:

%timeit ['Yes', 'No'] in A
1000000 loops, best of 3: 504 ns per loop  

%timeit ('Yes', 'No') in s
1000000 loops, best of 3: 442 ns per loop       #winner

%timeit ['No', 'Idontknow'] in A
1000000 loops, best of 3: 861 ns per loop

%timeit ('No', 'Idontknow') in s
1000000 loops, best of 3: 461 ns per loop       #winner

修改

如果您只对第一个和最后一个元素感兴趣:

In [69]: A = [['Yes', 'No'], ['Yes', 'Idontknow','hmmm'], ['No', 'Yes'], ['No', 'Idontknow']]

In [70]: s={tuple([x[0],x[-1]]) for x in A} # -1 or 2, change as per your requirement
                                         #or set(tuple([x[0],x[-1]]) for x in A)


In [71]: s
Out[71]: set([('Yes', 'No'), ('Yes', 'hmmm'), ('No', 'Idontknow'), ('No', 'Yes')])

In [73]: ('Yes', 'hmmm') in s
Out[73]: True

timeitany()进行比较:

In [77]: %timeit ('Yes', 'hmmm') in s
1000000 loops, best of 3: 428 ns per loop      #winner

In [78]: %timeit any(x[0]=="Yes" and x[-1]=="hmmm" for x in A)
100000 loops, best of 3: 2.87 us per loop

答案 2 :(得分:0)

Set不支持列表,您可以将其转换为元组,

A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]
valid = ('Yes', 'No') in {tuple(item) for item in A}

并且正如@ IgnacioVazquez-Abrams所提到的,从列表到元组的转换是O(n),所以如果你知道性能,你需要选择其他方法。

答案 3 :(得分:0)

以下是如何使用Set()。

>>> A = Set([('Yes', 'No'), ('Yes', 'Idontknow'), ('No', 'Yes'), ('No', 'Idontknow')])
>>> ('Yes','No') in A
True
>>> 

Set的元素应该是hashable ..所以我使用元组作为Set元素而不是列表。