x = [8,2,3,4,5]
y = [6,3,7,2,1]
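How can I find the first common element of the two lists (in this case "2") in a concise, elegant way? Either list may be empty, or the lists may have no elements in common; in that case None is fine.
I need this to show Python to someone who is new to it, so the simpler the better.
UPD: The order does not matter for my purposes, but let's assume I am looking for the first element in x that also occurs in y.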
Answer 0 (score: 9)
This should be straightforward and almost as efficient as it gets (for a more efficient solution see Ashwini Chaudhary's answer, and for the most efficient see jamylak's answer and comments):
result = None
# Go through one list
for i in x:
    # The element repeats in the other list...
    if i in y:
        # Store the result and break the loop
        result = i
        break
Or, more elegantly, encapsulate the same functionality in a function, using PEP 8-like coding style conventions:
def get_first_common_element(x, y):
    """Fetch the first element from x that is common to both lists,
    or return None if no such element is found.
    """
    for i in x:
        if i in y:
            return i
    # If no common element is found, you could raise an exception.
    # Or, if having no common element is a valid and common state of your
    # application, you could simply return None and test the return value.
    # raise Exception('No common element found')
    return None
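The question mentions that either list may be empty or that there may be no shared element. A quick way to check those edge cases might look like the sketch below. It assumes Python 3 and repeats the function so the snippet runs on its own; apart from the lists taken from the question, the sample inputs are made up for illustration.

```python
def get_first_common_element(x, y):
    """Return the first element of x that also occurs in y, or None."""
    for i in x:
        if i in y:
            return i
    return None


# The first call uses the lists from the question; the other inputs
# are illustrative assumptions.
print(get_first_common_element([8, 2, 3, 4, 5], [6, 3, 7, 2, 1]))  # 2
print(get_first_common_element([], [6, 3, 7]))                     # None
print(get_first_common_element([8, 2], []))                        # None
print(get_first_common_element([1, 2], [3, 4]))                    # None
```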
If you want all the common elements, you can do this:
>>> [i for i in x if i in y]
[2, 3]
Answer 1 (score: 8)
Sorting is not the fastest way to do this; it can be done in O(N) time using a set (hash map).
>>> x = [8,2,3,4,5]
>>> y = [6,3,7,2,1]
>>> set_y = set(y)
>>> next((a for a in x if a in set_y), None)
2
Or:
next(ifilter(set(y).__contains__, x), None)
This is what it does:
>>> def foo(x, y):
...     seen = set(y)
...     for item in x:
...         if item in seen:
...             return item
...     else:
...         return None
...
>>> foo(x, y)
2
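The ifilter call above exists only in Python 2. As a sketch, assuming Python 3, the built-in filter is already lazy and can take its place. The name first_common is a hypothetical helper introduced just for this example, and the elements are assumed to be hashable.

```python
def first_common(x, y):
    """Return the first item of x that also occurs in y, or None."""
    # Build the set once so each membership test is O(1) on average.
    set_y = set(y)
    # In Python 3, filter() returns a lazy iterator, so next() stops
    # at the first match instead of scanning all of x.
    return next(filter(set_y.__contains__, x), None)


x = [8, 2, 3, 4, 5]
y = [6, 3, 7, 2, 1]
print(first_common(x, y))   # 2
print(first_common([], y))  # None
```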
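To show the time differences between the different approaches (naive approach, binary search and sets), here are some timings. I had to do this to disprove those who believed binary search was faster:...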
# Python 2 script (ifilter, xrange, print statement, thread module)
from itertools import ifilter
from bisect import bisect_left

a = [1, 2, 3, 9, 1, 1] * 100000
b = [44, 11, 23, 9, 10, 99] * 10000

c = [1, 7, 2, 4, 1, 9, 9, 2] * 1000000 # repeats early
d = [7, 6, 11, 13, 19, 10, 19] * 1000000

e = range(50000)
f = range(40000, 90000) # repeats in the middle

g = [1] * 10000000 # no repeats at all
h = [2] * 10000000

from random import randrange
i = [randrange(10000000) for _ in xrange(5000000)] # some randoms
j = [randrange(10000000) for _ in xrange(5000000)]

def common_set(x, y, ifilter=ifilter, set=set, next=next):
    return next(ifilter(set(y).__contains__, x), None)

def common_b_sort(x, y, bisect=bisect_left, sorted=sorted, min=min, len=len):
    sorted_y = sorted(y)
    for a in x:
        if a == sorted_y[min(bisect_left(sorted_y, a), len(sorted_y)-1)]:
            return a
    else:
        return None

def common_naive(x, y):
    for a in x:
        for b in y:
            if a == b: return a
    else:
        return None

from timeit import timeit
from itertools import repeat
import threading, thread

print 'running tests - time limit of 20 seconds'

for x, y in [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('i', 'j')]:
    for func in ('common_set', 'common_b_sort', 'common_naive'):
        try:
            timer = threading.Timer(20, thread.interrupt_main) # 20 second time limit
            timer.start()
            res = timeit(stmt="print '[', {0}({1}, {2}), ".format(func, x, y),
                         setup='from __main__ import common_set, common_b_sort, common_naive, {0}, {1}'.format(x, y),
                         number=1)
        except: # interrupt_main raises KeyboardInterrupt in the main thread
            res = "Too long!!"
        finally:
            print '] Function: {0}, {1}, {2}. Time: {3}'.format(func, x, y, res)
            timer.cancel()
Results:
running tests - time limit of 20 seconds
[ 9 ] Function: common_set, a, b. Time: 0.00569520707241
[ 9 ] Function: common_b_sort, a, b. Time: 0.0182240340602
[ 9 ] Function: common_naive, a, b. Time: 0.00978832505249
[ 7 ] Function: common_set, c, d. Time: 0.249175872911
[ 7 ] Function: common_b_sort, c, d. Time: 1.86735751332
[ 7 ] Function: common_naive, c, d. Time: 0.264309220865
[ 40000 ] Function: common_set, e, f. Time: 0.00966861710078
[ 40000 ] Function: common_b_sort, e, f. Time: 0.0505980508696
[ ] Function: common_naive, e, f. Time: Too long!!
[ None ] Function: common_set, g, h. Time: 1.11300018578
[ None ] Function: common_b_sort, g, h. Time: 14.9472068377
[ ] Function: common_naive, g, h. Time: Too long!!
[ 5411743 ] Function: common_set, i, j. Time: 1.88894859542
[ 5411743 ] Function: common_b_sort, i, j. Time: 6.28617268396
[ 5411743 ] Function: common_naive, i, j. Time: 1.11231867458
This gives you an idea of how it scales to larger inputs: O(N) vs. O(N log N) vs. O(N^2).
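For readers on Python 3, a much smaller version of the set versus naive comparison could be sketched as follows. The input sizes are an assumption chosen so that the quadratic version still finishes quickly, and the timings will differ from the figures above, which came from the Python 2 script.

```python
from timeit import timeit


def common_set(x, y):
    """O(N): one pass to build the set, then one pass over x."""
    set_y = set(y)
    return next((a for a in x if a in set_y), None)


def common_naive(x, y):
    """O(N^2) in the worst case: `in` scans the list y for every a."""
    return next((a for a in x if a in y), None)


# Deliberately small inputs (an assumption for this sketch); the
# overlap starts in the middle, like the e/f case above.
e = list(range(2000))
f = list(range(1500, 3500))  # first common element is 1500

for func in (common_set, common_naive):
    seconds = timeit(lambda: func(e, f), number=1)
    print(func.__name__, func(e, f), seconds)
```

Both functions return the same element; only the time taken should differ, and the gap widens as the lists grow.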
Answer 2 (score: 6)
One-liner:
x = [8,2,3,4,5]
y = [6,3,7,2,1]
first = next((a for a in x if a in y), None)
Or, more efficiently:
set_y = set(y)
first = next((a for a in x if a in set_y), None)
Or more efficient but still on one line (don't do this):
first = next((lambda set_y: (a for a in x if a in set_y))(set(y)), None)
Answer 3 (score: 3)
Using a for loop together with in results in O(N^2) complexity, but you can sort y here and use binary search to improve the time complexity to O(N log N):
def binary_search(lis, num):
    low = 0
    high = len(lis) - 1
    ret = -1  # return -1 if item is not found
    while low <= high:
        mid = (low + high) // 2
        if num < lis[mid]:
            high = mid - 1
        elif num > lis[mid]:
            low = mid + 1
        else:
            ret = mid
            break
    return ret

x = [8,2,3,4,5]
y = [6,3,7,2,1]
y.sort()
for z in x:
    ind = binary_search(y, z)
    if ind != -1:
        print z
        break
Output:
2
The same can be done with the bisect module.
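A minimal sketch of the bisect variant, assuming Python 3. The helper name first_common_bisect is made up for this example, and this is not the answer's original code.

```python
from bisect import bisect_left


def first_common_bisect(x, y):
    """Return the first item of x that also occurs in y, or None."""
    sorted_y = sorted(y)  # O(N log N); leaves the caller's y untouched
    for item in x:
        # bisect_left gives the leftmost position where item could be
        # inserted while keeping sorted_y ordered.
        pos = bisect_left(sorted_y, item)
        if pos < len(sorted_y) and sorted_y[pos] == item:
            return item
    return None


x = [8, 2, 3, 4, 5]
y = [6, 3, 7, 2, 1]
print(first_common_bisect(x, y))  # 2
```

Using sorted(y) instead of y.sort() is a design choice of this sketch: it avoids reordering the caller's list as a side effect.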
Answer 4 (score: 3)
I take it that you want to teach this person Python, not just programming. So I do not hesitate to use zip instead of ugly loop variables; it is a very useful part of Python and not hard to explain.
def first_common(x, y):
    common = set(x) & set(y)
    for current_x, current_y in zip(x, y):
        if current_x in common:
            return current_x
        elif current_y in common:
            return current_y

print first_common([8,2,3,4,5], [6,3,7,2,1])
If you really don't want to use zip, here is how to do it without:
def first_common2(x, y):
    common = set(x) & set(y)
    for i in xrange(min(len(x), len(y))):
        if x[i] in common:
            return x[i]
        elif y[i] in common:
            return y[i]
For those interested, this is how it extends to any number of sequences:
def first_common3(*seqs):
    common = set.intersection(*[set(seq) for seq in seqs])
    for current_elements in zip(*seqs):
        for element in current_elements:
            if element in common:
                return element
Finally, note that, in contrast to some of the other solutions, this also works if the first common element appears first in the second list.
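To make that remark concrete, here is a small sketch, assuming Python 3. It repeats the zip-based function and compares it with a version that scans only x. The name first_in_x and the sample lists are assumptions made up for this illustration.

```python
def first_common(x, y):
    """Zip-based version: walks both lists in lockstep."""
    common = set(x) & set(y)
    for current_x, current_y in zip(x, y):
        if current_x in common:
            return current_x
        elif current_y in common:
            return current_y
    return None


def first_in_x(x, y):
    """Scans only x and returns its first element that is also in y."""
    set_y = set(y)
    return next((a for a in x if a in set_y), None)


# Illustrative inputs: 7 sits at index 0 of y, while the earliest
# common element in x is 5 at index 1.
x = [9, 5, 7]
y = [7, 1, 5]
print(first_common(x, y))  # 7
print(first_in_x(x, y))    # 5
```

Which result counts as "first" depends on the definition you pick, so it is worth deciding that before choosing between the two.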
I just noticed your update, which makes for an even simpler solution:
def first_common4(x, y):
    ys = set(y)  # We don't want this to be recreated for each element in x
    for element in x:
        if element in ys:
            return element
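The above is arguably more readable than the generator expression.
Too bad there is no built-in ordered set. It would have provided a more elegant solution.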
Answer 5 (score: 1)
Using for loops seems the easiest to explain to someone new.
for number1 in x:
    for number2 in y:
        if number1 == number2:
            print number1, number2
            print x.index(number1), y.index(number2)
            exit(0)
print "No common numbers found."
NB: Not tested, just off the top of my head.
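Answer 6 (score: 1)
This one uses sets. It returns the first common element, or None if there is no common element.
def findcommon(x, y):
    common = None
    # Compare growing prefixes; the range starts at 1 and includes the
    # full length so that the last elements are considered too
    for i in range(1, max(len(x), len(y)) + 1):
        common = set(x[0:i]).intersection(set(y[0:i]))
        if common: break
    return list(common)[0] if common else None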
Answer 7 (score: 1)
def first_common_element(x, y):
    common = set(x).intersection(set(y))
    if common:
        return x[min([x.index(i) for i in common])]
Answer 8 (score: 1)
Just for fun (probably not efficient), another version using itertools:
from itertools import dropwhile, product
from operator import __ne__

def accept_pair(f):
    "Make a version of f that takes a pair instead of 2 arguments."
    def accepting_pair(pair):
        return f(*pair)
    return accepting_pair

def get_first_common(x, y):
    try:
        # I think this *_ unpacking syntax works only in Python 3
        ((first_common, _), *_) = dropwhile(
            accept_pair(__ne__),
            product(x, y))
    except ValueError:
        return None
    return first_common

x = [8, 2, 3, 4, 5]
y = [6, 3, 7, 2, 1]
print(get_first_common(x, y))  # 2
y = [6, 7, 1]
print(get_first_common(x, y))  # None
Using lambda pair: pair[0] != pair[1] instead of accept_pair(__ne__) is simpler, but less fun.
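As a sketch of that simpler lambda variant, assuming Python 3: note that it also swaps the star-unpacking for next() with a default, which is a change from the code above rather than part of the original answer.

```python
from itertools import dropwhile, product


def get_first_common(x, y):
    """Return the first element of x that also occurs in y, or None."""
    # product(x, y) yields (x[0], y[0]), (x[0], y[1]), ... so x is the
    # outer loop; dropwhile skips pairs until the two members are equal.
    matches = dropwhile(lambda pair: pair[0] != pair[1], product(x, y))
    # next() with a default avoids the try/except around unpacking and
    # stops at the first matching pair.
    first_pair = next(matches, None)
    return None if first_pair is None else first_pair[0]


print(get_first_common([8, 2, 3, 4, 5], [6, 3, 7, 2, 1]))  # 2
print(get_first_common([8, 2, 3, 4, 5], [6, 7, 1]))        # None
```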
Answer 9 (score: 0)
This uses set and is a generic solution for any number of lists:
def first_common(*lsts):
    common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
    if not common:
        return None
    firsts = [min(lst.index(el) for el in common) for lst in lsts]
    index_in_list = min(firsts)
    trgt_lst_index = firsts.index(index_in_list)
    return lsts[trgt_lst_index][index_in_list]
An afterthought: the above is not an efficient solution; this one reduces the redundant overhead:
import itertools  # Python 2: izip_longest; reduce is a built-in

def first_common(*lsts):
    common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
    if not common:
        return None
    for lsts_slice in itertools.izip_longest(*lsts):
        slice_intersection = common.intersection(lsts_slice)
        if slice_intersection:
            return slice_intersection.pop()
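The code above targets Python 2. A possible Python 3 port is sketched below, assuming functools.reduce and itertools.zip_longest as the replacements. The padding sentinel and the guard for an empty argument list are additions of this sketch, not part of the original answer.

```python
from functools import reduce
from itertools import zip_longest


def first_common(*lsts):
    """Return an element shared by all lists, at the earliest index."""
    if not lsts:
        return None
    common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
    if not common:
        return None
    # A private sentinel pads the shorter lists, so a real None in the
    # data is never confused with padding.
    pad = object()
    for lsts_slice in zip_longest(*lsts, fillvalue=pad):
        slice_intersection = common.intersection(lsts_slice)
        if slice_intersection:
            return slice_intersection.pop()


print(first_common([9, 5, 7], [7, 1, 5]))             # 7
print(first_common([1, 2, 3], [4, 2, 5], [6, 7, 2]))  # 2
print(first_common([1], [2]))                         # None
```

When several common elements share the same index across the lists, pop() picks one of them arbitrarily, so the result is only deterministic when a single common element appears at the earliest index.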