I'm trying to find the fastest way to count how many times two values appear one right after the other in a numpy list.
For example:
list = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
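and I want to count the number of times the value 1 follows the value 2 (but not the other way around). In the example above, the answer should be 1, because 1 follows 2 only once.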
Obviously, I can find the answer with a simple for loop that increments a counter each time item i equals 1 and item i-1 equals 2, but I think there must be a faster way.
Thanks
Answer 0 (score: 5)
import numpy as np
mylist = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
# Turn your list into a numpy array
myarray = np.array(mylist)
# find occurrences where myarray is 2 and the next element is one less (that is, 1)
np.sum((myarray[:-1] == 2) & (np.diff(myarray) == -1))
which returns 1
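The expression above works because the follower (1) is exactly one less than the leading value (2). As a minimal sketch for arbitrary pairs, the same idea can be written by comparing the array with a view of itself shifted by one position. The function name `count_followed_by` and its parameters are hypothetical, chosen for illustration; they are not part of the original answer.

```python
import numpy as np


def count_followed_by(values, first, second):
    """Count positions i where values[i] == first and values[i + 1] == second."""
    arr = np.asarray(values)
    # arr[:-1] holds every element except the last, arr[1:] every element
    # except the first, so the two slices line up each item with its successor.
    matches = (arr[:-1] == first) & (arr[1:] == second)
    return int(np.count_nonzero(matches))


mylist = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
print(count_followed_by(mylist, 2, 1))  # 1
print(count_followed_by(mylist, 1, 2))  # 2
```

Comparing two slices avoids the arithmetic in `np.diff`, so it also applies to pairs whose difference is not fixed, and to non-numeric arrays.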
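Timing on a large array:
On a very small list, the time difference between the iterative approaches and the numpy approach will not be noticeable. But on a large array (as in the example below), numpy performs much better.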
import timeit
import numpy as np

mylist = np.random.choice(range(0,9), 1000000)

def np_method(mylist = mylist):
    return np.sum((mylist[:-1] == 2) & (np.diff(mylist) == -1))

def zip_loop(a = mylist):
    return len( [1 for i,j in zip(a, a[1:]) if i == 2 and j == 1] )

def for_loop(list1 = mylist):
    count=0
    desired_num=2
    follower_num=1
    for i in range(len(list1)-1):
        if list1[i]==desired_num:
            if list1[i+1]==follower_num:
                count+=1
    return count

>>> timeit.timeit(np_method, number = 100) / 100
0.006748438189970329
>>> timeit.timeit(zip_loop, number = 100) / 100
0.3811768989200209
>>> timeit.timeit(for_loop, number = 100) / 100
0.3774999916599336
Answer 1 (score: 1)
The simplest way I can think of is to use a for loop
list1 = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]  # sample data from the question
count=0
desired_num=2
follower_num=1
for i in range(len(list1)-1):
    if list1[i]==desired_num:
        if list1[i+1]==follower_num:
            count+=1
print("total occurrences=",count)
Time taken: 0.0003437995910644531 s on my machine
Answer 2 (score: 0)
I suggest using slicing and a comprehension to walk through your input list, like this:
myList = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
result = sum(myList[i:i+2] == [2,1] for i in range(len(myList)-1))
print(result) # 1
The zip() function can also help:
myList = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
result = sum((i,j) == (2,1) for (i,j) in zip(myList, myList[1:]))
print(result) # 1
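Both versions above create temporary lists: `myList[i:i+2]` for every index, or the copy `myList[1:]`. A possible variant, sketched here with the standard library's `itertools.islice`, pairs each item with its successor without copying the list. The helper name `count_pair` is hypothetical, and the sketch assumes the input is a sequence such as a list, not a one-shot iterator.

```python
from itertools import islice


def count_pair(items, first, second):
    """Count how often `second` comes immediately after `first` in `items`."""
    # islice(items, 1, None) iterates from the second element onward
    # without creating a sliced copy of the list.
    successors = islice(items, 1, None)
    return sum(1 for a, b in zip(items, successors) if a == first and b == second)


myList = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
print(count_pair(myList, 2, 1))  # 1
```

Whether this is measurably faster than slicing depends on the list size; it mainly saves memory on long inputs.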
Answer 3 (score: -1)
You shouldn't name a variable list: that name is already used by Python, and reusing it is very confusing.
>>> a = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]
>>> len( [1 for i,j in zip(a, a[1:]) if i == 2 and j == 1] )
1
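Basically, you use zip() to lay the array over itself, shifted by one position, and process the resulting pairs of numbers to look for any combination: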
>>> list(zip(a, a[1:]))
[(1, 5), (5, 4), (4, 1), (1, 2), (2, 4), (4, 6), (6, 7), (7, 2), (2, 1), (1, 3), (3, 3), (3, 1), (1, 2)]
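Since the pairs are ordinary tuples, every combination can be tallied in one pass. The following is a small sketch using `collections.Counter` from the standard library; the variable name `pair_counts` is only illustrative and does not come from the original answer.

```python
from collections import Counter

a = [1, 5, 4, 1, 2, 4, 6, 7, 2, 1, 3, 3, 1, 2]

# Tally every (item, next item) pair in a single pass.
pair_counts = Counter(zip(a, a[1:]))

print(pair_counts[(2, 1)])  # 1
print(pair_counts[(1, 2)])  # 2
print(pair_counts[(9, 9)])  # 0, a Counter returns 0 for pairs never seen
```

This is convenient when several different pairs need to be looked up, because the list is scanned only once.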
Answer 4 (score: -1)
Just for fun, I timed all 4 main solutions; here are the results:
#!/usr/bin/env python
import numpy as np
import random

def f1(li):
    return np.sum((np.array(li[:-1]) == 2) & (np.diff(li) == -1))

def f2(li):
    return sum((i,j) == (2,1) for (i,j) in zip(li, li[1:]))

def f3(li):
    count=0
    desired_num=2
    follower_num=1
    for i in range(len(li)-1):
        if li[i]==desired_num:
            if li[i+1]==follower_num:
                count+=1
    return count

def f4(li):
    return len( [1 for i,j in zip(li, li[1:]) if i == 2 and j == 1] )

if __name__=='__main__':
    import timeit
    # build a plain Python list of 10 million random integers
    s = []
    for i in range(10000000):
        s.append( random.randint(1,10) )
    print(f1(s), f2(s), f3(s), f4(s))
    print(f1(s)==f2(s)==f3(s)==f4(s))
    for f in (f1,f2,f3,f4):
        print(" {:^10s}{:.4f} secs".format(f.__name__, timeit.timeit("f(s)", setup="from __main__ import f, s", number=10)))

'''
output:
100236 100236 100236 100236
True
f1 7.2285 secs
f2 13.7680 secs
f3 4.3167 secs
f4 7.7375 secs
'''
Surprisingly, the simple for loop beats numpy =)
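One likely explanation, offered as an assumption and not a measurement: f1 is given a plain Python list, so every call converts ten million items to an array (once in `np.array` and again inside `np.diff`) before any comparison happens. The sketch below separates the conversion from the counting. The helpers `count_on_array` and `count_on_list` are hypothetical, the list is smaller than in the benchmark above to keep the run short, and timings are printed, not quoted, because they depend on the machine.

```python
import random
import timeit

import numpy as np


def count_on_array(arr):
    """Count 2-followed-by-1 pairs in an existing numpy array."""
    return int(np.sum((arr[:-1] == 2) & (np.diff(arr) == -1)))


def count_on_list(li):
    """Same count, but pay for the list-to-array conversion on every call."""
    return count_on_array(np.array(li))


s = [random.randint(1, 10) for _ in range(200_000)]
arr = np.array(s)  # convert once, outside the timed code

# Both paths must agree on the result before their speed is compared.
assert count_on_array(arr) == count_on_list(s)

for label, func, data in (("array", count_on_array, arr), ("list", count_on_list, s)):
    seconds = timeit.timeit(lambda: func(data), number=10)
    print("{:<6s}{:.4f} secs".format(label, seconds))
```

If the data already lives in a numpy array, as in answer 0, the conversion cost disappears and the vectorized version is the fairer comparison.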