I need to compute the CRC16 of binary numbers many times inside a loop. I am using the following approach:
import numpy as np
import binascii
#I have just filled the array with random numbers
#These arrays are loaded from a file
array1=np.random.randint(0,511, size=100000)
array2=np.random.randint(0,511, size=100000)
#...
#This goes on to till say array100
#Now calculate crc of each row in a loop
for j in range(100000):
    crc=0xffff
    #Convert the number to binary 16 bit format
    temp_bin=np.binary_repr(array1[j], 16)
    crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
    #Similarly for array2
    temp_bin=np.binary_repr(array2[j], 16)
    crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
    #...
    #This goes on till array100
Although this approach works, it is very slow. Profiling shows that converting each number to binary is the main bottleneck in my code.
Total time: 10.9712 s
File: speedup.py
Function: abc at line 7
Line #      Hits         Time  Per Hit   % Time  Line Contents
     7                                           @profile
     8                                           def abc():
     9                                               #I have just filled the array with random numbers
    10                                               #Thse arrays are loaded from a file
    11         1       3269.0   3269.0      0.0      array1=np.random.randint(0,511, size=100000)
    12         1       3206.0   3206.0      0.0      array2=np.random.randint(0,511, size=100000)
    13                                               #...
    14                                               #This goes on to till say array100
    15                                               #Now calculate crc of each row in a loop
    16    100001     237461.0      2.4      2.2      for j in range(100000):
    17    100000     199887.0      2.0      1.8          crc=0xffff
    18                                                   #Convert the number to binary 16 bit format
    19    100000    3436116.0     34.4     31.3          temp_bin=np.binary_repr(array1[j], 16)
    20    100000    1039049.0     10.4      9.5          crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    21    100000     793751.0      7.9      7.2          crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
    22                                                   ##Similarly for array2
    23    100000    3423862.0     34.2     31.2          temp_bin=np.binary_repr(array2[j], 16)
    24    100000     991331.0      9.9      9.0          crc=binascii.crc_hqx(chr(int(temp_bin[0:8],2)), crc)
    25    100000     843271.0      8.4      7.7          crc=binascii.crc_hqx(chr(int(temp_bin[8:16],2)), crc)
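I can't think of an alternative that avoids it. Is there a more efficient and Pythonic way to convert the numbers to binary, or to do the whole thing?
Answer 0 (score: 0)
Looking at the code, you can skip the round trip through a string and back. In particular, you zero-pad an 8-bit binary value to 16 bits only to split it in half again. Instead, try: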
zb = np.zeros(1, dtype=np.uint8)[0].tobytes()
for j in range(100000):
    crc=0xffff
    tmp_data = array1[j].tobytes()
    crc=binascii.crc_hqx(zb, crc)
    crc=binascii.crc_hqx(tmp_data, crc)
    tmp_data = array2[j].tobytes()
    crc=binascii.crc_hqx(zb, crc)
    crc=binascii.crc_hqx(tmp_data, crc)
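One thing to watch with `tobytes()`: it returns every byte of the element in the array's own dtype and byte order, so an element of a default integer array usually yields more than two bytes, and `randint(0, 511)` can produce values that need 9 bits. Here is a minimal sketch, assuming all values fit in 16 bits, that casts to big-endian `uint16` first so the bytes come out in the same high-byte, low-byte order as in the question. Stacking the arrays as columns and the name `rows` are illustrative choices, not part of the original answer.

```python
import binascii
import numpy as np

# Stand-in data, as in the question; the real arrays are loaded from a file.
array1 = np.random.randint(0, 511, size=1000)
array2 = np.random.randint(0, 511, size=1000)

# Stack as columns so row j holds [array1[j], array2[j], ...], then cast to
# big-endian unsigned 16-bit: each value becomes high byte, then low byte.
rows = np.column_stack((array1, array2)).astype('>u2')

# One crc_hqx call per row covers all bytes of that row (Python 3: bytes input).
crcs = [binascii.crc_hqx(row.tobytes(), 0xFFFF) for row in rows]
```

Because the CRC is computed as a running value, one call over the concatenated bytes gives the same result as feeding the bytes one at a time and passing the intermediate CRC along.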
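Answer 1 (score: 0)
I finally found a faster way. Instead of first converting each number to a binary string, we can use bit operators to extract the high and low bytes directly. This implementation is about three times faster.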
import numpy as np
import binascii
#I have just filled the array with random numbers
#These arrays are loaded from a file
array1=np.random.randint(0,511, size=100000)
array2=np.random.randint(0,511, size=100000)
#...
#This goes on to till say array100
#Now calculate crc of each row in a loop
for j in range(100000):
    crc=0xffff
    #Split the 16 bit number into its high and low bytes with bit operators
    crc=binascii.crc_hqx(chr(array1[j] >> 8), crc)
    crc=binascii.crc_hqx(chr(array1[j] & 255), crc)
    #Similarly for array2
    crc=binascii.crc_hqx(chr(array2[j] >> 8), crc)
    crc=binascii.crc_hqx(chr(array2[j] & 255), crc)
    #...
    #This goes on till array100
A comparison with line profiler shows that this method computes the CRC more than three times faster:
Total time: 2.66351 s
File: speedup1.py
Function: abc at line 4
Line #      Hits         Time  Per Hit   % Time  Line Contents
     4                                           @profile
     5                                           def abc():
     6                                               #I have just filled the array with random numbers
     7                                               #These arrays are loaded from a file
     8         1       1204.0   1204.0      0.0      array1=np.random.randint(0,511, size=100000)
     9         1       1207.0   1207.0      0.0      array2=np.random.randint(0,511, size=100000)
    10                                               #...
    11                                               #This goes on to till say array100
    12                                               #Now calculate crc of each row in a loop
    13    100001      93020.0      0.9      3.5      for j in range(100000):
    14    100000      83277.0      0.8      3.1          crc=0xffff
    15                                                   #Convert the number to binary 16 bit format(This is the old method)
    16    100000    1280059.0     12.8     48.1          temp_bin=np.binary_repr(array1[j], 16)
    17    100000     351190.0      3.5     13.2          crc=binascii.crc_hqx(chr(array1[j] >> 8), crc)
    18    100000     299711.0      3.0     11.3          crc=binascii.crc_hqx(chr(array1[j] & 255), crc)
    19                                                   #Similarly for array2(This is the new method using bit operators)
    20    100000     276893.0      2.8     10.4          crc=binascii.crc_hqx(chr(array2[j] >> 8), crc)
    21    100000     276946.0      2.8     10.4          crc=binascii.crc_hqx(chr(array2[j] & 255), crc)
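The snippets in this thread pass `chr(...)` to `binascii.crc_hqx`, which works on Python 2; on Python 3 the function expects a bytes-like object. Below is a minimal sketch of the same bit-operator idea for Python 3, assuming every value fits in 16 bits. The helper name `crc_row` is hypothetical and not part of the original answer.

```python
import binascii
import numpy as np

def crc_row(values, init=0xFFFF):
    """CRC over each value's high byte, then low byte (hypothetical helper)."""
    crc = init
    for v in values:
        v = int(v)  # convert the NumPy scalar to a plain Python int
        # High byte via shift, low byte via mask, fed to crc_hqx as two bytes.
        crc = binascii.crc_hqx(bytes((v >> 8, v & 0xFF)), crc)
    return crc

# Stand-in data, as in the question; the real arrays are loaded from a file.
array1 = np.random.randint(0, 511, size=1000)
array2 = np.random.randint(0, 511, size=1000)

# One CRC per row j, covering array1[j], array2[j], ... in order.
crcs = [crc_row((array1[j], array2[j])) for j in range(len(array1))]
```

Passing both bytes in one call is equivalent to the two separate single-byte calls above, since each call continues from the CRC value it is given.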
Answer 2 (score: 0)
Use crcmod. It generates efficient code for the CRC you specify.
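As a rough sketch of what that could look like: `binascii.crc_hqx` with an initial value of `0xffff` appears to correspond to the predefined `crc-ccitt-false` algorithm in crcmod (polynomial 0x1021, no bit reversal, no final XOR), so that name is used here as an assumption worth checking against your own data. The wrapper `make_crc16` is hypothetical; it falls back to the standard library when crcmod, a third-party package, is not installed.

```python
import binascii

def make_crc16():
    """Return a function mapping bytes to a CRC16 value (hypothetical helper)."""
    try:
        # Third-party package: pip install crcmod
        import crcmod.predefined
        # Assumed equivalent of binascii.crc_hqx(data, 0xFFFF).
        return crcmod.predefined.mkCrcFun('crc-ccitt-false')
    except ImportError:
        # Standard-library fallback computing the same CRC.
        return lambda data: binascii.crc_hqx(data, 0xFFFF)

crc16 = make_crc16()

# Bytes for the values 300 and 7, each as high byte then low byte.
value = crc16(b"\x01\x2c\x00\x07")
```

Whether this is faster than `binascii.crc_hqx` in your loop depends on whether crcmod's C extension is available, so it is worth profiling both.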