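I am trying to improve my code, which sorts randomly generated numbers into range intervals so that I can analyse the accuracy of a random number generator. At the moment the sorting is done by 20 elif statements (I only have introductory knowledge of Python), so the code takes a long time to run. How can I sort numerical data into intervals more efficiently and store only the frequency of the numbers in each interval?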
from datetime import datetime
startTime = datetime.now()

def test_rand(points):
    import random
    d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20 = 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    # these variables will be used to count frequency of numbers into 20 intervals: (-10,-9], (-9,-8] ... etc
    g1,g2,g3,g4,g5,g6,g7,g8,g9,g10,g11,g12,g13,g14,g15,g16,g17,g18,g19,g20 = 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    # these variables will be used to count frequency of every 20 numbers into 20 intervals: (-200,-180], (-180,-160] ... etc
    y = 0
    n = 0
    for i in range(points):
        x = random.uniform(-10.0,10.0)
        while n < 20:
            y += x
            n += 1
            break
        if n == 20:
            if y < -180:
                g1 += 1
            elif y < -160 and y > -180:
                g2 += 1
            elif y < -140 and y > -160:
                g3 += 1
            elif y < -120 and y > -140:
                g4 += 1
            elif y < -100 and y > -120:
                g5 += 1
            elif y < -80 and y > -100:
                g6 += 1
            elif y < -60 and y > -80:
                g7 += 1
            elif y < -40 and y > -60:
                g8 += 1
            elif y < -20 and y > -40:
                g9 += 1
            elif y < 0 and y > -20:
                g10 += 1
            elif y < 20 and y > 0:
                g11 += 1
            elif y < 40 and y > 20:
                g12 += 1
            elif y < 60 and y > 40:
                g13 += 1
            elif y < 80 and y > 60:
                g14 += 1
            elif y < 100 and y > 80:
                g15 += 1
            elif y < 120 and y > 100:
                g16 += 1
            elif y < 140 and y > 120:
                g17 += 1
            elif y < 160 and y > 140:
                g18 += 1
            elif y < 180 and y > 160:
                g19 += 1
            elif y > 180:
                g20 += 1
            y *= 0
            n *= 0
        if x < -9:
            d1 += 1
        elif x < -8 and x > -9:
            d2 += 1
        elif x < -7 and x > -8:
            d3 += 1
        elif x < -6 and x > -7:
            d4 += 1
        elif x < -5 and x > -6:
            d5 += 1
        elif x < -4 and x > -5:
            d6 += 1
        elif x < -3 and x > -4:
            d7 += 1
        elif x < -2 and x > -3:
            d8 += 1
        elif x < -1 and x > -2:
            d9 += 1
        elif x < 0 and x > -1:
            d10 += 1
        elif x < 1 and x > 0:
            d11 += 1
        elif x < 2 and x > 1:
            d12 += 1
        elif x < 3 and x > 2:
            d13 += 1
        elif x < 4 and x > 3:
            d14 += 1
        elif x < 5 and x > 4:
            d15 += 1
        elif x < 6 and x > 5:
            d16 += 1
        elif x < 7 and x > 6:
            d17 += 1
        elif x < 8 and x > 7:
            d18 += 1
        elif x < 9 and x > 8:
            d19 += 1
        elif x > 9:
            d20 += 1
    return d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20,g1,g2,g3,g4,g5,g6,g7,g8,g9,g10,g11,g12,g13,g14,g15,g16,g17,g18,g19,g20

print(test_rand(100000000))
print(datetime.now() - startTime)
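The code does two things with the random numbers. First, it sorts the numbers into 20 intervals (so each interval should hold 5% of the numbers). Second, it adds up every 20 generated numbers and places the sums into 20 new intervals (a normal curve should be observed).
@tristan I have modified your code to do the above: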
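As a rough sanity check of what those two histograms should look like, the expected figures can be worked out directly. This sketch assumes an ideal uniform generator on [-10, 10] and independent draws; the variable names are illustrative only.

```python
import math

low, high, n_bins, group = -10.0, 10.0, 20, 20

# Each of the 20 equal-width intervals should receive the same share of draws.
expected_share = 1 / n_bins

# Variance of a uniform distribution on [low, high] is (high - low)**2 / 12.
var_single = (high - low) ** 2 / 12

# The sum of `group` independent draws has mean 0 here and `group` times the variance.
std_sum = math.sqrt(group * var_single)

print(expected_share)       # 0.05
print(round(std_sum, 2))    # 25.82
```

With a standard deviation of about 25.8, nearly all sums should fall within roughly ±80 under the normal approximation, so the outermost sum intervals are expected to stay almost empty.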
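# Assumes the imports, points, occ1/occ2 and counter1/counter2 from the answer below.
val_20 = 0  # running sum of the current group of 20 draws
for idx in range(points):
    val_1 = uniform(-10, 10)
    val_20 += val_1
    if (idx + 1) % 20 == 0:
        counter2[bisect(occ2, val_20)] += 1
        counter1[bisect(occ1, val_1)] += 1
        val_20 = 0
        val_1 = 0
    else:
        counter1[bisect(occ1, val_1)] += 1
        val_1 = 0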
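Although this approach only saves 6 seconds (1:54 -> 1:48), it is much better organised and easier to read. Thanks for your help!
Answer 0 (score: 2)
Assuming the data can always be assigned to one of your intervals (you can check this beforehand), bisect.bisect() is an efficient and compact way to do it: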
from bisect import bisect
from random import randint

occ1 = [-9 + 1 * i for i in range(19)]
occ2 = [-180 + 20 * i for i in range(19)]
data = [randint(-10, 10) for _ in range(100)]
counter1, counter2 = {i: 0 for i in range(20)}, {i: 0 for i in range(20)}
for idx, element in enumerate(data):
    if (idx + 1) % 20 == 0:
        counter2[bisect(occ2, element)] += 1
    else:
        counter1[bisect(occ1, element)] += 1
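The bisect() function returns the position at which an element should be inserted into a sorted array such as occ so that the array stays ordered. With 19 values in occ, there are 20 different positions where a value can be inserted: before the first element, between any two elements, or after the last one. These correspond to your 20 intervals. The only thing to keep in mind is that an element smaller than the lower bound or larger than the upper bound of your overall range will still be assigned to the lowest or highest interval. Generating random numbers that respect the interval bounds prevents this.
From your question I am not sure whether you want to accumulate some random numbers or just check a list of points, with a different check performed on every 20th value. The solution can easily be adapted to accumulate random numbers until 20 iterations are reached: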
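A small check makes the edge behaviour concrete. The sketch below reuses occ1 from the snippet above; the helper name `interval_index` is made up for illustration. Note that `bisect` is an alias for `bisect_right`, so a value exactly equal to a boundary lands in the interval above it.

```python
from bisect import bisect

# 19 inner boundaries -9, -8, ..., 9 split [-10, 10] into 20 intervals.
occ1 = [-9 + 1 * i for i in range(19)]

def interval_index(value, boundaries):
    """Return the 0-based interval index of value (hypothetical helper)."""
    return bisect(boundaries, value)

print(interval_index(-9.5, occ1))  # 0: first interval
print(interval_index(0.3, occ1))   # 10: interval between 0 and 1
print(interval_index(9.7, occ1))   # 19: last interval
print(interval_index(-50, occ1))   # 0: out of range, still counted in the first interval
print(interval_index(50, occ1))    # 19: out of range, still counted in the last interval
print(interval_index(-9, occ1))    # 1: exactly on a boundary, goes to the interval above
```

If boundary values should instead fall into the interval below, as in the (-10,-9] convention from the question's comments, `bisect_left` from the same module is the variant to look at.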
from bisect import bisect
from random import uniform

points, value = 100000000, 0
occ1 = [-9 + 1 * i for i in range(19)]
occ2 = [-180 + 20 * i for i in range(19)]
counter1, counter2 = {i: 0 for i in range(20)}, {i: 0 for i in range(20)}
for idx in range(points):
    value += uniform(-10, 10)
    if (idx + 1) % 20 == 0:
        # every 20th iteration: bin the accumulated sum, then reset it
        counter2[bisect(occ2, value)] += 1
        value = 0
    else:
        # note: this bins the running partial sum, not the single draw
        counter1[bisect(occ1, value)] += 1
On my machine, 100M points run in 100 seconds.
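Because both sets of intervals are equally wide, the index can also be computed arithmetically instead of searched for. The following is a minimal sketch of that alternative, not the approach used in the answer above; the names `bin_index` and `tally` and the `seed` parameter are assumptions for illustration. It tallies each individual draw as well as the sum of every 20 draws, as described in the question.

```python
import random

def bin_index(value, low, width, n_bins):
    """Map value to a 0-based bin index for equal-width bins (hypothetical helper).

    Out-of-range values are clamped into the first or last bin.
    """
    idx = int((value - low) // width)
    return min(max(idx, 0), n_bins - 1)

def tally(points, seed=None):
    """Count single draws and sums of 20 draws into 20 bins each."""
    rng = random.Random(seed)
    singles = [0] * 20   # bins of width 1 covering [-10, 10]
    sums = [0] * 20      # bins of width 20 covering [-200, 200]
    running = 0.0
    for idx in range(points):
        x = rng.uniform(-10.0, 10.0)
        singles[bin_index(x, -10.0, 1.0, 20)] += 1
        running += x
        if (idx + 1) % 20 == 0:
            sums[bin_index(running, -200.0, 20.0, 20)] += 1
            running = 0.0
    return singles, sums

singles, sums = tally(200_000, seed=1)
print(singles)
print(sums)
```

Whether this beats bisect in practice depends on the interpreter and has not been timed here. Lists indexed by bin number replace the 40 separate counter variables in either case.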