I am looking for a faster way to draw a single random element from a large Python set. I benchmarked the three obvious approaches below. Is there a faster way?
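import random
import time

test_set = set(["".join(["elem-", str(l)]) for l in range(0, 1000000)])

# Approach 1: convert the set to a list, then random.choice
t0 = time.time()
random_element = random.choice(list(test_set))
print(time.time() - t0)

# Approach 2: random.sample directly on the set
# (works on Python < 3.11; newer versions require a sequence)
t0 = time.time()
random_element = random.sample(test_set, 1)
print(time.time() - t0)

# Approach 3: pick a random index, then index into a list copy
t0 = time.time()
rand_idx = random.randrange(len(test_set))  # upper bound is exclusive
random_element = list(test_set)[rand_idx]
print(time.time() - t0)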
Output:
0.0692291259765625
0.06741929054260254
0.07094502449035645
Answer 0 (score: -1)
You can use numpy and add it to the benchmark.
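import numpy

# Option A: build the element from a random number. This only works
# because every element follows the known "elem-<n>" pattern.
random_num = numpy.random.randint(0, 1000000)  # upper bound is exclusive
element = 'elem-' + str(random_num)

# Option B: convert the set to an array once, for use with numpy.random.choice
test_array = numpy.array(list(test_set))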
Specifically, here is code that benchmarks the different approaches:
random_choice_times = []
random_sample_times = []
random_randrange_times = []
numpy_choice_times = []

for i in range(0, 10):
    t0 = time.time()
    random_element = random.choice(list(test_set))
    time_elps = time.time() - t0
    random_choice_times.append(time_elps)

    # random.sample on a set works on Python < 3.11 only
    t0 = time.time()
    random_element = random.sample(test_set, 1)
    time_elps = time.time() - t0
    random_sample_times.append(time_elps)

    t0 = time.time()
    rand_idx = random.randrange(len(test_set))  # upper bound is exclusive
    random_element = list(test_set)[rand_idx]
    time_elps = time.time() - t0
    random_randrange_times.append(time_elps)

    # numpy.array(test_array) copies the array on every call;
    # that copy is included in the measured time
    t0 = time.time()
    random_num = numpy.random.choice(numpy.array(test_array))
    time_elps = time.time() - t0
    numpy_choice_times.append(time_elps)

print("Avg time for random.choice: ", sum(random_choice_times) / 10)
print("Avg time for random.sample: ", sum(random_sample_times) / 10)
print("Avg time for random.randrange: ", sum(random_randrange_times) / 10)
print("Avg time for numpy.choice: ", sum(numpy_choice_times) / 10)
Here are the timings:
>>> Avg time for random.choice: 0.06497154235839844
>>> Avg time for random.sample: 0.06054067611694336
>>> Avg time for random.randrange: 0.05938301086425781
>>> Avg time for numpy.choice: 0.017636775970458984
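One caveat when reading these numbers: the three random variants rebuild list(test_set) inside the timed section on every iteration, while the numpy variant starts from the prebuilt test_array, so the comparison is largely about how much conversion work each variant repeats. If you draw many times from a set that does not change, it may be enough to convert once and reuse the result. The following is only a rough sketch under that assumption; the name elements and the smaller set size are illustrative choices of mine, not part of the benchmark above.

```python
import random
import timeit

# Smaller than the 1,000,000-element set above so the sketch runs quickly.
test_set = {"elem-" + str(i) for i in range(100_000)}

# One-time O(n) cost: materialize the set as an indexable sequence.
convert_time = timeit.timeit(lambda: tuple(test_set), number=10) / 10
elements = tuple(test_set)

# Per-draw cost once the sequence exists: a single index lookup.
draw_time = timeit.timeit(lambda: random.choice(elements), number=10_000) / 10_000

random_element = random.choice(elements)
print("Avg conversion time:", convert_time)
print("Avg draw time:", draw_time)
```

This only helps while the set stays unchanged. If elements are added or removed between draws, the tuple has to be rebuilt, and you are back to paying the conversion cost.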