Student - np.random.choice: how to isolate and tally frequencies within an np.random.choice range

Asked: 2016-08-01 19:36:37

Tags: python pandas numpy random range

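I am currently learning Python and am new to NumPy and pandas.

I have cobbled together a random generator with a range. It uses NumPy, and I am unable to isolate each individual result so that I can count the iterations that fall within a sub-range of my random range.

Goal: count the iterations where "Random >= 1000", then add 1 to the cell that corresponds to that iteration count. A very basic example: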

#Random generator begins... these are first four random generations
Randomiteration0 = 175994 (Random >= 1000)
Randomiteration1 = 1199 (Random >= 1000)
Randomiteration2 = 873399 (Random >= 1000)
Randomiteration3 = 322 (Random < 1000)

#used to +1 to the fourth row of column A in CSV
finalIterationTally = 4

#total times random < 1000 throughout entire session. Placed in cell B1
hits = 1
#Rinse and repeat to custom set generations quantity...

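(The logic would then be +1 to A4 in the spreadsheet. If the iteration count were 7, it would be +1 to A7, and so on. So basically, I am measuring the distance between each "hit" and how often each distance occurs.)

My current code includes a CSV export section. I no longer need to export every random result. I only need to output the frequency of each iteration distance between hits. This is where I am stumped.

Cheers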

import pandas as pd
import numpy as np

#set random generation quantity
generations=int(input("How many generations?\n###:"))

#random range and generator
choices = range(1, 100000)
samples = np.random.choice(choices, size=generations)

#start a new spreadsheet column every my_break rows
my_break = 1000000
if generations > my_break:
    #pad with NaN so the length is an exact multiple of my_break
    n_empty = -generations % my_break
    samples = np.append(samples, [np.nan] * n_empty).reshape((-1, my_break)).T

#export results to CSV
(pd.DataFrame(samples)
 .to_csv('eval_test.csv', index=False, header=False))

#left uncommented if wanting to test 10 generations or so
print (samples)

1 answer:

Answer 0 (score: 0)

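I believe you are mixing up iterations and generations. It sounds like you want 4 iterations for each of N generations, but your underlying code does not express the "4" anywhere. Pulling all of your variables to the top of the script would help you keep things organized. pandas is great for parsing complicated CSVs, but you don't really need it for this case. You probably don't even need NumPy.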
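To illustrate the last point, here is a minimal standard-library sketch of the same kind of fixed-size tally, with no NumPy or pandas. It is an illustration under assumptions rather than a definitive version: the helper name `tally_generations` and the fixed seed are hypothetical, and it uses the question's `>= 1000` comparison.

```python
import random
from collections import Counter

THRESHOLD = 1000     # a draw counts when it is >= THRESHOLD
CHOICES = 10000      # draws come from 1 .. CHOICES - 1
ITERATIONS = 4       # draws per generation
GENERATIONS = 100    # number of generations


def tally_generations(rng):
    """Count, per generation, how many draws reached THRESHOLD.

    Hypothetical helper: returns a list where index k holds the number
    of generations with exactly k draws >= THRESHOLD.
    """
    counts = Counter()
    for _ in range(GENERATIONS):
        draws = [rng.randrange(1, CHOICES) for _ in range(ITERATIONS)]
        counts[sum(d >= THRESHOLD for d in draws)] += 1
    return [counts[k] for k in range(ITERATIONS + 1)]


# The seed is fixed only so that repeated runs give the same tally.
tally = tally_generations(random.Random(0))
print(",".join(map(str, tally)))
```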

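import numpy as np

# All tunable parameters live at the top of the script.
THRESHOLD = 1000    # a draw counts when it is >= THRESHOLD
CHOICES = 10000     # draws come from 1 .. CHOICES - 1
ITERATIONS = 4      # draws per generation
GENERATIONS = 100   # number of generations to simulate

choices = range(1, CHOICES)

# output[k] = number of generations in which exactly k draws were >= THRESHOLD
output = np.zeros(ITERATIONS + 1, dtype=int)

for _ in range(GENERATIONS):
  samples = np.random.choice(choices, size=ITERATIONS)
  count = sum(1 for x in samples if x >= THRESHOLD)
  output[count] += 1

# Write the tallies as a single comma-separated row.
with open('eval_test.csv', 'w') as f:
  f.write(",".join(map(str, output.tolist())) + '\n')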
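The code above tallies fixed batches of draws. For the distances between successive hits that the question describes, one possible approach is sketched below. It rests on a few assumptions: a hit is a draw below 1000 (as in the question's example), distances are counted in draws, and the first distance is measured from the start of the run. The name `gap_frequencies` is illustrative only and does not come from the original post.

```python
import numpy as np


def gap_frequencies(samples, threshold=1000):
    """Tally how many draws separate consecutive hits.

    Hypothetical helper. Returns (freq, hits), where freq[d] is the number
    of times a hit arrived d draws after the previous hit (or after the
    start of the run, for the first hit) and hits is the total hit count.
    """
    samples = np.asarray(samples)
    # 1-based positions of every draw below the threshold.
    hit_positions = np.flatnonzero(samples < threshold) + 1
    if hit_positions.size == 0:
        return np.zeros(1, dtype=int), 0
    # Prepending 0 makes the first gap count from the start of the run.
    gaps = np.diff(hit_positions, prepend=0)
    # bincount turns the list of gaps into a frequency table indexed by gap.
    return np.bincount(gaps), int(hit_positions.size)


# The four draws from the question: the only hit is the fourth draw.
freq, hits = gap_frequencies([175994, 1199, 873399, 322])
print(freq.tolist(), hits)  # [0, 0, 0, 0, 1] 1
```

Here `freq[4] == 1` corresponds to the "+1 to A4" step in the question, and `hits` is the value meant for cell B1. Writing `freq[1:]` one value per row, for example with `np.savetxt("eval_test.csv", freq[1:], fmt="%d")`, should place the count for distance d in row d of column A.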