Generating weighted repeated lists

Time: 2018-04-03 11:17:48

Tags: python list numpy duplicates

I have several lists like the following:

l1=['InitialRequest','Approved','WorkStarted','OnHold','InProgress','OnHold','InProgress','Completed']

l2=['InitialRequest','Approved','WorkStarted','OnHold','OnHold','OnHold','OnHold','Cancelled']

l3=['InitialRequest','Approved','WorkStarted','InProgress','InProgress','InProgress','InProgress','Completed']

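...and so on up to l7. I need to preserve the sequence given within each list and generate 15000 lists drawn from these with given weights. So I created a list of these lists: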

Status=[l1,l2,l3,l4,l5,l6,l7] 

I tried this:

Status_b = list(np.random.choice(Status, 15000, replace=True, p=[0.1,0.02,0.5,0.08,0.03,0.07,0.1,0.1]))

But I get the following error:

Traceback (most recent call last):

 File "<ipython-input-168-d6488b73dd38>", line 1, in <module>
   Status_b = list(np.random.choice(Status, 15000, replace=True, p=[0.1,0.02,0.5,0.08,0.03,0.07,0.1,0.1]))

 File "mtrand.pyx", line 1117, in mtrand.RandomState.choice

ValueError: a must be 1-dimensional

Can someone give me a solution?

1 answer:

Answer 0: (score: 3)

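np.random.choice only accepts a 1-dimensional array (or an int), and a list of equal-length lists is converted to a 2-D array, hence the error. Instead of having it choose from the lists themselves, have it choose indices and then look the lists up: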

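import numpy as np

# p needs one weight per list in Status and must sum to 1; this list has 8 entries for 7 lists, so adjust it to match
weights = [0.1,0.02,0.5,0.08,0.03,0.07,0.1,0.1]
status_idx = np.random.choice(len(Status), 15000, replace=True, p=weights)
status_b = [Status[i] for i in status_idx]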
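
For anyone who wants to try the index approach end to end, here is a minimal self-contained sketch. It uses only the three lists shown in the question (l4 to l7 are not given), and the weights `[0.2, 0.3, 0.5]` and the seed are placeholders I made up for illustration, not values from the question. It also prints the shape that NumPy sees for `Status`, which is what triggers the "a must be 1-dimensional" error.

```python
import numpy as np

l1 = ['InitialRequest', 'Approved', 'WorkStarted', 'OnHold',
      'InProgress', 'OnHold', 'InProgress', 'Completed']
l2 = ['InitialRequest', 'Approved', 'WorkStarted', 'OnHold',
      'OnHold', 'OnHold', 'OnHold', 'Cancelled']
l3 = ['InitialRequest', 'Approved', 'WorkStarted', 'InProgress',
      'InProgress', 'InProgress', 'InProgress', 'Completed']
Status = [l1, l2, l3]  # only the lists shown in the question

# Equal-length lists become a 2-D array, so choice(Status, ...) is rejected.
print(np.asarray(Status).shape)  # (3, 8)

# Hypothetical weights: one per list in Status, summing to 1.
weights = [0.2, 0.3, 0.5]
assert len(weights) == len(Status)

# Seeded generator only so that repeated runs give the same draw.
rng = np.random.RandomState(0)

# Draw 15000 indices into Status, then map each index back to its list.
status_idx = rng.choice(len(Status), 15000, replace=True, p=weights)
status_b = [Status[i] for i in status_idx]

print(len(status_b))  # 15000
```

Each element of `status_b` is one of the original lists with its internal order untouched, so the sequence within every list is preserved. Note that the elements are references to the same list objects; copy them (for example with `list(Status[i])`) if you plan to modify the results independently.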