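I have a numpy array (actually imported from a GIS raster map) that contains probability values for the occurrence of a species, as in the following example: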
import numpy as np
a = np.random.randint(1, 20, 1200).reshape(40, 30)
b = (a * 1.0) / a.sum()  # normalise so that all 1200 cells sum to 1
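Now I would like to get a discrete version of that array again. For example, if I have 100 individuals located in the area covered by that array (1200 cells), how are they distributed? Of course they should be distributed according to the probabilities, meaning that lower values indicate a lower probability of occurrence. However, since everything is statistics, there is still a chance that an individual ends up in a low-probability cell. It should also be possible for several individuals to occupy the same cell...
It is like transforming a continuous distribution curve back into a histogram. Just as many different histograms can result in the same distribution curve, it should also work the other way round. Accordingly, applying the algorithm I am looking for will produce different discrete values each time.
... is there any algorithm in Python that can do this? As I am not familiar with discretization, maybe someone can help.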
Answer 0: (score: 3)
Use random.choice together with bincount:
np.bincount(np.random.choice(b.size, 100, p=b.flat),
minlength=b.size).reshape(b.shape)
If you do not have NumPy 1.7 (the release that added random.choice), you can replace random.choice with:
np.searchsorted(np.cumsum(b), np.random.random(100))
which gives:
np.bincount(np.searchsorted(np.cumsum(b), np.random.random(100)),
minlength=b.size).reshape(b.shape)
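For reference, here is a minimal self-contained sketch of the two variants above. It assumes a reasonably recent NumPy with np.random.default_rng; the helper names scatter_choice and scatter_searchsorted, the seed, and the renormalisation of the cumulative sum are illustrative additions, not part of the original answer.

```python
import numpy as np

def scatter_choice(b, n, rng):
    """Draw n cell indices with probabilities b and count the hits per cell."""
    idx = rng.choice(b.size, size=n, p=b.ravel())
    return np.bincount(idx, minlength=b.size).reshape(b.shape)

def scatter_searchsorted(b, n, rng):
    """Same idea via inverse-CDF sampling, without random.choice."""
    cdf = np.cumsum(b.ravel())
    cdf /= cdf[-1]  # added guard: rounding must not push an index to b.size
    idx = np.searchsorted(cdf, rng.random(n))
    return np.bincount(idx, minlength=b.size).reshape(b.shape)

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
a = rng.integers(1, 20, size=1200).reshape(40, 30)
b = a / a.sum()                 # probabilities summing to 1

counts_choice = scatter_choice(b, 100, rng)
counts_cdf = scatter_searchsorted(b, 100, rng)
print(counts_choice.shape, counts_choice.sum())
print(counts_cdf.shape, counts_cdf.sum())
```

Both results keep the 40x30 shape of b and always sum to the 100 individuals; which cells are occupied changes from run to run unless the seed is fixed.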
Answer 1: (score: 2)
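So far, I think ecatmur's answer seems very reasonable and simple.
I just want to add a more "applied" example. Consider a die with 6 faces (6 numbers). Each number/outcome has a probability of 1/6. Represented as an array, the die could look like this: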
b = np.array([[1,1,1],[1,1,1]])/6.0
Rolling the die 100 times (n = 100) then leads to the following simulation:
np.bincount(np.searchsorted(np.cumsum(b), np.random.random(n)),minlength=b.size).reshape(b.shape)
I think this could be a suitable approach for such an application. Thanks, ecatmur, for your help!
/Johannes
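As a rough check of the die example, the sketch below rolls the die 100 times and then many more times to see the frequencies settle near 1/6. The function name roll, the seed, and the number of rolls in the second call are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
b = np.array([[1, 1, 1], [1, 1, 1]]) / 6.0  # six faces, each with p = 1/6

def roll(n):
    """Count how often each face comes up in n rolls."""
    cdf = np.cumsum(b)  # cumsum without an axis works on the flattened array
    cdf /= cdf[-1]      # added guard against floating-point rounding
    faces = np.searchsorted(cdf, rng.random(n))
    return np.bincount(faces, minlength=b.size).reshape(b.shape)

few = roll(100)
many = roll(600000)
print(few, few.sum())     # six counts that add up to 100
print(many / many.sum())  # each entry should be close to 1/6
```

With only 100 rolls the counts per face vary noticeably between runs, which is exactly the randomness the question asks for; with many rolls the relative frequencies approach the probabilities in b.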
Answer 2: (score: 1)
This is similar to a question I asked earlier this month.
import random
def RandFloats(Size):
    # Return a list of Size random floats, normalised so that they sum to Scalar
    Scalar = 1.0
    VectorSize = Size
    RandomVector = [random.random() for i in range(VectorSize)]
    RandomVectorSum = sum(RandomVector)
    RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
    return RandomVector
from numpy.random import multinomial
import math
def RandIntVec(ListSize, ListSumValue, Distribution='Normal'):
    """
    Inputs:
    ListSize = the size of the list to return
    ListSumValue = the sum of the list values
    Distribution = can be 'uniform' for a uniform distribution, 'normal' for a normal distribution ~ N(0,1) with +/- 3 sigma (default), or a list of size 'ListSize' or 'ListSize - 1' for an empirical (arbitrary) distribution. Probabilities of each of the p different outcomes. These should sum to 1 (however, the last element is always assumed to account for the remaining probability, as long as sum(pvals[:-1]) <= 1).
    Output:
    A list of random integers of length 'ListSize' whose sum is 'ListSumValue'.
    """
    if isinstance(Distribution, list):
        DistributionSize = len(Distribution)
        if ListSize == DistributionSize or (ListSize - 1) == DistributionSize:
            Values = multinomial(ListSumValue, Distribution, size=1)
            OutputValue = Values[0]
        else:
            raise ValueError('Cannot create desired vector')
    elif Distribution.lower() == 'uniform':  # I do not recommend this! I see that it is not as random (at least on my computer) as I had hoped
        UniformDistro = [1.0 / ListSize for i in range(ListSize)]  # 1.0 avoids integer division on Python 2
        Values = multinomial(ListSumValue, UniformDistro, size=1)
        OutputValue = Values[0]
    elif Distribution.lower() == 'normal':
        # Normal distribution construction... It's very flexible and hideous.
        # Assume a +-3 sigma range. Warning: this may or may not be a suitable range for your implementation!
        # To explore a different range, change the LowSigma and HighSigma values.
        LowSigma = -3  # -3 sigma
        HighSigma = 3  # +3 sigma
        StepSize = 1 / (float(ListSize) - 1)
        ZValues = [(LowSigma * (1 - i*StepSize) + (i*StepSize) * HighSigma) for i in range(int(ListSize))]
        # Construction parameters for N(Mean, Variance); default is N(0,1)
        Mean = 0
        Var = 1
        #NormalDistro= [self.NormalDistributionFunction(Mean, Var, x) for x in ZValues]
        NormalDistro = list()
        for i in range(len(ZValues)):
            if i == 0:
                # Probability mass below the lowest z value
                ERFCVAL = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                NormalDistro.append(ERFCVAL)
            elif i == len(ZValues) - 1:
                # Last bin mirrors the first one
                ERFCVAL = NormalDistro[0]
                NormalDistro.append(ERFCVAL)
            else:
                # Probability mass between two consecutive z values
                ERFCVAL1 = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                ERFCVAL2 = 0.5 * math.erfc(-ZValues[i-1]/math.sqrt(2))
                ERFCVAL = ERFCVAL1 - ERFCVAL2
                NormalDistro.append(ERFCVAL)
        #print "Normal Distribution sum = %f"%sum(NormalDistro)
        Values = multinomial(ListSumValue, NormalDistro, size=1)
        OutputValue = Values[0]
    else:
        raise ValueError('Cannot create desired vector')
    return OutputValue
ProbabilityDistribution = RandFloats(1200)  # This is your probability distribution for your 1200-cell array
SizeDistribution = RandIntVec(1200, 100, Distribution=ProbabilityDistribution)  # for a 1200-cell array whose sum is 100, with the given probability distribution
The two important lines are the last two lines of the code above.
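Because the list branch of the function above boils down to a single multinomial draw, the same result can be sketched directly on the 40x30 probability array from the question. This is a minimal illustration assuming np.random.default_rng is available; the seed and the variable name counts are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed, for reproducibility only
a = rng.integers(1, 20, size=1200).reshape(40, 30)
b = a / a.sum()                 # probabilities summing to 1

# One multinomial draw: 100 individuals spread over the 1200 cells
counts = rng.multinomial(100, b.ravel()).reshape(b.shape)
print(counts.shape, counts.sum())
```

The result has one integer per cell, keeps the shape of b, and always sums to 100.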