Question

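My variable x has 2700 points. This is my original data.

The histogram of my data is shown below. The cyan line is the distribution that my data follows. I used curve_fit on my histogram and obtained the fitted curve. The fitted curve is a numpy array of 100000 points.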

(image: histogram of the data with the fitted curve in cyan)

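I would like to generate smooth random data, say 100000 points, that follows the distribution of the original data. That is, in principle I want 100000 points below the fitted curve, starting from 0.0 and increasing in the same way as the curve up to 0.5.

So far, this is what I have tried in order to get 100000 points below the curve:

I generated uniform random numbers using np.random.uniform(0,0.5,100000)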

random_x = []

u = np.random.uniform(0,0.5,100000)

for i in u:
    if i<=y_ran:  # here y_ran is the numpy array of the fitted curve
        random_x.append(i)

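But I get the error `ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()`

I know the code above is incorrect, but how should I proceed?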

Answer 1

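OK, so y_ran is a list of values that defines the curve. If I understand correctly, you want a random data set that falls below the curve. One way to do this is to start from the curve points and reduce each one by some amount; for example, you could make each new point fall somewhere in the range of 80%-100% of the original value.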

import numpy as np

# One random scale factor in [0.8, 1.0) per curve point
variation = np.random.uniform(low=.8, high=1.0, size=len(y_ran))
newData = y_ran * variation

Does this get you started?
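For reference, the loop in the question fails because `i <= y_ran` compares a single number with the whole array, which has no single truth value. If the goal is x values whose histogram follows the curve, one option that keeps the question's idea of accepting uniform draws below the curve is rejection sampling. The following is only a minimal sketch: `x_grid`, the Gaussian-shaped stand-in for `y_ran`, and the helper `rejection_sample` are assumptions made for illustration, not part of the question, and the curve is assumed to be non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the question's data: x_grid holds the x positions
# of the fitted curve and y_ran its heights (an assumed Gaussian-shaped bump).
x_grid = np.linspace(0.0, 0.5, 100_000)
y_ran = np.exp(-(x_grid - 0.2) ** 2 / (2 * 0.05 ** 2))

def rejection_sample(x_grid, y_curve, n, rng):
    """Draw n x values with density proportional to y_curve (hypothetical helper)."""
    out = np.empty(0)
    y_max = y_curve.max()
    while out.size < n:
        # Propose points uniformly inside the bounding box of the curve.
        x_try = rng.uniform(x_grid[0], x_grid[-1], size=n)
        y_try = rng.uniform(0.0, y_max, size=n)
        # Keep a proposal only if it lies below the curve at its own x.
        keep = y_try <= np.interp(x_try, x_grid, y_curve)
        out = np.concatenate([out, x_try[keep]])
    return out[:n]

random_x = rejection_sample(x_grid, y_ran, 100_000, rng)
```

A histogram of `random_x` should then trace the shape of the curve between 0.0 and 0.5. The comparison is done element by element on arrays of equal length, which avoids the ambiguous truth value.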

Answer 2

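I would approach the problem as follows: first, fit your y_ran curve to a Gaussian (see e.g. this question), then draw samples from the normal distribution with the fitted coefficients using the np.random.normal function. Something along these lines should work (partly taken from the answer to the question I referred to):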

import numpy
from scipy.optimize import curve_fit    

# Define model function to be used to fit to the data above:
def gauss(x, *p):
    A, mu, sigma = p
    return A*numpy.exp(-(x-mu)**2/(2.*sigma**2))

# p0 is the initial guess for the fitting coefficients (A, mu and sigma above)
p0 = [1., 0., 1.]

coeff, var_matrix = curve_fit(gauss, x, y_ran, p0=p0)

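# numpy.random.normal takes (loc, scale, size); the amplitude A is not a sampler parameter.
# sigma enters the model squared, so the fit may return it with a negative sign.
A, mu, sigma = coeff
sample = numpy.random.normal(mu, abs(sigma), 100000)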

Notes: 1. This is untested. 2. You will need the x values of your fitted curve.
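If the fitted curve is not close to a Gaussian, another option (not part of the approach above) is inverse transform sampling directly from the tabulated curve: integrate the curve to get a cumulative distribution, then push uniform random numbers through its inverse. A minimal sketch, assuming the curve is non-negative; `x_grid` and the Gaussian-shaped stand-in for `y_ran` are invented here so that the example runs on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: x values of the fitted curve and the curve itself.
x_grid = np.linspace(0.0, 0.5, 100_000)
y_ran = np.exp(-(x_grid - 0.2) ** 2 / (2 * 0.05 ** 2))

# Cumulative area under the curve (trapezoid rule), starting at 0.
steps = 0.5 * (y_ran[1:] + y_ran[:-1]) * np.diff(x_grid)
area = np.concatenate([[0.0], np.cumsum(steps)])

# Normalise so the cumulative distribution ends at 1.
cdf = area / area[-1]

# Map uniform draws in [0, 1) through the inverse CDF by interpolation.
u = rng.uniform(0.0, 1.0, size=100_000)
sample = np.interp(u, cdf, x_grid)
```

Every uniform draw yields one sample, so nothing is discarded, and the samples stay within the x range of the curve by construction.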

Creating data that follows a specific data distribution
