Question

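How can I efficiently (with short R code) fill a curve with points, so that the points fill up the area under my curve?

I have tried a few things without success. Here is my R code: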

data = rnorm(1000)     ## random data points to fill the curve

curve(dnorm(x), -4, 4) ## curve to be filled by "data" above

points(data)           ## plotting the points to fill the curve

Answer 1

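Here is a method that uses interpolation to make sure the plotted points do not exceed the height of the curve (although, if you want the point markers themselves not to stick out above the curve, you will need to set the threshold slightly below the height of the curve):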

# Curve to be filled
c.pts = as.data.frame(curve(dnorm(x), -4, 4)) 

# Generate 1000 random points in the same x-interval and with y value between
# zero and the maximum y-value of the curve
set.seed(2)
pts = data.frame(x=runif(1000,-4,4), y=runif(1000,0,max(c.pts$y)))

# Using interpolation, keep only those points whose y-value is less than y(x)
pts = pts[pts$y < approx(c.pts$x,c.pts$y,xout=pts$x)$y, ]

# Plot the points
points(pts, pch=16, col="red", cex=0.7)
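
The caveat about marker size can be handled by lowering the acceptance threshold. What follows is only a minimal sketch: `shrink` is a hypothetical tuning value (0.95 here, chosen for illustration and not part of the original answer), and the curve is evaluated on a plain grid rather than taken from the return value of `curve()`.

```r
set.seed(2)

# Curve evaluated on a grid (same columns as the data frame built from curve())
grid <- data.frame(x = seq(-4, 4, length.out = 101))
grid$y <- dnorm(grid$x)

# Candidate points spread uniformly over the bounding rectangle
cand <- data.frame(x = runif(1000, -4, 4),
                   y = runif(1000, 0, max(grid$y)))

# shrink is a hypothetical tuning value: keep points below 95% of the curve height
shrink <- 0.95
height <- approx(grid$x, grid$y, xout = cand$x)$y
kept <- cand[cand$y < shrink * height, ]

# Draw the curve and the retained points when running interactively
if (interactive()) {
  curve(dnorm(x), -4, 4)
  points(kept, pch = 16, col = "red", cex = 0.7)
}
```

A multiplicative factor mostly helps near the peak. In the tails, where the curve is close to zero, a small absolute margin may work better.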

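A method for plotting exactly the desired number of points under the curve

In response to @d.b's comment, here is a way to plot exactly the desired number of points under the curve:

First, let's figure out how many random points we need to generate over the whole plot region in order to get (roughly) the target number of points under the curve. We do this as follows:

Calculate the area under the curve as a fraction of the area of the rectangle bounded by zero and the maximum height of the curve on the vertical axis, and by the width of the curve on the horizontal axis.

The number of random points we need to generate is the target number of points divided by the area ratio calculated above.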

# Area ratio
aa = sum(c.pts$y*median(diff(c.pts$x)))/(diff(c(-4,4))*max(c.pts$y))

# Target number of points under curve
n.target = 1000

# Number of random points to generate
n = ceiling(n.target/aa)
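
As a sanity check on the Riemann-sum estimate above, the same ratio can be computed for this particular curve with `integrate()`. This is a sketch for `dnorm` on [-4, 4] only. The names `area`, `rect.area`, `aa.check` and `n.check` are introduced here for illustration.

```r
# Area under dnorm on [-4, 4] by numerical integration
area <- integrate(dnorm, lower = -4, upper = 4)$value

# Bounding rectangle: width of the x-interval times the peak height dnorm(0)
rect.area <- (4 - (-4)) * dnorm(0)

# Fraction of the rectangle that lies under the curve (roughly 0.31)
aa.check <- area / rect.area

# Random points needed to expect 1000 of them under the curve
n.check <- ceiling(1000 / aa.check)

print(c(area.ratio = aa.check, n = n.check))
```

So a little under a third of uniformly scattered points are expected to land under the curve, which is why a bit more than three times the target number has to be generated.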

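But we need to generate more points than this to make sure we get at least n.target, because once we restrict the plotted points to those below the curve, random variation will leave us with fewer than n.target points about half the time. So, to generate more points under the curve than we need, we add an excess.factor, and then randomly select n.target of those points to plot. Here is a function that takes care of the whole process for a general curve.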

# Plot a specified number of points under a curve
pts.under.curve = function(data, n.target=1000, excess.factor=1.5) {

  # Area under curve as fraction of area of plot region
  aa = sum(data$y*median(diff(data$x)))/(diff(range(data$x))*max(data$y))

  # Number of random points to generate
  n = ceiling(excess.factor*n.target/aa)

  # Generate n random points in x-range of the data and with y value between
  # zero and the maximum y-value of the curve
  pts = data.frame(x=runif(n,min(data$x),max(data$x)), y=runif(n,0,max(data$y)))

  # Using interpolation, keep only those points whose y-value is less than y(x)
  pts = pts[pts$y < approx(data$x,data$y,xout=pts$x)$y, ]

  # Randomly select only n.target points
  pts = pts[sample(1:nrow(pts), n.target), ]

  # Plot the points
  points(pts, pch=16, col="red", cex=0.7)

}
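
To see why the excess factor matters, the number of accepted points can be treated as a binomial count: each uniform point lands under the curve with probability equal to the area ratio. The sketch below assumes the `dnorm` curve on [-4, 4] and uses `integrate()` for the ratio. The names `p.short.plain` and `p.short.excess` are introduced here for illustration.

```r
n.target <- 1000

# Acceptance probability: area under dnorm on [-4, 4] over the bounding rectangle
aa <- integrate(dnorm, -4, 4)$value / (8 * dnorm(0))

# Without an excess factor: chance of ending up with fewer than n.target points
n.plain <- ceiling(n.target / aa)
p.short.plain <- pbinom(n.target - 1, size = n.plain, prob = aa)

# With the default excess factor of 1.5
n.excess <- ceiling(1.5 * n.target / aa)
p.short.excess <- pbinom(n.target - 1, size = n.excess, prob = aa)

print(c(plain = p.short.plain, excess = p.short.excess))
```

Without the excess factor the shortfall probability is close to one half. With a factor of 1.5 it is negligible, so the `sample()` step in the function should practically always have enough points to draw from.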

Let's run the function on the original curve:

c.pts = as.data.frame(curve(dnorm(x), -4, 4)) 

pts.under.curve(c.pts)

Now let's test it with a different distribution:

# Curve to be filled
c.pts = as.data.frame(curve(df(x, df1=100, df2=20),0,5,n=1001)) 

pts.under.curve(c.pts, n.target=200)

Answer 2

n_points = 10000 #A large number

#Store curve in a variable and plot
cc = curve(dnorm(x), -4, 4, n = n_points)

#Generate n_points candidate points (x on a regular grid, y random)
p = data.frame(x = seq(-4,4,length.out = n_points), y = rnorm(n = n_points))
#OR p = data.frame(x = runif(n_points,-4,4), y = rnorm(n = n_points))

#Find out the index of values in cc$x closest to p$x
p$ind = findInterval(p$x, cc$x)

#Only retain those points within the curve whose p$y are smaller than cc$y
p2 = p[p$y >= 0 & p$y < cc$y[p$ind],] #may need p[p$y < 0.90 * cc$y[p$ind],] or something

#Plot points
points(p2$x, p2$y)
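
The lookup step relies on `findInterval()`, which returns, for each query value, the index of the grid point at or just to its left (not necessarily the nearest one). A small sketch with a coarse, made-up grid standing in for `cc$x`:

```r
# Coarse grid standing in for cc$x
grid.x <- seq(-4, 4, by = 1)

# Query positions standing in for p$x
query <- c(-3.9, 0, 3.95)

# For each query value, index i such that grid.x[i] <= value < grid.x[i + 1]
idx <- findInterval(query, grid.x)

print(idx)          # 1 5 8
print(grid.x[idx])  # -4 0 3
```

With the 10000-point grid used above, neighbouring grid points are so close together that the difference between "left neighbour" and "nearest" is negligible.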

Fill a curve with points that fit under the curve in an R plot
