Question

我已经探讨了有关此主题的类似问题，但我在直方图上产生一个很好的曲线时遇到了一些麻烦。我知道有些人可能会认为这是重复的，但我目前还没有发现任何有助于解决问题的方法。

虽然这里的数据不可见，但这里有一些我正在使用的变量，因此您可以在下面的代码中看到它们代表的内容。

Differences <- subset(Score_Differences, select = Difference, drop = T)
m = mean(Differences)
std = sqrt(var(Differences))

这是我生成的第一条曲线（代码似乎最常见且易于生成，但曲线本身并不适合）。

hist(Differences, density = 15, breaks = 15, probability = TRUE, xlab = "Score Differences", ylim = c(0,.1), main = "Normal Curve for Score Differences")
curve(dnorm(x,m,std),col = "Red", lwd = 2, add = TRUE)

我真的很喜欢这个，但不喜欢曲线进入负面区域。

hist(Differences, probability = TRUE)
lines(density(Differences), col = "Red", lwd = 2)
lines(density(Differences, adjust = 2), lwd = 2, col = "Blue")

这与第一个直方图相同，但有频率。仍然看起来不太好。

h = hist(Differences, density = 15, breaks = 15, xlab = "Score Differences", main = "Normal Curve for Score Differences")
xfit = seq(min(Differences),max(Differences))
yfit = dnorm(xfit,m,std)
yfit = yfit*diff(h$mids[1:2])*length(Differences)
lines(xfit, yfit, col = "Red", lwd = 2)

另一次尝试但没有运气。也许是因为我使用的是qnorm，当数据显然不正常时。曲线再次进入负方向。

sample_x = seq(qnorm(.001, m, std), qnorm(.999, m, std), length.out = l)
binwidth = 3
breaks = seq(floor(min(Differences)), ceiling(max(Differences)), binwidth)
hist(Differences, breaks)
lines(sample_x, l*dnorm(sample_x, m, std)*binwidth, col = "Red")

视觉上看起来不错的唯一曲线是第2条曲线，但曲线则是负向。

我的问题是＆＃34;是否有＆＃34;标准方式＆＃34;在直方图上放置曲线？＆＃34; 这些数据肯定不正常。我在这里介绍的3个程序来自类似的帖子，但我显然遇到了一些麻烦。我觉得所有拟合曲线的方法都取决于您正在使用的数据。

使用解决方案更新

感谢李哲元等人！我将把它留给我自己参考，并希望其他人也可以。

hist(Differences, probability = TRUE)
lines(density(Differences, cut = 0), col = "Red", lwd = 2)
lines(density(Differences, adjust = 2, cut = 0), lwd = 2, col = "Blue")

Answer 1

好的，所以你只是在努力解决density超越＆＃34;自然范围＆＃34;的事实。好吧，只需设置cut = 0即可。您可能想要阅读plot.density extends “xlim” beyond the range of my data. Why and how to fix it?了解原因。在那个答案中，我使用的是from和to。但现在我正在使用cut。

## consider a mixture, that does not follow any parametric distribution family
## note, by construction, this is a strictly positive random variable
set.seed(0)
x <- rbeta(1000, 3, 5) + rexp(1000, 0.5)

## (kernel) density estimation offers a flexible nonparametric approach
d <- density(x, cut = 0)

## you can plot histogram and density on the density scale
hist(x, prob = TRUE, breaks = 50)
lines(d, col = 2)

注意，cut = 0，密度估算严格按range(x)进行。超出此范围，密度为0.

如何将曲线拟合到直方图

1 个答案: