I have a dataset that resembles a lot of economic data: log-normal, but with a long tail... or at least that is how it looks in a Q-Q plot.
Temple.df <- read.delim("http://history.emory.edu/RAVINA/Stackoverflow/Temple.txt", header = TRUE, sep = "\t")
##generate theoretical quartiles
vec <- log10(Temple.df$total_land)
y <- quantile(vec[!is.na(vec)], c(0.25, 0.75))
x <- qnorm(c(0.25, 0.75))
slope <- diff(y)/diff(x)
int <- y[1L] - slope * x[1L]
tags <- c(-1,0,1,2,3,4)
library(ggplot2)
ggplot(data = Temple.df, aes(sample = vec)) +
  geom_qq(geom = "point", distribution = stats::qnorm) +
  geom_abline(intercept = int, slope = slope) +
  scale_y_continuous(breaks = tags, labels = 10^tags)
But when I try to overlay a density curve on the histogram, the curve appears oddly shifted to the left.
ggplot(Temple.df, aes(total_land)) +
  geom_histogram(color = "white", aes(y = ..density..)) +
  scale_x_continuous(trans = "log") +
  stat_function(fun = dlnorm,
                args = list(meanlog = mean(log(Temple.df$total_land)),
                            sdlog = sd(log(Temple.df$total_land))))
Is there an error in my code? Why doesn't the density curve fit the histogram? What obvious Stats 101 mistake am I making?
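One possible explanation, sketched here with simulated data rather than the Temple data: with a log-transformed x axis, the histogram's density is presumably computed per unit of log(x), whereas `dlnorm` returns density per unit of x. The two differ by a factor of x, which would pull the peak of the `dlnorm` curve to the left. The values below (`meanlog = 2`, `sdlog = 1`, the seed, and the names `x`, `grid`, `per_x`, `per_logx`) are hypothetical stand-ins, not taken from the original data.

```r
# Simulated stand-in for total_land (hypothetical values, not the Temple data)
set.seed(1)
x <- rlnorm(1000, meanlog = 2, sdlog = 1)
mu <- mean(log(x))
sigma <- sd(log(x))

# Evaluation grid, evenly spaced on the log scale and centred on mu
grid <- exp(seq(mu - 3 * sigma, mu + 3 * sigma, length.out = 201))

# Density per unit of x (what dlnorm returns)
per_x <- dlnorm(grid, meanlog = mu, sdlog = sigma)

# Density per unit of log(x) (what a histogram of log-transformed data estimates)
per_logx <- dnorm(log(grid), mean = mu, sd = sigma)

# Change of variables: dlnorm(x) * x equals dnorm(log(x))
print(all.equal(per_x * grid, per_logx))

# Location of each curve's peak on the original scale
peak_per_x <- grid[which.max(per_x)]        # close to exp(mu - sigma^2)
peak_per_logx <- grid[which.max(per_logx)]  # exp(mu)
print(c(peak_per_x = peak_per_x, peak_per_logx = peak_per_logx))
```

If this is what is happening in the plot above, passing `function(v) dnorm(log(v), mu, sigma)` to `stat_function` in place of `dlnorm` may line the curve up with the histogram, though that has not been checked against the actual data here.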