I am trying to understand why qqline() does not give me what I expect if (and only if) log="xy":
th <- 0.5
n <- 1e3
set.seed(271)
x <- rgamma(n, shape=th)
qF <- function(p) qgamma(p, shape=th)
## 1) Q-Q plot in normal space
plot(qF(ppoints(n)), sort(x))
qqline(y=sort(x), distribution=qF) # fine
## 2) Q-Q plot in log10-x space
plot(qF(ppoints(n)), sort(x), log="x")
qqline(y=sort(x), distribution=qF, untf=TRUE) # with untf=TRUE, fine
## 3) Q-Q plot in log10-y space
plot(qF(ppoints(n)), sort(x), log="y")
qqline(y=sort(x), distribution=qF, untf=TRUE) # with untf=TRUE, fine
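Cases 1) to 3) give what I expect. For log scales, ?abline documents the untf argument: if untf=TRUE and one or both axes are log-transformed, a curve is drawn corresponding to a line in the original coordinates.
## 4) Q-Q plot in log10-log10 space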
plot(qF(ppoints(n)), sort(x), log="xy")
qqline(y=sort(x), distribution=qF, untf=TRUE) # ... but this is wrong
qqline(y=sort(x), distribution=qF, col="red") # ... (still) wrong
qqline(y=sort(log10(x)), distribution=function(p) log10(qF(p)), col="blue") # seems to be correct (but draws a line, not the curve resulting from transforming the Q-Q line in 'non-log-log' space)
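Based on the documentation of untf, 4) should also work (in my opinion).
My questions are: 1) Why does untf=TRUE not give the right answer for log="xy", although it works for log="x" and log="y"? 2) More generally, what is the "right" Q-Q line in log-log space (anyway)?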
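To make question 2) concrete, here is how I would write down the two candidates. This is only a sketch in my own notation: a, b, α, β, u and v are labels I introduce here, not anything taken from the R documentation. With u = log10(x) and v = log10(y):

```latex
\begin{align*}
\text{(a) line in original coordinates:}\quad
  & y = a + b\,x
  \;\Longleftrightarrow\;
  v = \log_{10}\!\bigl(a + b\,10^{u}\bigr), \\
\text{(b) line in log-log coordinates:}\quad
  & v = \alpha + \beta\,u
  \;\Longleftrightarrow\;
  y = 10^{\alpha}\,x^{\beta}.
\end{align*}
```

Candidate (a) is straight in (u, v) only when a = 0, and it is undefined wherever a + b x <= 0. Candidate (b) is a power law in the original coordinates.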
Note that I posted a similar question to R-help but did not receive an answer.
Update
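I experimented some more and reproduced, by hand, the strange curve that qqline() produces in log-log space. I also suspected that using log10() rather than log() might be the reason for the wrong line.
th <- 0.5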
n <- 1e2 # note the difference for n=1e2 vs n=1e4!
set.seed(271)
x <- rgamma(n, shape=th)
qF <- function(p) qgamma(p, shape=th)
## Q-Q plot
qq.x <- qF(ppoints(n))
qq.y <- sort(x)
plot(qq.x, qq.y, log="xy", main=substitute("Q-Q plot with sample size n="*n., list(n.=n)),
     xlab="Theoretical quantiles", ylab="Sample quantiles")
abline(v=quantile(qq.x, probs=0.25), col="gray50") # vertical line through 25%-quantile
abline(v=quantile(qq.x, probs=0.75), col="gray50") # vertical line through 75%-quantile
qqline(y=qq.y, distribution=qF, untf=TRUE) # ... this is wrong (*)
qqline(y=qq.y, distribution=qF, col="red") # ... still wrong
## => doesn't pass first and third quartile, but closer to 'right' line (blue) for large n
qqline(y=sort(log10(x)), distribution=function(p) log10(qF(p)), col="blue")
## => seems to be correct (but draws a line, not the curve resulting from transforming
## the Q-Q line in 'non-log-log' space)
## by hand (as will be seen, this reproduces (*))
probs <- c(0.25, 0.75)
q.x <- qF(probs)
q.y <- quantile(x, probs=probs, names=FALSE)
slope <- diff(q.y)/diff(q.x)
int <- q.y[1L] - slope * q.x[1L]
f <- function(x) slope * x + int # line in *normal* space
qq.y. <- f(qq.x) # points of the Q-Q line evaluated in normal space
lines(qq.x, qq.y., col="darkgreen")
## => reproduces qqline(.., untf=TRUE), see (*) above, that is,
## standard Q-Q line added to a plot in log-log scale = wrong (although points
## of the Q-Q plot are constructed this way!)
## legend
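## colors are ordered to match the labels: black = qqline(.., untf=TRUE),
## darkgreen = line computed by hand and plotted in log-log scale
legend("topleft", lty=rep(1,5), col=c("black", "darkgreen", "red", "blue", "gray50"),
       legend=c("Standard Q-Q line with 'untf=TRUE'",
                "Standard Q-Q line plotted in log-log scale",
                "Standard Q-Q line",
                "Q-Q line of log10(x) vs log10-quantiles", "1st and 3rd quartiles"))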
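So the blue line does indeed seem to be correct if one wants a line in log-log space. What is not clear to me is why the points of the standard Q-Q line, displayed in log-log scale, do not give the right curve, although the points of the Q-Q plot are constructed in the same way.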
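For comparison, here is a minimal sketch of the blue line computed by hand, next to the standard line. It assumes that qqline() fits through the 25% and 75% quantile pairs using the default quantile type. The helpers line_std() and line_loglog() are hypothetical names introduced only for this illustration.

```r
th <- 0.5
n <- 1e3
set.seed(271)
x <- rgamma(n, shape=th)
qF <- function(p) qgamma(p, shape=th)
probs <- c(0.25, 0.75)

## Standard Q-Q line: straight in the original coordinates
q.x <- qF(probs)
q.y <- quantile(x, probs=probs, names=FALSE)
b <- diff(q.y) / diff(q.x)
a <- q.y[1L] - b * q.x[1L]
line_std <- function(x) a + b * x # hypothetical helper

## Log-log Q-Q line: straight in log10 coordinates (same inputs as the blue line)
q.lx <- log10(qF(probs))
q.ly <- quantile(log10(x), probs=probs, names=FALSE)
beta <- diff(q.ly) / diff(q.lx)
alpha <- q.ly[1L] - beta * q.lx[1L]
line_loglog <- function(x) 10^(alpha + beta * log10(x)) # hypothetical helper

## Draw the by-hand log-log line on top of the Q-Q plot in log-log scale
qq.x <- qF(ppoints(n))
plot(qq.x, sort(x), log="xy", xlab="Theoretical quantiles", ylab="Sample quantiles")
lines(qq.x, line_loglog(qq.x), col="blue")
```

Each line passes through its own pair of quartile points (line_std() through (q.x, q.y), line_loglog() through (10^q.lx, 10^q.ly)), so the two differ in what is assumed to be straight, not in whether they hit the quartiles.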