Question

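This question shows how to make a qqplot with a qqline in ggplot2, but the answer only seems to work when the entire dataset is plotted in a single graph.

I want a way to quickly compare these plots for subsets of the data. That is, I want to create qqplots with qqlines on a faceted figure. So in the example below, there would be a line in each of the 9 panels, each with its own intercept and slope.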

library(ggplot2)

# all three columns have the same length, so nothing is silently recycled
df1 <- data.frame(x = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

ggplot(df1, aes(sample = x)) +
  stat_qq() +
  facet_grid(y ~ z)

[Image: facet data]

Answer 1

You could try this:

library(plyr)
library(ggplot2)

# create some data
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# calculate the normal theoretical quantiles per group
df2 <- ddply(.data = df1, .variables = .(y, z), function(dat) {
  # qqnorm() returns the points in input order, so q$x lines up with dat$vals
  q <- qqnorm(dat$vals, plot.it = FALSE)
  dat$xq <- q$x
  dat
})

# plot the sample values against the theoretical quantiles
ggplot(data = df2, aes(x = xq, y = vals)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  xlab("Theoretical") +
  ylab("Sample") +
  facet_grid(y ~ z)
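
One caveat: geom_smooth(method = "lm") draws a least-squares fit through the points, which is close to, but not the same as, the line base R's qqline() draws through the first and third quartiles. If the quartile-based line is what you are after, here is a minimal sketch. The helper name qqline_coefs() is made up for this example, and I am assuming a normal reference distribution and the default quantile type. As far as I know, ggplot2 3.0.0 and later also provide stat_qq_line(), which is used for the second plot.

```r
library(ggplot2)

set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# Hypothetical helper: slope and intercept of the line through the first
# and third quartiles (the construction stats::qqline() uses by default).
qqline_coefs <- function(v) {
  yq <- quantile(v, c(0.25, 0.75), names = FALSE)  # sample quartiles
  xq <- qnorm(c(0.25, 0.75))                       # theoretical quartiles
  slope <- diff(yq) / diff(xq)
  data.frame(slope = slope, intercept = yq[1] - slope * xq[1])
}

# One row of coefficients per y/z combination
groups <- split(df1, list(df1$y, df1$z), drop = TRUE)
lines_df <- do.call(rbind, lapply(groups, function(d) {
  cbind(d[1, c("y", "z")], qqline_coefs(d$vals))
}))

# Each facet picks up its own row of lines_df via the y and z columns
p1 <- ggplot(df1, aes(sample = vals)) +
  stat_qq() +
  geom_abline(data = lines_df, aes(slope = slope, intercept = intercept)) +
  facet_grid(y ~ z)

# Alternative, assuming ggplot2 >= 3.0.0: the built-in per-panel line
p2 <- ggplot(df1, aes(sample = vals)) +
  stat_qq() +
  stat_qq_line() +
  facet_grid(y ~ z)
```

Printing p1 or p2 should give nine panels, each with its own reference line. The geom_abline() route is handy if you also want the slopes and intercepts as numbers.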


Answer 2

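For no particularly good reason, here is a dplyr version of the same thing (dplyr did not exist when this question was asked). For the sake of peer review and comparison, I am including the code that generates the data sets so that you can inspect them further.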

# create some data
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

#* Henrik's plyr version
library(plyr)
df2 <- plyr::ddply(.data = df1, .variables = .(y, z), function(dat) {
  q <- qqnorm(dat$vals, plot.it = FALSE)
  dat$xq <- q$x
  dat
})

# detach plyr to avoid name clashes between plyr and dplyr
detach("package:plyr")


#* The dplyr version
library(dplyr)
qqnorm_data <- function(x) {
  Q <- as.data.frame(qqnorm(x, plot.it = FALSE))
  # name the sample column after the variable that was passed in (here "vals")
  names(Q) <- c("xq", deparse(substitute(x)))
  Q
}

df3 <- df1 %>%
  group_by(y, z) %>%
  do(with(., qqnorm_data(vals)))

The plotting can then be done with the same code as in Henrik's answer.
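
For completeness, a self-contained sketch of that last step: build the per-group quantiles with dplyr and feed them to the same ggplot call. I have simplified qqnorm_data() here so that it hard-codes the column name vals, which is an assumption of this example rather than part of the code above. Note also that do() is marked as superseded in current dplyr releases, although it should still run.

```r
library(dplyr)
library(ggplot2)

set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# Simplified helper (column name "vals" is hard-coded for this sketch):
# theoretical quantiles in xq, sample values in vals
qqnorm_data <- function(x) {
  q <- qqnorm(x, plot.it = FALSE)
  data.frame(xq = q$x, vals = q$y)
}

# One set of theoretical quantiles per y/z combination
df3 <- df1 %>%
  group_by(y, z) %>%
  do(qqnorm_data(.$vals)) %>%
  ungroup()

# Same plotting code as in the plyr answer, applied to df3
p <- ggplot(data = df3, aes(x = xq, y = vals)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  xlab("Theoretical") +
  ylab("Sample") +
  facet_grid(y ~ z)
```

Printing p should give the same faceted figure as the plyr version, since df3 holds the same values as df2, just with rows ordered by group.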

qqline in ggplot2 with facets
