Wrong boxplots with geom_boxplot + scale_y_log10

Asked: 2018-07-05 19:51:22

Tags: r ggplot2 scale boxplot logarithm

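I am trying to use geom_boxplot and scale_y_log10 to draw boxplots by group for some log-normal data. When I checked the plots against the actual values of the median and the other quartiles, I found that some boxplots are drawn incorrectly: the median and/or the other quartiles are not in the right place. After a lot of testing, I realized that when geom_boxplot and scale_y_log10 are used together, only the groups that contain exactly two values give a wrong plot.

Here is the code for a test example: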

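# Arithmetic mean on the original scale, returned in log10 units:
# stat_summary() receives y values that are already log10-transformed.
f1 <- function(x) {
  log10(mean(10 ^ x))
}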

library(ggplot2)

test1 <- data.frame("value" = c(3, 45, 2, 100),
               "field" = c("a", "a", "a", "a"),
               "group" = c("1", "1", "2", "2"))

test2 <- data.frame("value" = c(3, 10, 45, 2, 70, 100),
               "field" = c("a", "a", "a", "a", "a", "a"),
               "group" = c("1", "1", "1", "2", "2", "2"))

# Per-group medians and means on the original scale
tapply(test1$value, test1$group, median, na.rm = TRUE)
tapply(test1$value, test1$group, mean, na.rm = TRUE)

# Blue lines: medians; red lines: means (identical with two values per group)
ggplot(test1, aes(x = field, y = value)) +
  geom_boxplot() +
  facet_grid(~ group) +
  geom_hline(yintercept = c(24, 51), colour = "blue", linetype = 2) +
  geom_hline(yintercept = c(24, 51), colour = "red", linetype = 2) +
  scale_y_log10() +
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = f1, geom = "point", shape = 1, size = 3, color = "red",
               fill = "red") +
  theme(legend.position = "none") +
  scale_fill_brewer(palette = "Set3")

# Per-group medians and means on the original scale
tapply(test2$value, test2$group, median, na.rm = TRUE)
tapply(test2$value, test2$group, mean, na.rm = TRUE)

# Blue lines: medians; red lines: means
ggplot(test2, aes(x = field, y = value)) +
  geom_boxplot() +
  facet_grid(~ group) +
  geom_hline(yintercept = c(10, 70), colour = "blue", linetype = 2) +
  geom_hline(yintercept = c(19.3, 57.3), colour = "red", linetype = 2) +
  scale_y_log10() +
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = f1, geom = "point", shape = 1, size = 3, color = "red",
               fill = "red") +
  theme(legend.position = "none") +
  scale_fill_brewer(palette = "Set3")

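As you can see if you run the examples above, test1 (two values per group) gives boxplots with the wrong median, while test2 (three values per group) gives correct boxplots.

Any ideas why this happens?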
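
One likely explanation, offered as a sketch and not a confirmed diagnosis: ggplot2 applies scale transformations such as scale_y_log10 before it computes statistics, so the boxplot quartiles are computed on the log10 values. With an even number of points the median is the average of the two middle values, and averaging logs gives the geometric mean, not the arithmetic midpoint. With an odd number of points the median is an actual data value, which a monotone transformation leaves unchanged. The base-R snippet below reproduces this for group "1" of test1 and test2; the helper name `median_on_log_scale` is hypothetical and only used for illustration.

```r
# Values of group "1" in test1 and test2 (copied from the question)
two_vals   <- c(3, 45)
three_vals <- c(3, 10, 45)

# Hypothetical helper: take the median of the log10 values,
# then transform the result back to the original scale.
median_on_log_scale <- function(x) 10 ^ median(log10(x))

median(two_vals)                 # 24, median of the raw values
median_on_log_scale(two_vals)    # about 11.62, i.e. sqrt(3 * 45)

median(three_vals)               # 10
median_on_log_scale(three_vals)  # 10 up to floating point, the middle value
```

If the raw-scale quartiles are what the plot should show, one option that may help is replacing scale_y_log10() with coord_trans(y = "log10"), which transforms the axis after the statistics have been computed.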

0 answers:

No answers