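I am trying to place labels showing the number of observations at the ends of the box plot whiskers, but it does not seem to work when there are outliers.
I tried comparing the max/min value with what I believe is the computed whisker length [quartile 1 - 1.5 * interquartile range, or quartile 3 + 1.5 * interquartile range], but the labels end up at neither the max/min value nor the whisker end.
A demonstration using mtcars with the y-axis reversed: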
## Load each package separately; library(ggplot2, dplyr) does not attach dplyr
library(ggplot2)
library(dplyr)

mtcars %>%
  select(qsec, cyl, am) %>%
  ggplot(aes(factor(cyl), qsec, fill = factor(am))) +
  stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
  geom_boxplot(outlier.shape = 1, outlier.size = 3,
               position = position_dodge(width = 0.75)) +
  scale_y_reverse() +
  geom_text(data = mtcars %>%
              select(qsec, cyl, am) %>%
              group_by(cyl, am) %>%
              summarize(min_qsec = min(qsec), Count = n(), med = median(qsec),
                        q1 = quantile(qsec, 0.25),
                        q3 = quantile(qsec, 0.75), iqr = IQR(qsec),
                        qsec = mean(qsec),
                        ## Attempted label position: the larger of the minimum and the lower fence
                        lab_pos = max(min_qsec, q1 - 1.5 * iqr)),
            aes(y = lab_pos, label = Count), position = position_dodge(width = 0.75))
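This produces a plot in which the labels for am(0) of cyl(8) and am(1) of cyl(4) are misaligned.
Is my calculation of lab_pos incorrect, or is there a better way to position the labels at the whisker ends regardless of outliers? I would like to do this with ggplot2 and dplyr, if possible.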
Answer 0 (score: 1)
If I understand correctly, this is what you want:
label_data <- mtcars %>%
  select(qsec, cyl, am) %>%
  group_by(cyl, am) %>%
  summarize(min_qsec = min(qsec),
            Count = n(),
            med = median(qsec),
            q1 = quantile(qsec, 0.25),
            q3 = quantile(qsec, 0.75),
            iqr = IQR(qsec),
            ## Smallest observation at or above the lower fence, i.e. the whisker end
            lab_pos = min(ifelse(qsec >= q1 - 1.5 * iqr, qsec, NA), na.rm = TRUE),
            qsec = mean(qsec))

mtcars %>%
  select(qsec, cyl, am) %>%
  ggplot(aes(factor(cyl), qsec, fill = factor(am))) +
  stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
  geom_boxplot(outlier.shape = 1, outlier.size = 3,
               position = position_dodge(width = 0.75)) +
  scale_y_reverse() +
  geom_text(data = label_data, aes(y = lab_pos, label = Count),
            position = position_dodge(width = 0.75), vjust = 0, fontface = "bold")
The whiskers extend to the most extreme data point that lies within the fence, not to the fence itself.
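To make the difference concrete, here is a minimal sketch on a made-up vector (the data and the helper name `lower_whisker` are illustrative assumptions, not part of ggplot2 or of the question). It assumes the usual 1.5 * IQR rule with R's default quantile type, which is what the code above uses:

```r
# Hypothetical helper: the lower whisker end is the smallest observation
# that is still at or above the lower fence (Q1 - coef * IQR).
lower_whisker <- function(x, coef = 1.5) {
  q1 <- unname(quantile(x, 0.25))
  fence <- q1 - coef * IQR(x)
  min(x[x >= fence])
}

x <- c(1, 10, 11, 12, 13, 14)          # made-up data; 1 is an outlier

lower_fence <- unname(quantile(x, 0.25)) - 1.5 * IQR(x)
lower_fence                            # 6.5, not an observed value
max(min(x), lower_fence)               # 6.5, the position the question's formula gives
lower_whisker(x)                       # 10, where the whisker actually ends
```

Without outliers the two approaches agree, because the minimum then lies inside the fence and is itself the whisker end. They diverge only when at least one point falls beyond the fence, which matches the misplaced labels in the question.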