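I am using geom_jitter() with a boxplot in ggplot. I noticed that it adds a point for every record on top of the boxplot, rather than jittering only the points that represent outliers.
The following code demonstrates this.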
library(dplyr)    # provides %>%
library(ggplot2)
data <- as.data.frame(c(rnorm(10000, mean = 10, sd = 20), rnorm(300, mean = 90, sd = 5)))
names(data) <- "blapatybloo"
data %>% ggplot(aes("column", blapatybloo)) + geom_boxplot() + geom_jitter(alpha = .1)
How can I apply geom_jitter only to the outlier points of the boxplot, without overplotting the rest of the records?
Answer 0 (score: 2)
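Create a new column that flags whether each data point is an outlier. Then hide the boxplot's own outlier markers and overlay only the flagged points, with jitter, on the boxplot.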
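library(dplyr)
library(ggplot2)
data <- as.data.frame(c(rnorm(10000, mean = 10, sd = 20), rnorm(300, mean = 90, sd = 5)))
names(data) <- "blapatybloo"
# Flag points beyond the whiskers: more than 1.5 * IQR above Q3 or below Q1
data <- data %>% mutate(outlier = blapatybloo > quantile(blapatybloo, 0.75) + IQR(blapatybloo)*1.5 |
                                  blapatybloo < quantile(blapatybloo, 0.25) - IQR(blapatybloo)*1.5)
# outlier.shape = NA hides the default outlier markers; only flagged points are drawn, jittered
data %>% ggplot(aes("column", blapatybloo)) + geom_boxplot(outlier.shape = NA) +
  geom_point(data = function(x) dplyr::filter(x, outlier), position = "jitter")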
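One thing to watch: position = "jitter" moves points vertically as well as horizontally, so the plotted outliers no longer sit at their true values. A possible variation, sketched below, jitters along the x axis only by using position_jitter() with height = 0. The helper name is_outlier, the seed, the jitter width of 0.2 and the alpha value are my own choices for illustration, not part of the original answer.

```r
library(ggplot2)

# Hypothetical helper: TRUE for values more than coef * IQR beyond the quartiles
is_outlier <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - coef * iqr | x > q[2] + coef * iqr
}

set.seed(1)  # arbitrary seed so the simulated data is reproducible
data <- data.frame(blapatybloo = c(rnorm(10000, mean = 10, sd = 20),
                                   rnorm(300, mean = 90, sd = 5)))

p <- ggplot(data, aes("column", blapatybloo)) +
  geom_boxplot(outlier.shape = NA) +
  # height = 0 keeps the y values exact; width = 0.2 is an arbitrary choice
  geom_point(data = function(d) d[is_outlier(d$blapatybloo), , drop = FALSE],
             position = position_jitter(width = 0.2, height = 0),
             alpha = 0.3)
print(p)
```

Because the outlier flag is computed inside the function passed to data, no extra column has to be added to the data frame beforehand.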