How to change the color of outlier points in geom_jitter

Time: 2019-12-29 16:09:07

Tags: r ggplot2

How can I change the appearance (color, shape, etc.) of the points in geom_jitter that correspond to outliers?

I have this data:

> dput(head(df, 20))
structure(list(variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L), .Label = c("W1", 
"W3", "W4", "W5", "W12", "W13", "W14"), class = "factor"), value = c(68, 
62, 174, 63, 72, 190, 73, 68, 62, 88, 81, 80, 79, 51, 73, 61, 
NA, NA, 84, 87)), row.names = c(NA, 20L), class = "data.frame")

and this code:

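plot <-
    # refer to columns by name inside aes(); variable is already a factor
    ggplot(df, aes(variable, value)) +
    geom_jitter(position = position_jitter(width = .1, height = 0), size = 0.7) +
    # theme_classic() is a complete theme, so legend tweaks must come after it
    theme_classic() +
    theme(legend.position = 'none') +
    labs(x = '',
         y = '',
         title = "test")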

which gives me this plot:

[Image: Plot]

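A box plot of the same data has already been created with the default coef = 1.5, so I know this dataset contains outliers. Now I want to draw only the jittered points and color the outliers red. With geom_boxplot this takes a single argument, outlier.color, but geom_jitter has no such argument.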

1 answer:

Answer 0 (score: 0)

You can first flag the outliers with dplyr, using the same 1.5 * IQR rule within each group:

library(dplyr)
new_df <- df %>% group_by(variable) %>% filter(!is.na(value)) %>% 
  mutate(Outlier = ifelse(value > quantile(value, 0.75)+1.50*IQR(value),"Outlier","OK")) %>%
  mutate(Outlier = ifelse(value < quantile(value, 0.25)-1.50*IQR(value),"Outlier",Outlier))

head(new_df)
# A tibble: 6 x 3
# Groups:   variable [1]
  variable value Outlier
  <fct>    <dbl> <chr>  
1 W1          68 OK     
2 W1          62 OK     
3 W1         174 Outlier
4 W1          63 OK     
5 W1          72 OK     
6 W1         190 Outlier
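
If you would rather have a reusable check, the same rule can be wrapped in a small function. This is only a sketch: `is_outlier` is a hypothetical helper name (it does not come from dplyr or ggplot2), and `w1` simply copies the nine W1 values from the sample data.

```r
# Hypothetical helper: TRUE where a value lies outside the
# [Q1 - coef * IQR, Q3 + coef * IQR] fences of its own vector.
is_outlier <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- coef * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# The W1 values from the question's sample data
w1 <- c(68, 62, 174, 63, 72, 190, 73, 68, 62)

# Q1 = 63 and Q3 = 73, so the fences are 48 and 88
w1[is_outlier(w1)]
# [1] 174 190
```

Inside a grouped `mutate()` this could be used as `Outlier = ifelse(is_outlier(value), "Outlier", "OK")`.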

Then, using this new column, you can subset the dataset on the Outlier condition and draw each subset in its own layer:

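library(ggplot2)
# height = 0 keeps the points at their true y values (jitter is horizontal only)
ggplot(subset(new_df, Outlier == "OK"), aes(x = variable, y = value))+
  geom_jitter(width = 0.1, height = 0, size = 0.7)+
  # second layer: outliers only, drawn larger and in red
  geom_jitter(inherit.aes = FALSE, data = subset(new_df, Outlier == "Outlier"),
              aes(x = variable, y  = value), width = 0.1, height = 0, size = 3, color = "red")+
  # theme_classic() is a complete theme, so legend tweaks must come after it
  theme_classic() +
  theme(legend.position = 'none') +
  labs(x = '',
       y = '',
       title = "test")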

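
As an alternative to two layers, the Outlier column can be mapped to color and size in a single geom_jitter call. The following is a minimal sketch, not a tested drop-in replacement: the data frame below only copies the non-missing W1 and W3 values from the question, and the colors and sizes are assumptions chosen to match the plot above.

```r
library(dplyr)
library(ggplot2)

# Non-missing W1 and W3 values copied from the question's sample data
df <- data.frame(
  variable = factor(rep(c("W1", "W3"), times = c(9, 7))),
  value = c(68, 62, 174, 63, 72, 190, 73, 68, 62,
            88, 81, 80, 79, 51, 73, 61)
)

# Flag values beyond the 1.5 * IQR fences within each group
new_df <- df %>%
  group_by(variable) %>%
  mutate(
    lower = quantile(value, 0.25) - 1.5 * IQR(value),
    upper = quantile(value, 0.75) + 1.5 * IQR(value),
    Outlier = ifelse(value < lower | value > upper, "Outlier", "OK")
  ) %>%
  ungroup()

# One layer: color and size both follow the Outlier column
p <- ggplot(new_df, aes(x = variable, y = value,
                        colour = Outlier, size = Outlier)) +
  geom_jitter(width = 0.1, height = 0) +
  scale_colour_manual(values = c(OK = "black", Outlier = "red")) +
  scale_size_manual(values = c(OK = 0.7, Outlier = 3)) +
  theme_classic() +
  theme(legend.position = "none") +
  labs(x = NULL, y = NULL, title = "test")

p
```

Here `theme(legend.position = "none")` comes after `theme_classic()` because a complete theme replaces any theme settings added before it, and mapping color and size would otherwise produce a legend.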