I made up this example to explain my problem:
df <- structure(list(
  group = structure(c(1L, 1L, 2L, 2L, 10L, 10L),
    .Label = c("Eve", "ba", "De", "De", "Mi", "C", "O", "W",
               "as", "ras", "Cro", "ics"),
    class = "factor"),
  ds = c(8, 8, 1, 4, 4, 6),
  em = c(1, 3, 8, 2, 7, 3)),
  row.names = c(74567L, 74568L, 74570L, 74576L, 74577L, 74578L),
  class = "data.frame")
Within each group, I need to set to NA every value of em and ds that meets one of these conditions:
> quantile 90 = NA
< quantile 10 = NA
Answer 0 (score: 0)
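Here is a way to do this for each group and each numeric variable, using dplyr and ifelse.
With only a few samples per group, quantiles are hard to interpret, and the result depends heavily on which definition of the quantile is used. The type argument of quantile() selects the definition; R defaults to type = 7: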
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(ds = ifelse(ds > quantile(ds, .9) | ds < quantile(ds, .1), NA, ds),
         em = ifelse(em > quantile(em, .9) | em < quantile(em, .1), NA, em))
#> # A tibble: 6 x 3
#> # Groups:   group [3]
#>   group    ds em
#>   <fct> <dbl> <lgl>
#> 1 Eve       8 NA
#> 2 Eve       8 NA
#> 3 ba       NA NA
#> 4 ba       NA NA
#> 5 ras      NA NA
#> 6 ras      NA NA
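To see why every em value is dropped here, it helps to look at the cut-offs themselves. The following is a small sketch, not part of the original answer; the vector x is only for illustration and holds the two em values of group Eve. With type = 7 the cut-offs are interpolated and fall strictly between the two observations, so both observations lie outside them. With type = 1 the cut-offs are always observed values, so nothing lies outside.

```r
x <- c(1, 3)  # em values of group "Eve"

# Default definition (type = 7): linear interpolation between order statistics.
q7 <- quantile(x, c(0.1, 0.9))            # 10% -> 1.2, 90% -> 2.8

# type = 1: inverse of the empirical CDF, always returns an observed value.
q1 <- quantile(x, c(0.1, 0.9), type = 1)  # 10% -> 1, 90% -> 3

x < q7[1] | x > q7[2]  # both TRUE: 1 < 1.2 and 3 > 2.8
x < q1[1] | x > q1[2]  # both FALSE: 1 and 3 are the cut-offs themselves
```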
However, the result changes with the definition you choose, for example type = 1:
df %>%
  group_by(group) %>%
  mutate(ds = ifelse(ds > quantile(ds, .9, type = 1) |
                       ds < quantile(ds, .1, type = 1), NA, ds),
         em = ifelse(em > quantile(em, .9, type = 1) |
                       em < quantile(em, .1, type = 1), NA, em))
#> # A tibble: 6 x 3
#> # Groups:   group [3]
#>   group    ds    em
#>   <fct> <dbl> <dbl>
#> 1 Eve       8     1
#> 2 Eve       8     3
#> 3 ba        1     8
#> 4 ba        4     2
#> 5 ras       4     7
#> 6 ras       6     3
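Since the same rule is applied to both columns, it can be factored out into a function. The following is a minimal sketch, not part of the original answer: na_outside is a hypothetical helper name, toy is a simplified copy of the question's data (character groups instead of a factor), and across() assumes dplyr 1.0.0 or later. Using replace() instead of ifelse() should also keep a column numeric when every value in it becomes NA.

```r
library(dplyr)

# Simplified copy of the question's data (assumption: plain character groups).
toy <- data.frame(
  group = c("Eve", "Eve", "ba", "ba", "ras", "ras"),
  ds = c(8, 8, 1, 4, 4, 6),
  em = c(1, 3, 8, 2, 7, 3)
)

# Hypothetical helper: set values outside the [lower, upper] quantiles to NA.
na_outside <- function(x, lower = 0.1, upper = 0.9, type = 7) {
  q <- unname(quantile(x, c(lower, upper), type = type, na.rm = TRUE))
  replace(x, x < q[1] | x > q[2], NA)
}

# Default definition (type = 7), applied to both columns within each group.
res <- toy %>%
  group_by(group) %>%
  mutate(across(c(ds, em), na_outside)) %>%
  ungroup()

# Same rule with type = 1, passed through a lambda.
res1 <- toy %>%
  group_by(group) %>%
  mutate(across(c(ds, em), ~ na_outside(.x, type = 1))) %>%
  ungroup()
```

With this toy data, res should match the first result above (except that em stays a double column) and res1 should leave all values unchanged.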
Created on 2020-05-17 by the reprex package (v0.3.0)