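Tidy replacement of missing values in combination with predicate functions

Time: 2019-01-21 10:03:35

Tags: r dplyr tidyr missing-data purrr

What is the recommended tidy way of replacing NA with tidyr::replace_na() in combination with predicate functions?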

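I was hoping to leverage tidyr::replace_na() (or a similar predefined missing-value handler) in some way, but I can't seem to get it to work with the dplyr or purrr ways of using predicate functions.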

library(magrittr)

# Example data:
df <- tibble::tibble(
  id = c(rep("A", 3), rep("B", 3)),
  x = c(1, 2, NA, 10, NA, 30),
  y = c("a", NA, "c", NA, NA, "f")
)

# Works, but needs manual spec of columns that should be handled:
df %>% 
  tidyr::replace_na(list(x = 0))  
#> # A tibble: 6 x 3
#>   id        x y    
#>   <chr> <dbl> <chr>
#> 1 A         1 a    
#> 2 A         2 <NA> 
#> 3 A         0 c    
#> 4 B        10 <NA> 
#> 5 B         0 <NA> 
#> 6 B        30 f

# Doesn't work (at least not in the intended way):
df %>% 
  dplyr::mutate_if(
    function(.x) inherits(.x, c("integer", "numeric")),
    ~tidyr::replace_na(0)  
  )
#> # A tibble: 6 x 3
#>   id        x y    
#>   <chr> <dbl> <chr>
#> 1 A         0 a    
#> 2 A         0 <NA> 
#> 3 A         0 c    
#> 4 B         0 <NA> 
#> 5 B         0 <NA> 
#> 6 B         0 f

# Works, but uses an inline def of the replacement function:
df %>% 
  dplyr::mutate_if(
    function(.x) inherits(.x, c("integer", "numeric")),
    function(.x) dplyr::if_else(is.na(.x), 0, .x)
  )
#> # A tibble: 6 x 3
#>   id        x y    
#>   <chr> <dbl> <chr>
#> 1 A         1 a    
#> 2 A         2 <NA> 
#> 3 A         0 c    
#> 4 B        10 <NA> 
#> 5 B         0 <NA> 
#> 6 B        30 f

# Works, but uses an inline def of the replacement function:
df %>% 
  purrr::modify_if(
    function(.x) inherits(.x, c("integer", "numeric")),
    function(.x) dplyr::if_else(is.na(.x), 0, .x)
  )
#> # A tibble: 6 x 3
#>   id        x y    
#>   <chr> <dbl> <chr>
#> 1 A         1 a    
#> 2 A         2 <NA> 
#> 3 A         0 c    
#> 4 B        10 <NA> 
#> 5 B         0 <NA> 
#> 6 B        30 f

Created on 2019-01-21 by the reprex package (v0.2.1)

1 Answer:

Answer 0 (score: 1)

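If we use the ~ formula shorthand, we also need to pass the column explicitly as ., because ~tidyr::replace_na(0) never refers to the column and so returns 0 for every numeric column, i.e.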

df %>%
   dplyr::mutate_if(function(.x) inherits(.x, c("integer", "numeric")), 
           ~ tidyr::replace_na(., 0))
# A tibble: 6 x 3
#  id        x y    
#  <chr> <dbl> <chr>
#1 A         1 a    
#2 A         2 <NA> 
#3 A         0 c    
#4 B        10 <NA> 
#5 B         0 <NA> 
#6 B        30 f    
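
To see why the formula in the question fills the whole column with 0, it may help to evaluate both lambda bodies by hand. This is only a sketch: the vector x is copied from the example data, and the variable names without_dot and with_dot are made up for illustration.

```r
library(tidyr)

# Same values as column x in the example data
x <- c(1, 2, NA, 10, NA, 30)

# Body of the question's lambda: the column is never used,
# so 0 is treated as the data and comes back unchanged.
without_dot <- replace_na(0)

# Body of the corrected lambda: the column is the data,
# and 0 is the replacement for its missing values.
with_dot <- replace_na(x, 0)

print(without_dot)
print(with_dot)
```

mutate_if() then recycles the single value returned by the first form across all rows, which would explain the all-zero column shown in the question.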

Otherwise, pass the function by name and supply the replacement as an extra argument:

df %>% 
  dplyr::mutate_if(function(.x) inherits(.x, c("integer", "numeric")), 
      tidyr::replace_na, replace = 0)
# A tibble: 6 x 3
#  id        x y    
#  <chr> <dbl> <chr>
#1 A         1 a    
#2 A         2 <NA> 
#3 A         0 c    
#4 B        10 <NA> 
#5 B         0 <NA> 
#6 B        30 f    

Another variation is

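# Note: funs() was deprecated in dplyr 0.8.0, so this variant may warn or fail
# on newer versions.
df %>% 
   dplyr::mutate_if(dplyr::funs(inherits(., c("integer", "numeric"))), 
              ~ tidyr::replace_na(., 0))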
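
Since the question also asked about purrr, the same idea appears to carry over to purrr::modify_if(). The sketch below makes two assumptions that are not part of the answer above: it swaps the inherits() predicate for base R's is.numeric(), which is TRUE for both integer and double columns, and its second variant relies on across() and where(), which need dplyr 1.0.0 or later. The names res_purrr and res_across are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Example data from the question
df <- tibble::tibble(
  id = c(rep("A", 3), rep("B", 3)),
  x = c(1, 2, NA, 10, NA, 30),
  y = c("a", NA, "c", NA, NA, "f")
)

# purrr: modify_if() only touches columns for which the predicate is TRUE
# and returns an object of the same type as its input.
res_purrr <- modify_if(df, is.numeric, ~ replace_na(.x, 0))

# Assumes dplyr >= 1.0.0: across() with where() selects columns by predicate.
res_across <- df %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))

print(res_purrr)
print(res_across)
```

In both variants the character column y keeps its NA values, because the predicate is FALSE for it.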