Adding NA as a factor level across an entire data frame

Asked: 2018-08-21 18:11:22

Tags: r dataframe

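I have a dataset in which the columns of one of my tables are all factors. The only data in them is 'yes' or NA. Each column has only one factor level, yes. I would like NA to be a factor level as well. Unfortunately, my understanding of the addNA() function is poor. Could someone help me add NA as a factor level across the whole dataset in a more concise way, rather than having to type it out for each column separately? Thanks.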

xl<- structure(list(G = structure(c(1L, NA, NA, NA, NA, 
NA, NA, NA, NA, NA), .Label = "yes", class = "factor"), A = structure(c(1L, 1L, NA, NA, NA, 1L, NA, NA, 1L, 1L), .Label = "yes", class = "factor"), 
L = structure(c(2L, 2L, NA, NA, 2L, 2L, 2L, 
NA, 2L, 2L), .Label = c("no", "yes"), class = "factor"), 
P = structure(c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), .Label = "yes", class = "factor"), 
C = structure(c(NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_), .Label = "yes", class = "factor"), 
S = structure(c(NA, NA, NA, NA, NA, NA, 1L, NA, NA, 
NA), .Label = "yes", class = "factor"), M = structure(c(NA, 
NA, 1L, NA, NA, NA, 1L, NA, NA, NA), .Label = "yes", class = "factor"), 
F = structure(c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), .Label = "yes", class = "factor")), .Names =    c("G", "A", "L", "P", "C", "S", "M", "F"), row.names = c("row_1", "row_2", "row_3", "row_4", "row_5", "row_6", "row_7", "row_8", "row_10", "row_11"), class = "data.frame")
xl <- addNA(xl)

2 answers:

Answer 0 (score: 1)

purrr to the rescue:

library(tidyverse)

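# Declares "NA" as a second level on every column.
# Caveat: factor() does not recode true missing values, so they stay NA.
xl_new <- xl %>%
  map_df(factor, levels = c("yes", "NA"))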

Or, alternatively, using forcats:

xl_new <- xl %>% 
  map_df(fct_explicit_na, "NA")
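
For comparison, here is a minimal base R sketch (not part of the original answer). addNA() works on a single factor, so one option is to apply it column by column with lapply(). The small data frame `df` is a made-up stand-in for `xl`. Note that addNA() adds a true NA level, whereas fct_explicit_na(x, "NA") creates a level labelled with the string "NA".

```r
# Hypothetical toy data standing in for `xl`
df <- data.frame(
  G = factor(c("yes", NA, NA)),
  A = factor(c("yes", "yes", NA)),
  row.names = c("row_1", "row_2", "row_3")
)

# addNA() handles one factor at a time, so apply it to every column;
# assigning into df[] keeps the data.frame class and the row names
df[] <- lapply(df, addNA)

levels(df$G)   # "yes" NA
table(df$A)    # missing values are now counted under the NA level
```

Assigning into `df[]` rather than `df` is a deliberate choice here: it replaces the columns in place instead of turning the result into a plain list.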

Answer 1 (score: 1)

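I really like @FMM's use of forcats::fct_explicit_na. You can use it with dplyr::mutate_all, since these columns are all factors. If you had columns of different types but only wanted to do this for the factor columns, you could use dplyr::mutate_if with is.factor as the predicate.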

library(tidyverse)

xl %>%
  mutate_all(fct_explicit_na, "NA")
#>      G   A   L  P  C   S   M  F
#> 1  yes yes yes NA NA  NA  NA NA
#> 2   NA yes yes NA NA  NA  NA NA
#> 3   NA  NA  NA NA NA  NA yes NA
#> 4   NA  NA  NA NA NA  NA  NA NA
#> 5   NA  NA yes NA NA  NA  NA NA
#> 6   NA yes yes NA NA  NA  NA NA
#> 7   NA  NA yes NA NA yes yes NA
#> 8   NA  NA  NA NA NA  NA  NA NA
#> 9   NA yes yes NA NA  NA  NA NA
#> 10  NA yes yes NA NA  NA  NA NA

xl %>%
  mutate_all(fct_explicit_na, "NA") %>%
  str()
#> 'data.frame':    10 obs. of  8 variables:
#>  $ G: Factor w/ 2 levels "yes","NA": 1 2 2 2 2 2 2 2 2 2
#>  $ A: Factor w/ 2 levels "yes","NA": 1 1 2 2 2 1 2 2 1 1
#>  $ L: Factor w/ 3 levels "no","yes","NA": 2 2 3 3 2 2 2 3 2 2
#>  $ P: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
#>  $ C: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
#>  $ S: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 1 2 2 2
#>  $ M: Factor w/ 2 levels "yes","NA": 2 2 1 2 2 2 1 2 2 2
#>  $ F: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
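
The mutate_if variant mentioned above might look roughly like the sketch below. The data frame `mixed` and its columns `flag` and `score` are invented for illustration. Newer forcats releases deprecate fct_explicit_na() in favour of fct_na_value_to_level(), so a deprecation warning is possible depending on the installed version.

```r
library(dplyr)
library(forcats)

# Hypothetical mixed-type data: one factor column, one numeric column
mixed <- data.frame(
  flag  = factor(c("yes", NA, "yes")),
  score = c(1.5, NA, 3)
)

# Only columns for which is.factor() is TRUE are transformed;
# `score` is left alone and keeps its numeric NA
out <- mixed %>%
  mutate_if(is.factor, fct_explicit_na, na_level = "NA")

levels(out$flag)   # "yes" "NA"
str(out)
```

In current dplyr the same idea is usually written as mutate(across(where(is.factor), ...)); mutate_if is used here only because it is the function the answer names.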