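I have a dataset in which every column of one of my tables is a factor. The only data in them is 'yes' or NA, so each column has a single factor level, yes. I would like NA to be a factor level as well. Unfortunately, my understanding of the addNA() function is poor. Could someone help me add NA as a factor level across the whole dataset in a more concise way than typing it out for each column separately? Thanks.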
xl<- structure(list(G = structure(c(1L, NA, NA, NA, NA,
NA, NA, NA, NA, NA), .Label = "yes", class = "factor"), A = structure(c(1L, 1L, NA, NA, NA, 1L, NA, NA, 1L, 1L), .Label = "yes", class = "factor"),
L = structure(c(2L, 2L, NA, NA, 2L, 2L, 2L,
NA, 2L, 2L), .Label = c("no", "yes"), class = "factor"),
P = structure(c(NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_), .Label = "yes", class = "factor"),
C = structure(c(NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_), .Label = "yes", class = "factor"),
S = structure(c(NA, NA, NA, NA, NA, NA, 1L, NA, NA,
NA), .Label = "yes", class = "factor"), M = structure(c(NA,
NA, 1L, NA, NA, NA, 1L, NA, NA, NA), .Label = "yes", class = "factor"),
F = structure(c(NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_), .Label = "yes", class = "factor")), .Names = c("G", "A", "L", "P", "C", "S", "M", "F"), row.names = c("row_1", "row_2", "row_3", "row_4", "row_5", "row_6", "row_7", "row_8", "row_10", "row_11"), class = "data.frame")
xl <- addNA(xl)
Answer 0 (score: 1)
purrr to the rescue:
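library(tidyverse)
# Caveat: this declares "NA" as a level, but existing missing values stay NA,
# and values with an unlisted level (such as "no" in column L) would become NA.
xl_new <- xl %>%
  map_df(factor, levels = c("yes", "NA"))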
Or, alternatively, using forcats:
xl_new <- xl %>%
  map_df(fct_explicit_na, "NA")
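Since the question asks about addNA() specifically, here is a minimal base-R sketch for comparison. addNA() works on a single factor, so it has to be applied column by column; lapply() does that in one line. The small data frame `xs` is a hypothetical stand-in for `xl`, and note that addNA() adds a true NA level rather than the string "NA".

```r
# Hypothetical stand-in for xl: two factor columns containing only "yes" or NA
xs <- data.frame(
  a = factor(c("yes", NA, "yes")),
  b = factor(c(NA, NA, "yes"))
)

# Apply addNA() to every column; assigning into xs[] keeps the data.frame class
xs[] <- lapply(xs, addNA)

# Each column now carries NA as an extra level
levels(xs$a)
```

If the result should use the string "NA" as the level, as in the forcats approach, this sketch is not a drop-in replacement.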
Answer 1 (score: 1)
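I really like @FMM's use of forcats::fct_explicit_na. You can apply it with dplyr::mutate_all, since these columns are all factors. If you have columns of different types but only want to do this for the factor columns, you can use dplyr::mutate_if with is.factor as the predicate.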
library(tidyverse)
xl %>%
  mutate_all(fct_explicit_na, "NA")
#>      G   A   L  P  C   S   M  F
#> 1  yes yes yes NA NA  NA  NA NA
#> 2   NA yes yes NA NA  NA  NA NA
#> 3   NA  NA  NA NA NA  NA yes NA
#> 4   NA  NA  NA NA NA  NA  NA NA
#> 5   NA  NA yes NA NA  NA  NA NA
#> 6   NA yes yes NA NA  NA  NA NA
#> 7   NA  NA yes NA NA yes yes NA
#> 8   NA  NA  NA NA NA  NA  NA NA
#> 9   NA yes yes NA NA  NA  NA NA
#> 10  NA yes yes NA NA  NA  NA NA
xl %>%
  mutate_all(fct_explicit_na, "NA") %>%
  str()
#> 'data.frame': 10 obs. of 8 variables:
#> $ G: Factor w/ 2 levels "yes","NA": 1 2 2 2 2 2 2 2 2 2
#> $ A: Factor w/ 2 levels "yes","NA": 1 1 2 2 2 1 2 2 1 1
#> $ L: Factor w/ 3 levels "no","yes","NA": 2 2 3 3 2 2 2 3 2 2
#> $ P: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
#> $ C: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
#> $ S: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 1 2 2 2
#> $ M: Factor w/ 2 levels "yes","NA": 2 2 1 2 2 2 1 2 2 2
#> $ F: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
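The mutate_if variant mentioned above is described but not shown, so here is a minimal sketch of it. The mixed-type data frame `df` and its column names are hypothetical and not taken from the question. Depending on the installed forcats version, fct_explicit_na may emit a deprecation warning (newer releases point to fct_na_value_to_level instead).

```r
library(dplyr)
library(forcats)

# Hypothetical data frame with one non-factor and one factor column
df <- data.frame(
  id = 1:3,
  flag = factor(c("yes", NA, "yes"))
)

# Only columns for which is.factor() is TRUE are passed to fct_explicit_na;
# na_level is forwarded to it and becomes the label for missing values
df_new <- df %>%
  mutate_if(is.factor, fct_explicit_na, na_level = "NA")

str(df_new)
```

The id column passes through unchanged because it fails the is.factor predicate.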