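I have a dataset whose 'workclass' column contains both 'privat' and 'Private' among its values.
It looks to me like the value 'privat' is really the same as 'Private', so I want to change it accordingly.
If I run the following command, I get a warning, because the factor level is not defined.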
> adult$workclass[adult$workclass == 'privat'] <- 'Private'
Warning message:
In `[<-.factor`(`*tmp*`, adult$workclass == "privat", value = c(7L, :
invalid factor level, NA generated
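If I "deconstruct" the column into a character vector and "reconstruct" it as a factor after the operation, I end up with two different factor levels for 'Private'.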
> adult$workclass <- as.character(adult$workclass)
> adult$workclass[adult$workclass=='privat'] <- 'Private'
> adult$workclass <- as.factor(adult$workclass)
> summary(adult$workclass)
     Federal-gov        Local-gov     Never-worked          Private
             960             2093                7            22686
    Self-emp-inc Self-emp-not-inc        State-gov      Without-pay
            1116             2541             1298               14
         Private             NA's
              10             1836
How can I merge 'privat' and 'Private'?
Answer 0 (score: 0)
What is the output of levels(adult$workclass)? It looks like your 'Private' level is not exactly identical to the string 'Private'.
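One way to check for such a mismatch is dput(), which prints every level in quotes so that stray whitespace becomes visible. The following is only a sketch: the vector workclass is a made-up stand-in for adult$workclass, and the leading spaces are an assumption about what might be wrong, not something the question confirms.

```r
# Hypothetical stand-in for adult$workclass; the leading spaces are an assumption.
workclass <- factor(c(" Private", "privat", " State-gov", " Private"))

# dput() prints each level in quotes, so leading or trailing spaces become visible.
dput(levels(workclass))

# Compare the stored levels against the string used in the assignment.
exact_match   <- "Private" %in% levels(workclass)          # FALSE: the stored level is " Private"
trimmed_match <- "Private" %in% trimws(levels(workclass))  # TRUE once whitespace is stripped
print(c(exact = exact_match, trimmed = trimmed_match))
```

If the exact comparison is FALSE while the trimmed one is TRUE, the assignment of 'Private' fails because that exact string is not one of the factor's levels.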
When I run the following code, I get the desired result:
f <- data.frame(f = factor(c(
  rep("Federal-gov", 960),
  rep("Local-gov", 2093),
  rep("Never-worked", 7),
  rep("Private", 22686),
  rep("Self-emp-inc", 1116),
  rep("Self-emp-not-inc", 2541),
  rep("State-gov", 1298),
  rep("Without-pay", 14),
  rep("privat", 10),
  rep("NA's", 1836)
)))
f$f[f$f=="privat"] <- "Private"
f <- droplevels(f)
table(f)
     Federal-gov        Local-gov             NA's     Never-worked
             960             2093             1836                7
         Private     Self-emp-inc Self-emp-not-inc        State-gov
           22696             1116             2541             1298
     Without-pay
              14
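An alternative that avoids both the element-wise assignment and droplevels() is to rename the level itself: giving two levels the same label merges them. This is a minimal sketch rather than the asker's actual data; workclass is a hypothetical stand-in for adult$workclass, and the leading spaces (handled here with trimws()) are an assumption.

```r
# Hypothetical stand-in for adult$workclass; the leading spaces are an assumption.
workclass <- factor(c(" Private", "privat", " State-gov", " Private"))

# Strip whitespace from the level labels; the underlying integer codes are untouched.
levels(workclass) <- trimws(levels(workclass))

# Renaming "privat" to an already existing label merges the two levels.
levels(workclass)[levels(workclass) == "privat"] <- "Private"

counts <- table(workclass)
print(counts)
```

Because only the labels change, no NA values are introduced and no unused level is left behind.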
Answer 1 (score: 0)
You can try:
library(dplyr)
# Assign the result back; otherwise adult itself is left unchanged.
adult <- adult %>%
  mutate(workclass = recode_factor(workclass, privat = "Private"))