Replace missing values with zero based on a second column in R

Asked: 2018-04-17 00:20:20

Tags: r missing-data

I have a dataset with multiple columns. I want to replace the missing values in HOME_VAL with 0 when JOB = Student.

Logic: if JOB == 'Student' and HOME_VAL == '', then HOME_VAL = 0

if(DT$JOB == 'Student' & DT$HOME_VAL=='') {
    DT$HOME_VAL<-0
}

Data

HOME_VAL  JOB
$9999     Student
 $100     Home Maker
          Student
 $2000    Home Maker
          Student
 $60000   Student
 $40000   Professor

Desired output

HOME_VAL  JOB
$9999     Student
 $100     Home Maker
 0        Student
 $2000    Home Maker
 0        Student
 $60000   Student
 $40000   Professor

1 answer:

Answer 0 (score: 2)

We can use dplyr::mutate:

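library(dplyr)
library(magrittr)
df %>%
    # convert factor columns to character so the comparison and replacement work on text
    mutate_if(is.factor, as.character) %>%
    # set HOME_VAL to 0 where it is empty and JOB is "Student"
    mutate(HOME_VAL = ifelse(HOME_VAL == "" & JOB == "Student", 0, HOME_VAL))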
#  HOME_VAL        JOB
#1    $9999    Student
#2     $100 Home Maker
#3        0    Student
#4    $2000 Home Maker
#5        0    Student
#6   $60000    Student
#7   $40000  Professor

Explanation: mutate_if converts factor columns to character columns, and mutate + ifelse performs the replacement according to your logic.
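To illustrate why the factor-to-character conversion matters, here is a minimal base R sketch. The vector `home_val` is a hypothetical toy example, not the asker's data. It assumes the column was read in as a factor, which is the default for read.table in R versions before 4.0.

```r
# Hypothetical toy vector standing in for a factor HOME_VAL column.
home_val <- factor(c("$9999", "", "$100"))

# Applied directly to a factor, ifelse() returns the underlying
# integer level codes for the unchanged entries, not the labels.
codes <- ifelse(home_val == "", 0, home_val)

# Converting to character first keeps the original labels.
labels <- ifelse(as.character(home_val) == "", "0", as.character(home_val))

print(codes)
print(labels)
```

Comparing the two printed vectors shows the difference: the first holds numbers, the second holds the original strings with the empty entry replaced.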

Or in base R:

df$HOME_VAL <- as.character(df$HOME_VAL)
df$HOME_VAL <- ifelse(df$HOME_VAL == "" & df$JOB == "Student", 0, df$HOME_VAL)
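As an alternative sketch (not part of the original answer), the same replacement can be written with logical indexing, which assigns only to the matching rows. The data frame below rebuilds the question's data by hand so the example is self-contained, and the name `rows` is an illustrative assumption.

```r
# Rebuild the question's data as character columns.
df <- data.frame(
  HOME_VAL = c("$9999", "$100", "", "$2000", "", "$60000", "$40000"),
  JOB = c("Student", "Home Maker", "Student", "Home Maker",
          "Student", "Student", "Professor"),
  stringsAsFactors = FALSE
)

# Logical vector marking rows where HOME_VAL is empty and JOB is "Student".
rows <- df$HOME_VAL == "" & df$JOB == "Student"

# Assign "0" only to those rows; all other values are left untouched.
df$HOME_VAL[rows] <- "0"

print(df)
```

Because the comparison is vectorized, no if statement or loop is needed. This is also why an `if (...)` test on whole columns does not do what the question intends: `if` expects a single TRUE or FALSE value.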

Sample data

df <- read.table(text =
   "HOME_VAL  JOB
$9999     Student
$100     'Home Maker'
''          Student
$2000    'Home Maker'
''          Student
$60000   Student
$40000   Professor", header = TRUE)