Modify column values based on other columns in R

Time: 2017-09-21 17:21:32

Tags: r dataframe data.table

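I have a CSV table (loaded as a data frame). I want to modify the values in certain columns based on the values in another column.

I prepared some code, but it does not work. The data frame has 1076 rows and 156 columns.

The rule must be as follows: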

if (a[i, "0Q-state"] == "done" && is.na(a[i, "0Q-01"])) a[i, "0Q-01"] = 0
     else a[i, "0Q-01"] = a[i, "0Q-01"]

But I don't know how to do this in R.

 >dataset4
       0Q-state   0Q-01 0Q-02 0Q-03 0Q-04 0Q-05 0Q-06 0Q-07 0Q-08 0Q-09
   1:  done        1     1     1     1     1     1     1     1     NA
   2:              1     1     1     1     1     1     NA    1     1
   3:  done        1     1     1     NA    1     1     1     1     1
   5:  done        1     1     1     1     0     0     0     1     0
   6:  done        1     1     1     1     0     0     0     1     0
   7:              1     1     NA    1     0     0     0     1     0
   8:  done        1     1     1     1     0     0     0     1     0


   sapply(c("0Q-01","0Q-02","0Q-03","0Q-04","0Q-05","0Q-06","0Q-07","0Q-08","0Q-09"), 
   function(y) {
   dataset4[,y] <- sapply(c(1:1076), function(x) 
   ifelse (((is.na(dataset4[x,y])) && (dataset4[x,c("0Q-state")] == "done")) 
   ,0, dataset4[x,y]))}
    )

The output must be:

     >dataset4
       0Q-state   0Q-01 0Q-02 0Q-03 0Q-04 0Q-05 0Q-06 0Q-07 0Q-08 0Q-09
   1:  done        1     1     1     1     1     1     1     1     0
   2:              1     1     1     1     1     1     NA    1     1
   3:  done        1     1     1     0     1     1     1     1     1
   5:  done        1     1     1     1     0     0     0     1     0
   6:  done        1     1     1     1     0     0     0     1     0
   7:              1     1     NA    1     0     0     0     1     0
   8:  done        1     1     1     1     0     0     0     1     0

2 Answers:

Answer 0: (score: 1)

We can try:

df[rep(df[, 1] == "done", ncol(df)) & is.na(df)] <- 0
df

1     done        1     1     1     1     1     1     1     1     0
2                 1     1     1     1     1     1     NA    1     1
3     done        1     1     1     0     1     1     1     1     1
4     done        1     1     1     1     0     0     0     1     0
5     done        1     1     1     1     0     0     0     1     0
6                 1     1     NA    1     0     0     0     1     0
7     done        1     1     1     1     0     0     0     1     0
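
To see why the one-liner works, here is a minimal, self-contained sketch. The toy data frame is an assumption modelled on the question's printout (two dummy columns only), and the helper names `is_done` and `mask` are introduced purely for illustration.

```r
# Toy data (an assumption, modelled on the question's printout).
# check.names = FALSE keeps the non-syntactic column names intact.
df <- data.frame(
  `0Q-state` = c("done", "", "done"),
  `0Q-01`    = c(1, 1, NA),
  `0Q-02`    = c(NA, NA, 1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Row condition, repeated once per column so it lines up
# (in column-major order) with every cell of the data frame
is_done <- rep(df[, "0Q-state"] == "done", ncol(df))

# is.na(df) is a logical matrix; `&` keeps only NA cells in "done" rows
mask <- is_done & is.na(df)

# Assign 0 to exactly those cells; NA cells in other rows stay NA
df[mask] <- 0
df
```

The state column itself is never touched here because it contains no NA, so its cells in `mask` are all FALSE.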

Or using sapply():

myFunc <- function(x, y) ifelse(is.na(x) & y == "done", 0, x)
data.frame(df[, 1], sapply(df[, -1], myFunc, y = df[, 1]))

1 done  1  1  1  1  1  1  1  1   0
2       1  1  1  1  1  1 NA  1   1
3 done  1  1  1  0  1  1  1  1   1
4 done  1  1  1  1  0  0  0  1   0
5 done  1  1  1  1  0  0  0  1   0
6       1  1 NA  1  0  0  0  1   0
7 done  1  1  1  1  0  0  0  1   0

You can always replace df[, 1] with df[, "0Q-state"], and df[, -1] with df[, namesOfDummyVars].
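
A hedged sketch of that name-based variant follows. The toy data and the helper name `fill_done` are assumptions for illustration; `namesOfDummyVars` is simply built from the column names here. Assigning the result back with `df[namesOfDummyVars] <- lapply(...)` is a small departure from the `data.frame(..., sapply(...))` form above, chosen because it keeps the original column names.

```r
# Toy data (an assumption, modelled on the question's printout)
df <- data.frame(
  `0Q-state` = c("done", "", "done"),
  `0Q-01`    = c(1, 1, NA),
  `0Q-02`    = c(NA, NA, 1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column except the state column is treated as a dummy variable
namesOfDummyVars <- setdiff(names(df), "0Q-state")

# Replace NA by 0 only where the matching state is "done"
fill_done <- function(x, state) ifelse(is.na(x) & state == "done", 0, x)

# Update the selected columns in place, preserving their names
df[namesOfDummyVars] <- lapply(df[namesOfDummyVars], fill_done,
                               state = df[["0Q-state"]])
df
```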

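Answer 1: (score: 0)

The question is tagged with data.table, and the printed output of dataset4 suggests that dataset4 is already a data.table object.

Here are three variants in data.table syntax for replacing the NAs in rows marked "done".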

# create vector of names of columns to be changed 
cols <- sprintf("0Q-%02i", 1:9)

# variant 1
dataset4[`0Q-state` == "done", 
         (cols) := lapply(.SD, function(x) replace(x, is.na(x), 0L)), 
         .SDcols = cols][]
   0Q-state 0Q-01 0Q-02 0Q-03 0Q-04 0Q-05 0Q-06 0Q-07 0Q-08 0Q-09
1:     done     1     1     1     1     1     1     1     1     0
2:              1     1     1     1     1    NA     1     1    NA
3:     done     1     1     1     0     1     1     1     1     1
4:     done     1     1     1     1     0     0     0     1     0
5:     done     1     1     1     1     0     0     0     1     0
6:              1    NA     1     0     0     0     1     0    NA
7:     done     1     1     1     1     0     0     0     1     0

# variant 2
lapply(cols, function(i) dataset4[`0Q-state` == "done" & is.na(get(i)), (i) := 0L])
dataset4

This returns the same result as above.

# variant 3  --- data.table development version 1.10.5
for (i in cols) 
  set(dataset4, which(dataset4[, "0Q-state"] == "done" & is.na(dataset4[, ..i])), i, 0L)
dataset4
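
For readers who cannot use the `..i` prefix, the same loop can be sketched with `[[` column extraction instead. This is a minimal illustration on toy data (an assumption, not the asker's real table), and the loop variables `col` and `rows` are names introduced here.

```r
library(data.table)

# Toy data (an assumption, modelled on the question's printout)
dataset4 <- data.table(
  `0Q-state` = c("done", "", "done"),
  `0Q-01`    = c(1L, 1L, NA),
  `0Q-02`    = c(NA, NA, 1L)
)

# Columns to be changed: everything except the state column
cols <- setdiff(names(dataset4), "0Q-state")

for (col in cols) {
  # Row numbers where the state is "done" and the value is missing
  rows <- which(dataset4[["0Q-state"]] == "done" & is.na(dataset4[[col]]))
  # set() updates by reference; an empty `rows` changes nothing
  set(dataset4, i = rows, j = col, value = 0L)
}
dataset4
```

Because `set()` modifies the table by reference, no reassignment of `dataset4` is needed after the loop.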