How do I replace multiple values in a data frame based on a condition in a specific column?

Time: 2015-09-26 00:23:56

Tags: r loops if-statement matrix indexing

Here is my dataset:

> data<-read.csv(file.choose())
> data$MaxDate<-as.character(data$MaxDate)
> data$Batch<-gsub(" ", "\\.",data$Batch)
> p<-data[1:5,]
> p
           ManagerName     Employee.Name Employee.ID   MaxDate    Batch
1 Abarrientos,  Claire    Vinnikov, Olga       32403 8/11/2015 Batch.47
2         Adel, Bonnie      Adams, Tracy      201850                   
3         Adel, Bonnie    Black, Chantal      213746 7/29/2011 Batch.17
4         Adel, Bonnie  Brandoli, Morena      201990 7/29/2011 Batch.17
5         Adel, Bonnie Campbell, Melissa      201931                   
  X.New.Employee.EHS.Document.Sign.off Batch.01 Batch.02 Batch.03 Batch.04
1                                    1       NA       NA       NA       NA
2                                   NA        1        1        1        1
3                                    1        1        1        1        1
4                                    1        1        1        1        1
5                                   NA        1        1        1        1
  Batch.06 Batch.07 Batch.08 Batch.09 Batch.10 Batch.11 Batch.18 Batch.19
1       NA       NA       NA       NA       NA       NA       NA       NA
2        1        1        1        1        1        1        1        1
3        1        1        1        1        1        1        1        1
4        1        1        1        1        1        1        1        1
5        1        1        1        1        1        1        1        1
  Batch.20 Batch.22 Batch.24 Batch.25 Batch.26 Batch.27 Batch.28 Batch.29
1       NA       NA       NA       NA       NA       NA       NA       NA
2        1        1        1        1        1        1        1        1
3        1        1        1        1        1        1        1        1
4        1        1        1        1        1        1        1        1
5        1        1        1        1        1        1        1        1
  Batch.30 Batch.31 Batch.32 Batch.33 Batch.34 Batch.35 Batch.36 Batch.37
1       NA       NA       NA       NA       NA       NA       NA       NA
2        1        1        1        1        1        1        1        1
3        1        1        1        1        1        1        1        1
4        1        1        1        1        1        1        1        1
5        1        1        1        1        1        1        1        1
  Batch.38 Batch.39 Batch.40 Batch.41 Batch.42 Batch.43 Batch.44 Batch.45
1       NA       NA       NA       NA       NA       NA       NA       NA
2        1        1        1        1        1        1        1        1
3        1        1        1        1        1        1        1        1
4        1        1        1        1        1        1        1        1
5        1       NA       NA       NA       NA       NA       NA       NA
  Batch.46 Batch.47
1       NA       NA
2        1        1
3       NA       NA
4        1        1
5       NA       NA

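I'm a new R user and I'm trying to work out how to set the batch columns to 1 up to a given batch number. For example, the batch number for the first row is "Batch.47", so I want to replace every value in the columns "Batch.01", "Batch.02", "Batch.03", and so on up to "Batch.47" with 1. However, I only want to do this for rows that have a 1 in the New.Employee.Sign.Off column (X.New.Employee.EHS.Document.Sign.off). The second row has no corresponding batch number, because "Adams, Tracy" has NA under employee sign-off, so I want that row to stay unchanged. Keep in mind that not every batch number is present as a column; for example, there are no batches 13 to 17.

Here is my code so far: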

for (i in 1:nrow(p)) {
  if (p$X.New.Employee.EHS.Document.Sign.off[i] == 1) {
    k<-which(colnames(p)==p$Batch[i])
    p[i,]<-replace(p[i,],6:k[i],1)
    i=i+1
  }
  else if (is.na(p$X.New.Employee.EHS.Document.Sign.off[i])) {
    i=i+1
  }
}

This produces the following error:

Error in if (p$X.New.Employee.EHS.Document.Sign.off[i] == 1) { : 
  missing value where TRUE/FALSE needed

Any guidance is greatly appreciated. Thank you very much!

Here is the structure of the dataset:

> str(data)
'data.frame':   3372 obs. of  44 variables:
 $ ManagerName                         : Factor w/ 209 levels "Abarrientos,  Claire",..: 1 2 2 2 2 2 2 2 2 2 ...
 $ Employee.Name                       : Factor w/ 3371 levels "Abas, Ma Cecilia",..: 3155 14 304 346 455 648 850 934 1021 1089 ...
 $ Employee.ID                         : Factor w/ 3368 levels "(blank)","0",..: 3257 278 2025 359 325 3092 1695 2075 1043 1196 ...
 $ MaxDate                             : chr  "8/11/2015" "" "7/29/2011" "7/29/2011" ...
 $ Batch                               : chr  "Batch.47" "" "Batch.17" "Batch.17" ...
 $ X.New.Employee.EHS.Document.Sign.off: int  1 NA 1 1 NA 1 1 NA NA 1 ...
 $ Batch.01                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.02                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.03                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.04                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.06                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.07                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.08                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.09                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.10                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.11                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.18                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.19                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.20                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.22                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.24                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.25                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.26                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.27                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.28                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.29                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.30                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.31                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.32                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.33                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.34                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.35                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.36                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.37                            : int  NA 1 1 1 1 NA 1 1 1 1 ...
 $ Batch.38                            : int  NA 1 1 1 1 NA NA 1 1 1 ...
 $ Batch.39                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.40                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.41                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.42                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.43                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.44                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.45                            : int  NA 1 1 1 NA NA 1 1 1 1 ...
 $ Batch.46                            : int  NA 1 NA 1 NA NA NA NA NA 1 ...
 $ Batch.47                            : int  NA 1 NA 1 NA 1 1 1 NA NA ...
> 

1 answer:

Answer 0: (score: 1)

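# Names of the per-batch indicator columns (Batch.01, Batch.02, ...)
clnames <- colnames(p)
Batchvec <- clnames[grep("^Batch\\.", clnames)]
# For every batch column: 1 where the sign-off flag is 1, 0 otherwise (NA stays NA)
newp <- apply(p[, Batchvec], 2,
  function(x) ifelse(p$X.New.Employee.EHS.Document.Sign.off == 1, 1, 0)
)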

Then bind newp to the columns in p that do not start with "Batch." and so on ... as in ...

cbind(p[,"employee.sign"], newp)
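
For reference, the error in the question comes from the comparison itself: `NA == 1` evaluates to `NA`, and `if` stops with "missing value where TRUE/FALSE needed" when its condition is `NA`. Two further details of the loop are worth noting: reassigning `i` inside a `for` loop does not affect which rows are visited, and `k` holds at most one element, so `k[i]` is `NA` for any `i` above 1. Below is a minimal sketch of the row-wise fill the question describes, under some assumptions: the data frame is a made-up toy with the same layout (not the asker's data), and batch numbers are compared numerically so that a batch with no column of its own, such as "Batch.03" here, still works.

```r
# Toy data with the same layout as the question (values are invented).
p <- data.frame(
  Batch = c("Batch.03", "", "Batch.02", "Batch.04"),
  X.New.Employee.EHS.Document.Sign.off = c(1L, NA, 1L, NA),
  Batch.01 = c(NA, 1L, NA, 1L),
  Batch.02 = c(NA, 1L, 1L, NA),
  Batch.04 = c(NA, 1L, NA, NA),
  stringsAsFactors = FALSE
)

# Indicator columns look like "Batch.<digits>"; the plain "Batch" column is excluded.
batch_cols <- grep("^Batch\\.[0-9]+$", names(p), value = TRUE)
col_num <- as.integer(sub("^Batch\\.", "", batch_cols))

# Batch number of each row; an empty Batch string becomes NA.
row_num <- as.integer(sub("^Batch\\.", "", p$Batch))

# NA-safe test: TRUE only where the sign-off flag is present and equal to 1.
signed <- !is.na(p$X.New.Employee.EHS.Document.Sign.off) &
  p$X.New.Employee.EHS.Document.Sign.off == 1

for (i in which(signed & !is.na(row_num))) {
  # Every existing batch column numbered up to and including the row's batch.
  cols <- batch_cols[col_num <= row_num[i]]
  if (length(cols) > 0) {
    p[i, cols] <- 1L
  }
}

print(p)
```

Rows whose sign-off is `NA` are never visited, so they stay exactly as they were. Wrapping the comparison in `!is.na(...) & ...` is one option; `isTRUE(x == 1)` inside an `if` would serve the same purpose for a single value.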