R: applying a function to multiple columns

Asked: 2013-07-25 19:17:59

Tags: r nas sapply

I have a data.frame with multiple columns. Each cell holds either 1 number (X0, rows 1-5) or 2 numbers (X0, row 6).

> head(score)
    X0   X1   X2   X3   X4
1    8 <NA> <NA> <NA> <NA>
2    3 <NA> <NA> <NA> <NA>
3 <NA>    6    6 <NA> <NA>
4    6 <NA> <NA> <NA> <NA>
5    8 <NA> <NA> <NA> <NA>
6  3 4 <NA> <NA> <NA> <NA> <--- Note X0 has 2 numbers (3, 4) as characters

Split each XN column and create a YN column that is the sum of the split XN  
> score$Y0 <- sapply(strsplit(as.character(score$X0), split = " "), function(x) as.numeric(x[1]) + as.numeric(x[2]))

Where XN had no split value (i.e. it was only 1 number), replace YN with XN
> score$Y0 <- as.numeric(with(score, ifelse(is.na(Y0), as.character(X0), Y0)))

So the final variable YN (Y0) will be either X0, or the sum of X0 splits
> head(score)
    X0   X1   X2   X3   X4   Y0
1    8 <NA> <NA> <NA> <NA>    8
2    3 <NA> <NA> <NA> <NA>    3
3 <NA>    6    6 <NA> <NA> <NA>
4    6 <NA> <NA> <NA> <NA>    6
5    8 <NA> <NA> <NA> <NA>    8
6  3 4 <NA> <NA> <NA> <NA>    7 <- sum of X0 numbers (3,4)
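
For anyone who wants to reproduce the two manual steps, here is a minimal self-contained sketch. It rebuilds only the X0 column from the printout above and assumes the values are stored as character strings (not factors); the helper names `parts` and `single` are made up for illustration.

```r
# Rebuild X0 from the head(score) output; assumed to be character, not factor
score <- data.frame(X0 = c("8", "3", NA, "6", "8", "3 4"),
                    stringsAsFactors = FALSE)

# Step 1: split on spaces and add the first two pieces.
# Cells with a single number give NA here because x[2] is missing.
parts <- strsplit(score$X0, split = " ")
score$Y0 <- sapply(parts, function(x) as.numeric(x[1]) + as.numeric(x[2]))

# Step 2: where there was nothing to add, fall back to the original value.
# Only the single-number cells are converted, so no coercion warning is raised.
single <- is.na(score$Y0) & !is.na(score$X0)
score$Y0[single] <- as.numeric(score$X0[single])

score$Y0
```

Restricting the conversion to the `single` rows keeps "3 4" away from `as.numeric()`, which would otherwise turn it into NA with a warning.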

I can do this manually, but when I try to wrap it in a loop to handle Y0:X0, Y1:X1, Y2:X2, etc., I get the warning "NAs introduced by coercion".

for (i in 0:4) {
yvar = paste("score$Y",i,sep="")
xvar = paste("score$X",i,sep="")
yvar <- sapply(strsplit(xvar,split=" "), function(x) as.numeric(x[1]) + as.numeric(x[2]))
yvar <- with(score, ifelse(is.na(yvar), xvar, yvar))
}

Warning messages:
1:  In FUN(X[[1L]], ...) : NAs introduced by coercion
2:  In FUN(X[[1L]], ...) : NAs introduced by coercion
3:  In FUN(X[[1L]], ...) : NAs introduced by coercion
4:  In FUN(X[[1L]], ...) : NAs introduced by coercion
5:  In FUN(X[[1L]], ...) : NAs introduced by coercion

I have tried many different approaches. It works if I do the columns one at a time, but I can't get it to work as part of a function.

2 answers:

Answer 0 (score: 1)

Figured it out, thanks to Ferdinand:

> head(score) # Original Data
    X1   X2   X0   X3   X4
1  5 1  6 1 <NA> <NA> <NA>
2    1  2 4 <NA> <NA> <NA>
3 <NA> <NA>    6 <NA> <NA>
4 <NA> <NA>    4 <NA> <NA>
5 <NA> <NA>  4 3 <NA> <NA>
6    1  2 4 <NA> <NA> <NA>

> nvars <- length(grep("^X\\d$", names(score))) - 1 # Number of XN variables, minus 1 (columns are numbered from 0)
> nvars
[1] 4

# For each variable, split and sum the resulting numbers
> for (i in 0:nvars) {
+ score[,paste0("Y",i)] <- sapply(strsplit(as.character(score[,paste0("X",i)]), split = " "), function(x) sum(as.numeric(x)))
+ }
# Final Data
> head(score)
    X1   X2   X0   X3   X4 Y0 Y1 Y2 Y3 Y4
1  5 1  6 1 <NA> <NA> <NA> NA  6  7 NA NA
2    1  2 4 <NA> <NA> <NA> NA  1  6 NA NA
3 <NA> <NA>    6 <NA> <NA>  6 NA NA NA NA
4 <NA> <NA>    4 <NA> <NA>  4 NA NA NA NA
5 <NA> <NA>  4 3 <NA> <NA>  7 NA NA NA NA
6    1  2 4 <NA> <NA> <NA> NA  1  6 NA NA
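
The same idea can probably be written without counting columns or looping over indices. The following is only a sketch under a few assumptions: the data frame below is a cut-down copy of the first three rows and three columns above, stored as character, and `split_sum`, `xcols` and `ycols` are illustrative names rather than anything from the original post.

```r
# Cut-down copy of the data above (first three rows, three columns), as character
score <- data.frame(
  X1 = c("5 1", "1", NA),
  X2 = c("6 1", "2 4", NA),
  X0 = c(NA, NA, "6"),
  stringsAsFactors = FALSE
)

# Split each cell on spaces and sum the pieces; NA cells stay NA
split_sum <- function(x) {
  sapply(strsplit(as.character(x), split = " ", fixed = TRUE),
         function(p) sum(as.numeric(p)))
}

# Pick the XN columns by name and derive the matching YN names
xcols <- grep("^X\\d+$", names(score), value = TRUE)
ycols <- sub("^X", "Y", xcols)

# Apply the helper to every XN column and store the results as YN
score[ycols] <- lapply(score[xcols], split_sum)
score
```

Selecting columns by name means the order of the XN columns in the data frame does not matter, and each YN column is paired with the XN column of the same number.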

Answer 1 (score: 0)

Use this instead:

yvar = score[,paste0("Y",i)]
xvar = score[,paste0("X",i)]
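
To illustrate why this matters: `paste("score$X", i, sep = "")` only builds the text "score$X0", it does not look up the column, so the loop in the question ends up splitting and converting that piece of text. The sketch below uses a two-row stand-in for `score` (an assumption for illustration) and also writes the result back into the data frame, since assigning to a local `yvar` alone would leave `score` unchanged.

```r
# Two-row stand-in for the real data (assumed character column)
score <- data.frame(X0 = c("8", "3 4"), stringsAsFactors = FALSE)
i <- 0

# What the loop in the question builds: a string, not the column
name_only <- paste("score$X", i, sep = "")
name_only                                   # the text "score$X0"
suppressWarnings(as.numeric(name_only))     # NA, hence the coercion warning

# Indexing by name fetches the actual column values
xvar <- score[[paste0("X", i)]]
xvar

# Compute the sums and assign them back into the data frame
score[[paste0("Y", i)]] <- sapply(strsplit(xvar, split = " "),
                                  function(x) sum(as.numeric(x)))
score
```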