Create a new column based on NA values in other columns

Time: 2016-03-25 17:55:35

Tags: r

I want to create another column based on the NAs in the other columns. Here is an example:

df <- replicate(5,rnorm(4))    
df[1,3:4] <- NA    
df[2:3,1:2] <- NA    
colnames(df)[1:5] <- c("One","Two","Three","Four","Five")   
df
      One   Two Three  Four  Five
[1,] 0.12 -0.38    NA    NA  0.10
[2,]   NA    NA -0.19 -0.14 -1.57
[3,]   NA    NA  1.01  0.22  0.27
[4,] 0.53  0.71 -0.86 -0.33 -1.01

Each column has a fixed, assigned weight:

weightc1 <- 0.1    
weightc2 <- 0.3    
weightc3 <- 0.2    
weightc4 <- 0.35    
weightc5 <- 0.05

I want each NA in a column to count as that column's weight. For example, an NA in column 1 counts as 0.1.

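Then I want to create another column (called Six) equal to the sum of the weights of the NAs in each row. For example, the first row of column Six should be 0.55 (0.2 + 0.35). The last row has no NAs, so it equals 0. The column should look like this: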

df2 <- cbind(df, Six = c("0.55","0.4","0.4","0"))
df2
     One                 Two                  Three                Four                 Five                Six   
[1,] "0.123127305724018" "-0.378163368890999" NA                   NA                   "0.100592613978267" "0.55"
[2,] NA                  NA                   "-0.190601356688205" "-0.136015883223294" "-1.56573577576604" "0.4" 
[3,] NA                  NA                   "1.01441506421936"   "0.220154629517149"  "0.273740027540685" "0.4" 
[4,] "0.529632731861426" "0.709285638700681"  "-0.864741163519668" "-0.327865814162575" "-1.01298096772074" "0" 

I tried ifelse: Six <- ifelse(df$One == NA, "weightc1", ""), which replaced all the numbers in the first column with NA. I know I need to fix this before applying a sum function (or is there a workaround?). Please advise. Thanks!

2 answers:

Answer 0: (score: 1)

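We get the values of all the 'weightc' objects in a list (using mget), convert 'df' to a data.frame, multiply each element of the 'weightc' list by the corresponding column of 'df' (after converting that column to a logical vector with is.na), and use Reduce to get the sum.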

Reduce(`+`,Map(function(x,y) y*is.na(x), 
    as.data.frame(df), mget(ls(pattern='weightc\\d+'))))
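
The same steps can be followed one at a time in the sketch below. It is only an illustration under a few assumptions: the weights are written as a named vector called weights instead of being looked up with mget, and set.seed(1) is added so the example data is reproducible. Neither appears in the question.

```r
# Example data with the same NA pattern as in the question
# (set.seed(1) is an assumption, used only for reproducibility)
set.seed(1)
df <- replicate(5, rnorm(4))
df[1, 3:4] <- NA
df[2:3, 1:2] <- NA
colnames(df) <- c("One", "Two", "Three", "Four", "Five")

# Hypothetical stand-in for the weightc1 ... weightc5 objects
weights <- c(One = 0.1, Two = 0.3, Three = 0.2, Four = 0.35, Five = 0.05)

# One numeric vector per column: the weight where the value is NA, 0 elsewhere
per_column <- Map(function(x, w) w * is.na(x), as.data.frame(df), weights)

# Add the per-column vectors element by element: one total per row
six <- Reduce(`+`, per_column)
six
```

An explicit vector also avoids relying on the order in which ls() returns names. ls() sorts alphabetically, so with ten or more weight objects "weightc10" would come before "weightc2" and the weights would no longer line up with the columns.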

Or we can multiply the logical matrix (is.na(df)) by the 'weightc' list, after unlisting it and replicating it to match the cells of 'df', and then take rowSums:

rowSums(unlist(mget(ls(pattern="weightc\\d+"))[col(df)])*is.na(df))
#[1] 0.55 0.40 0.40 0.00
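
To see what the indexing by col(df) does, here is a small sketch that builds the per-cell weight matrix explicitly. The weights vector and set.seed(1) are assumptions made for the illustration, not part of the answer.

```r
# Example data with the same NA pattern as in the question
# (set.seed(1) is an assumption, used only for reproducibility)
set.seed(1)
df <- replicate(5, rnorm(4))
df[1, 3:4] <- NA
df[2:3, 1:2] <- NA
colnames(df) <- c("One", "Two", "Three", "Four", "Five")

# Hypothetical stand-in for unlist(mget(...))
weights <- c(0.1, 0.3, 0.2, 0.35, 0.05)

# col(df) holds the column index of every cell, so weights[col(df)]
# repeats each weight once per row, in column-major order
weight_per_cell <- matrix(weights[col(df)], nrow = nrow(df))

# Keep the weight only where the cell is NA, then total each row
six <- rowSums(weight_per_cell * is.na(df))
six
```

Every row of weight_per_cell is the full weight vector, so multiplying by is.na(df) zeroes out the cells that hold a value.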

Answer 1: (score: 1)

The result can also be obtained with a matrix-vector product:

weights <- c(0.1,0.3,0.2,0.35,0.05)
df2 <- cbind(df, Six=c(is.na(df) %*% weights))
#            One        Two      Three        Four       Five  Six
#[1,]  1.0103788 0.07835063         NA          NA -1.9312272 0.55
#[2,]         NA         NA  1.4426233 -0.55698776  1.0897613 0.40
#[3,]         NA         NA -0.3756296 -1.18399257  0.6567973 0.40
#[4,] -0.1799107 0.46225181  1.3530630  0.09264794 -0.3004309 0.00
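
As a rough sketch of why this works, and why the ifelse attempt in the question returned only NA: a comparison with NA is itself NA, whereas is.na() returns TRUE or FALSE, and %*% treats those as 1 and 0. The set.seed(1) call and the intermediate names eq_na, is_missing and six are assumptions added for illustration.

```r
# Example data with the same NA pattern as in the question
# (set.seed(1) is an assumption, used only for reproducibility)
set.seed(1)
df <- replicate(5, rnorm(4))
df[1, 3:4] <- NA
df[2:3, 1:2] <- NA
colnames(df) <- c("One", "Two", "Three", "Four", "Five")
weights <- c(0.1, 0.3, 0.2, 0.35, 0.05)

# Comparing with NA never yields TRUE or FALSE, only NA
eq_na <- df[, "One"] == NA

# is.na() is the test that actually detects missing values
is_missing <- is.na(df[, "One"])

# %*% coerces the logical matrix to 0/1, so each entry of the product
# is the sum of the weights at that row's NA positions
six <- drop(is.na(df) %*% weights)

# Binding a numeric vector keeps the whole matrix numeric,
# unlike binding character strings such as "0.55"
df2 <- cbind(df, Six = six)
df2
```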