Replace NAs in a table with the previous non-NA value prefixed with "Unclassified_"

Asked: 2019-10-09 12:45:13

Tags: r

Using R: I need to fill every NA cell with the nearest non-NA value to its left, prefixed with "unclassified_".

The code below fills each NA with the nearest value to its left, but I can't work out how to put the fixed string "Unclassified_" in front of it.

library(zoo)
y <-  t(na.locf(t(x), fromLast=F)) #Fill NA cells with closest value to the left

Sample data

set.seed(1)
x <- data.frame(a=sample(c(1,2), 10, replace=T),
                b=sample(c(1,2,NA), 10, replace=T), 
                c=sample(c(1:5,NA), 10, replace=T))

This gives the data frame:

a   b   c
2   NA  1       
2   2   1       
1   NA  5       
1   1   2       
1   1   NA      
1   NA  4       
1   1   5       
2   2   NA      
2   2   NA      
1   NA  2   

I want:

a   b                c
2   unclassified_2   1
2   2                1
1   unclassified_1   5
1   1                2
1   1                unclassified_1
1   unclassified_1   4
1   1                5
2   2                unclassified_2
2   2                unclassified_2
1   unclassified_1   2

3 answers:

Answer 0 (score: 2)

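# Subset x$a with the same NA mask so each label stays on its own row
# (column c also takes its label from a, which matches the question's table)
x$b[is.na(x$b)] <- paste0("unclassified_", x$a[is.na(x$b)])
x$c[is.na(x$c)] <- paste0("unclassified_", x$a[is.na(x$c)])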
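
When assigning into only the NA positions of a column, the right-hand side has to line up with those positions. paste0() applied to a whole column returns one string per row, so without a matching subset R would take the first few strings (with a warning) rather than the ones from the NA rows. A minimal sketch on a toy data frame; `df`, `na_b`, `full` and `masked` are illustrative names, not part of the question:

```r
# Toy data (hypothetical, not the question's sample)
df <- data.frame(a = c(1, 2, 2, 1), b = c(NA, 2, 2, NA))

na_b <- is.na(df$b)   # TRUE for rows 1 and 4

# One string per row: 4 values, but only 2 slots need filling
full <- paste0("unclassified_", df$a)

# What an unmasked assignment would pick up: the first 2 values (rows 1 and 2)
misaligned <- head(full, sum(na_b))

# Masked: the values that belong to the NA rows (rows 1 and 4)
masked <- paste0("unclassified_", df$a[na_b])

df$b[na_b] <- masked
print(df)
```

Here `misaligned` and `masked` differ in their second element, which is the row-alignment problem the mask avoids.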

Answer 1 (score: 1)

You can use paste0 and indexing. This loop does it:

# Start at column 2: the first column has no left neighbour to copy from
for (i in 2:ncol(x)) {
  na_rows <- is.na(x[, i])
  if (any(na_rows)) {
    x[na_rows, i] <- paste0("unclassified_", x[na_rows, i - 1])
  }
}
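
One case a column-by-column loop like this does not cover is a row with two NAs side by side: the second NA copies a left neighbour that has already been filled, which would give "unclassified_unclassified_1". That does not occur in the table shown in the question, but it can with other data. A possible sketch that carries the last real value forward instead, in the spirit of na.locf; `fill_row` and the toy `df` are hypothetical names introduced for illustration:

```r
# Hypothetical helper: fill NAs in one row with the prefix plus the
# last non-NA value seen to the left (leading NAs are left as NA)
fill_row <- function(row, prefix = "unclassified_") {
  out <- as.character(row)
  last <- NA_character_
  for (k in seq_along(out)) {
    if (is.na(out[k])) {
      if (!is.na(last)) out[k] <- paste0(prefix, last)
    } else {
      last <- out[k]
    }
  }
  out
}

# Toy data with consecutive NAs in row 1
df <- data.frame(a = c(1, 2, 3), b = c(NA, 2, NA), c = c(NA, NA, 4))

# apply() returns one column per input row, so transpose back
filled <- as.data.frame(t(apply(df, 1, fill_row)), stringsAsFactors = FALSE)
names(filled) <- names(df)
print(filled)
```

All columns come back as character, which is unavoidable once labels and numbers share a column.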

Answer 2 (score: 0)

One way to solve this in base R:

for (j in 2:3) {
  for (i in 1:length(x$a)) {
    if (is.na(x[i, j])) {
      x[i, j] <- paste0("unclassified_", x[i, j - 1])
    }
  }
}

Result:

> x
   a              b              c
1  1              1 unclassified_1
2  1              1              2
3  2 unclassified_2              4
4  2              2              1
5  1 unclassified_1              2
6  2              2              3
7  2 unclassified_2              1
8  2 unclassified_2              3
9  2              2 unclassified_2
10 1 unclassified_1              3

Data:

set.seed(1)
x <- data.frame(a=sample(c(1,2), 10, replace=T),
                b=sample(c(1,2,NA), 10, replace=T), 
                c=sample(c(1:5,NA), 10, replace=T))
  

Note that if you generate the input data by sampling, the seed matters for reproducing the desired result.
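
Since the sampled values depend on the seed (and the default sampling method changed in R 3.6.0, so the same seed can give different draws across versions), one way to sidestep the issue is to type the question's table in literally. The sketch below does that and then applies a column-wise fill; it assumes, as in that table, that column a has no NAs and that no row has two NAs side by side:

```r
# The question's table typed in literally, so no seed is involved
x <- data.frame(
  a = c(2, 2, 1, 1, 1, 1, 1, 2, 2, 1),
  b = c(NA, 2, NA, 1, 1, NA, 1, 2, 2, NA),
  c = c(1, 1, 5, 2, NA, 4, 5, NA, NA, 2)
)

# Fill each NA with "unclassified_" plus the value one column to the left
for (j in 2:ncol(x)) {
  na_rows <- is.na(x[[j]])
  x[[j]][na_rows] <- paste0("unclassified_", x[[j - 1]][na_rows])
}
print(x)
```

The filled columns b and c end up as character vectors, matching the layout the question asks for.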