I have a df with three columns. Each column holds either a string or NA, and each row contains exactly one string. Like this example:
df <- data.frame(a=c("NA","NA","NA","NA","fruits","fruits","fruits","fruits","fruits","fruits"),
b=c("NA","NA","veggies","veggies","NA","NA","NA","NA","NA","NA"),
c=c("nuts","nuts","NA","NA","NA","NA","NA","NA","NA","NA") )
I want to merge all three columns to get this result:
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
With numeric values I would use aggregate with na.rm=TRUE. However, I don't know how to do this with strings. Any ideas? Thanks
Answer 0 (score: 1)
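After converting the string "NA" to a real NA, we can use max.col. We use max.col to get the column index of the non-NA entry in each row, pair it with the row index, extract the values, and then convert the result to a data.frame.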
is.na(df) <- df=='NA'
data.frame(var=df[cbind(1:nrow(df),max.col(!is.na(df)))])
# var
#1 nuts
#2 nuts
#3 veggies
#4 veggies
#5 fruits
#6 fruits
#7 fruits
#8 fruits
#9 fruits
#10 fruits
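The indexing step can be unpacked as follows. This is a sketch against the question's example data; the intermediate names (present, col_idx, idx) are illustrative and not part of the original answer. stringsAsFactors = FALSE is added so the columns stay character on older R versions, and ties.method = "first" is added only to make the call deterministic, which should not matter when each row has exactly one non-NA value.

```r
# Example data from the question; "NA" is a literal string here
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

# Turn the "NA" strings into real missing values
df[df == "NA"] <- NA

# Logical matrix: TRUE where a value is present
present <- !is.na(df)

# For each row, the column position of the TRUE entry
col_idx <- max.col(present, ties.method = "first")

# Two-column (row, column) index matrix, one line per row of df
idx <- cbind(seq_len(nrow(df)), col_idx)

# Matrix indexing pulls out one value per row
result <- data.frame(var = df[idx], stringsAsFactors = FALSE)
```

If a row could hold more than one non-NA value, ties.method decides which column wins, so the choice would then need to be made deliberately.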
Another option is
data.frame(var= df[cbind(1:nrow(df),(+!is.na(df)) %*% seq_along(df))])
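The matrix product in that one-liner can be read step by step. The sketch below uses the question's example data with illustrative variable names (indicator, col_idx) that are not in the original answer. It relies on the assumption that every row has exactly one non-NA value: with two present values the product would return the sum of their column positions, which is not a valid index.

```r
# Example data from the question; "NA" is a literal string here
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)
df[df == "NA"] <- NA

# Unary + turns the logical matrix into a 0/1 matrix
indicator <- +!is.na(df)

# Multiplying by 1:3 leaves, per row, the position of the single 1
col_idx <- indicator %*% seq_along(df)

# Same (row, column) matrix indexing as in the max.col version
result <- data.frame(
  var = df[cbind(seq_len(nrow(df)), col_idx)],
  stringsAsFactors = FALSE
)
```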
Answer 1 (score: 0)
To flesh out the idea offered in the comments, you could do this:
data.frame(var = apply(df, 1, function(x) paste(gsub("NA", "", x), collapse = "")) )
var
1 nuts
2 nuts
3 veggies
4 veggies
5 fruits
6 fruits
7 fruits
8 fruits
9 fruits
10 fruits
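One thing to watch with the gsub approach: the pattern "NA" is unanchored, so it also removes those two letters from inside a real value. A possible variant, sketched here on the question's data, drops whole cells equal to "NA" with an exact comparison instead. The name combined is illustrative, and stringsAsFactors = FALSE is an addition to keep the columns as character.

```r
# Example data from the question; "NA" is a literal string here
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

# The unanchored pattern also eats "NA" inside a genuine value
gsub("NA", "", "BANANA")
# [1] "BA"

# Exact comparison: keep only the cells that are not the string "NA"
combined <- apply(df, 1, function(x) paste(x[x != "NA"], collapse = ""))

result <- data.frame(var = combined, stringsAsFactors = FALSE)
```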
Answer 2 (score: 0)
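Your actual data may determine whether this is better or worse than the row-wise approaches. Here is one way to get printed output like the one you specified: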
> as.matrix( df[df!="NA"] )
Or perhaps better:
> cat( paste( "\n", df[ df!="NA" ] ) )
fruits
fruits
fruits
fruits
fruits
fruits
veggies
veggies
nuts
nuts
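Note that the output above comes out in column order (all of a, then b, then c), not in the row order shown in the question. That is because logical indexing walks the data column by column. If row order matters, one possible adjustment is to transpose first, as in this sketch on the question's data (by_column, m, and by_row are illustrative names):

```r
# Example data from the question; "NA" is a literal string here
df <- data.frame(
  a = c(rep("NA", 4), rep("fruits", 6)),
  b = c("NA", "NA", "veggies", "veggies", rep("NA", 6)),
  c = c("nuts", "nuts", rep("NA", 8)),
  stringsAsFactors = FALSE
)

# Column-major extraction: fruits first, nuts last
by_column <- df[df != "NA"]

# Transposing makes each original row a column, so the same
# extraction now follows the original row order
m <- t(df)
by_row <- m[m != "NA"]
```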