R: constructing a column that summarizes row values

Time: 2015-06-16 18:07:11

Tags: r dataframe unique

I want to create an array that summarizes the rows of a data frame, holding the unique value found in each row.

Take the following sample code:

ref <- c(1:8)

data1 <- c("A","","C","","","","A","")
data2 <- c("A","","","A","C","","","")
data3 <- c("","B","","","","","","B")
data4 <- c("A","B","","","","D","A","")

initial.data <- data.frame(ref, data1, data2, data3, data4)

I can get what I want with:

summary.data <- paste(initial.data[,2], initial.data[,3], 
                  initial.data[,4], initial.data[,5], sep='') 

desired.data <- substring(summary.data,1,1)

However, I would like a more parsimonious way of coding this, and one that does not assume each row can take only one value.

1 answer:

Answer 0: (score: 0)

You could try

 apply(initial.data[-1],1, function(x) unique(x[x!='']))
 #[1] "A" "B" "C" "A" "C" "D" "A" "B"
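
Note that `apply` returns a plain character vector here only because every row happens to contain exactly one distinct non-empty value; if a row held several, the result would be a list. A minimal sketch of one way to handle that case is to collapse each row's unique values into a single string. The modified `data3` (row 1 set to "B") and the "," separator are assumptions made for illustration, not part of the original data.

```r
# Question's data, with data3[1] changed to "B" (an assumed variation)
# so that row 1 contains two distinct values.
initial.data <- data.frame(
  ref   = 1:8,
  data1 = c("A", "", "C", "", "", "", "A", ""),
  data2 = c("A", "", "", "A", "C", "", "", ""),
  data3 = c("B", "B", "", "", "", "", "", "B"),
  data4 = c("A", "B", "", "", "", "D", "A", ""),
  stringsAsFactors = FALSE
)

# Drop empty strings, keep unique values, and join them into one string per row.
summary.data <- apply(initial.data[-1], 1,
                      function(x) paste(unique(x[x != ""]), collapse = ","))
summary.data
#[1] "A,B" "B"   "C"   "A"   "C"   "D"   "A"   "B"
```

Because `paste(..., collapse = ",")` always yields a single string, the result stays a character vector regardless of how many distinct values a row contains.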

Or

 substr(do.call(paste0, initial.data[-1]),1,1)
 #[1] "A" "B" "C" "A" "C" "D" "A" "B"

Or using max.col

 initial.data[cbind(1:nrow(initial.data),max.col(initial.data[-1]!='')+1)]
 #[1] "A" "B" "C" "A" "C" "D" "A" "B"
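
One caveat with the `max.col` approach: by default it breaks ties at random, which is harmless here because all non-empty values within a row agree. If reproducible behaviour is preferred, a sketch along the following lines uses `ties.method = "first"` so that the leftmost non-empty column is always picked. The variable name `idx` is introduced only for illustration.

```r
# Same data as in the question.
initial.data <- data.frame(
  ref   = 1:8,
  data1 = c("A", "", "C", "", "", "", "A", ""),
  data2 = c("A", "", "", "A", "C", "", "", ""),
  data3 = c("", "B", "", "", "", "", "", "B"),
  data4 = c("A", "B", "", "", "", "D", "A", ""),
  stringsAsFactors = FALSE
)

# For each row, the index of the first non-empty data column;
# +1 skips over the 'ref' column.
idx <- max.col(initial.data[-1] != "", ties.method = "first") + 1

# Matrix indexing: one (row, column) pair per row.
first.value <- initial.data[cbind(seq_len(nrow(initial.data)), idx)]
first.value
#[1] "A" "B" "C" "A" "C" "D" "A" "B"
```

A row with no non-empty value at all would be a tie across every column, so this version would return the empty string from `data1` for that row.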