I went through the following links, but they only partially solved my problem.
merge multiple TRUE/FALSE columns into one
Combining a matrix of TRUE/FALSE into one
R: Converting multiple boolean columns to single factor column
I have a data frame like this:
dat <- data.frame(Id = c(1,2,3,4,5,6,7,8),
A = c('Y','N','N','N','N','N','N','N'),
B = c('N','Y','N','N','N','N','Y','N'),
C = c('N','N','Y','N','N','Y','N','N'),
D = c('N','N','N','Y','N','Y','N','N'),
E = c('N','N','N','N','Y','N','Y','N')
)
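I want to reshape my df into a single column, but when a row contains two "Y" values a priority has to be applied.
The priority is A > B > C > D > E, which means that if there is a "Y" in A, the resulting value should be A. Similarly, in the example above, both C and D have "Y" for Id 6, but the result should be "C". The output should therefore look like this: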
resultant_dat <- data.frame(Id = c(1,2,3,4,5,6,7,8),
Result = c('A','B','C','D','E','C','B','NA')
)
I tried this:
library(reshape2)
new_df <- melt(dat, "Id", variable.name = "Result")
new_df <-new_df[new_df$value == "Y", c("Id", "Result")]
But the problem is that this does not handle the priority: it creates 2 rows for the same Id.
Answer 0 (score: 2)
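# Columns listed from highest to lowest priority
# (col_order was not defined in the original snippet; this value follows the question)
col_order = c("A", "B", "C", "D", "E")
tmp = data.frame(ID = dat[,1],
# For each row, take the first column in priority order that holds "Y"
Result = col_order[apply(
X = dat[col_order],
MARGIN = 1,
FUN = function(x) which(x == "Y")[1])],
stringsAsFactors = FALSE)
# Rows without any "Y" come back as NA; label them explicitly
tmp$Result[is.na(tmp$Result)] = "Not Present"
tmp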
# ID Result
#1 1 A
#2 2 B
#3 3 C
#4 4 D
#5 5 E
#6 6 C
#7 7 B
#8 8 Not Present
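As a possible alternative to the row-wise apply() above, the same priority rule can be sketched with base R's max.col(), which returns the position of the maximum in each row and, with ties.method = "first", picks the leftmost one. This is only a sketch under the assumption that the columns are listed in priority order in col_order. The names is_y, first_y and result are made up for illustration. Rows without any "Y" have to be masked separately, because max.col() would otherwise report column 1 for them.

```r
# Example data from the question
dat <- data.frame(Id = c(1,2,3,4,5,6,7,8),
                  A = c('Y','N','N','N','N','N','N','N'),
                  B = c('N','Y','N','N','N','N','Y','N'),
                  C = c('N','N','Y','N','N','Y','N','N'),
                  D = c('N','N','N','Y','N','Y','N','N'),
                  E = c('N','N','N','N','Y','N','Y','N'))

# Columns from highest to lowest priority (assumption: taken from the question)
col_order <- c("A", "B", "C", "D", "E")

# Logical matrix: TRUE where a cell holds "Y"
is_y <- dat[col_order] == "Y"

# Leftmost TRUE per row; multiplying by 1 turns the logical matrix into numbers
first_y <- max.col(is_y * 1, ties.method = "first")

# Rows with no "Y" at all would wrongly map to column 1, so set them to NA
first_y[rowSums(is_y) == 0] <- NA

result <- data.frame(Id = dat$Id,
                     Result = col_order[first_y],
                     stringsAsFactors = FALSE)
result
```

Here the rows without a "Y" are left as a real NA rather than a label such as "Not Present"; they can be replaced afterwards in the same way as in the answer above.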