Starting from the data frame y:
x <- c(2,NA,6,8,9,10)
y <- data.frame(letters[1:6], 1:6, NA, 3:8, NA, x, NA)
colnames(y) <- c("Patient", "C1", "First_C1", "C2", "First_C2", "C3", "First_C3")
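I want R to look at each element of C1, find the first patient (first row) that has that element and the column it appears in, and write those "coordinates" in the form "Patient_Column" into First_C1. Then it should do the same for C2 and C3.
So the result should look like this: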
y$First_C1 <- c("a_C1", "a_C3", "a_C2", "b_C2", "c_C2", "c_C3")
y$First_C2 <- c("a_C2", "b_C2", "c_C2", "c_C3", "e_C2", "d_C3")
y$First_C3 <- c("a_C3", NA, "c_C3", "d_C3", "e_C3", "f_C3")
I don't know how to write the code, or even how to search for it. Can anyone help me?
Answer 0 (score: 2)
We start from y without the output columns:
y<-structure(list(Patient = structure(1:6, .Label = c("a", "b",
"c", "d", "e", "f"), class = "factor"), C1 = 1:6, C2 = 3:8, C3 = c(2,
NA, 6, 8, 9, 10)), .Names = c("Patient", "C1", "C2", "C3"), row.names = c(NA,
-6L), class = "data.frame")
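If you only have the data frame built in the question, the same starting point can be obtained by dropping the empty First_ columns. This is a minimal sketch; the names `y_full` and `y_start` are introduced here for illustration and do not appear in the original post.

```r
# Data frame as built in the question (the First_* columns are still empty)
x <- c(2, NA, 6, 8, 9, 10)
y_full <- data.frame(letters[1:6], 1:6, NA, 3:8, NA, x, NA)
colnames(y_full) <- c("Patient", "C1", "First_C1", "C2", "First_C2", "C3", "First_C3")

# Keep only the patient and value columns, in this order
y_start <- y_full[, c("Patient", "C1", "C2", "C3")]
str(y_start)
```

Assigning `y_start` back to `y` gives the layout the rest of this answer relies on, where columns 2 to 4 hold the values. Depending on the R version, Patient may be a character vector rather than a factor; `paste()` handles both.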
Then we can try:
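y[paste0("First_C", 1:3)] <- lapply(y[, 2:4], function(x) {
  # match() scans the transposed values column by column, i.e. y row by row;
  # arrayInd() turns each hit into (column, patient), swapped to (patient, column)
  d <- arrayInd(match(x, t(y[, 2:4])), dim(t(y[, 2:4])))[, 2:1]
  paste(y$Patient[d[, 1]], colnames(y[, 2:4])[d[, 2]], sep = "_")
})
# match() also finds NA, so reset the results for missing inputs to NA
y[, 5:7][is.na(y[, 2:4])] <- NA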
#  Patient C1 C2 C3 First_C1 First_C2 First_C3
#1       a  1  3  2     a_C1     a_C2     a_C3
#2       b  2  4 NA     a_C3     b_C2     <NA>
#3       c  3  5  6     a_C2     c_C2     c_C3
#4       d  4  6  8     b_C2     c_C3     d_C3
#5       e  5  7  9     c_C2     e_C2     e_C3
#6       f  6  8 10     c_C3     d_C3     f_C3
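To see why the transpose matters, the lookup for a single value can be traced step by step. This is only an illustrative sketch: the names `m`, `pos`, `idx` and `first_6` are introduced here and are not part of the code above. `match()` reads a matrix column by column, so transposing the value columns first makes it read patient by patient, which is the search order the question asks for.

```r
# Value columns rebuilt here so the snippet runs on its own
y <- data.frame(Patient = letters[1:6], C1 = 1:6, C2 = 3:8,
                C3 = c(2, NA, 6, 8, 9, 10))

# Transpose: rows are now C1..C3, columns are patients a..f
m <- t(as.matrix(y[, c("C1", "C2", "C3")]))

# First position of the value 6 when m is read column by column
pos <- match(6, m)

# Convert the linear position into (row of m, column of m) indices,
# that is (value column of y, patient row of y)
idx <- arrayInd(pos, dim(m))

first_6 <- paste(y$Patient[idx[, 2]], rownames(m)[idx[, 1]], sep = "_")
first_6
# [1] "c_C3"
```

The value 6 also occurs in C2 for patient d and in C1 for patient f, but patient c comes first, so the result agrees with the c_C3 entries in the table above.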