Return a value from a set of columns based on position in another column

Time: 2019-03-12 13:28:42

Tags: r

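I'm trying to extract a value from one set of columns based on where a match occurs in another set of columns. Taking the first row as an example:

- Take the value of CodeToMatch, which is 1.
- Search the columns Code.1, Code.2 and Code.3 for the value 1. Here it is found in the third column, so return the value from the third of pCode.1, pCode.2 and pCode.3, which is "p4".

The expected_output column in my example df below shows what I'm after.

Many thanks for your help!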

c1 <- c("1","2","3")
c2 <- c("8","1","3")
c3 <- c("4","2","4")
c4 <- c("1","3","5")
c5 <- c("p1","p2","p3")
c6 <- c("p8","p1","p3")
c7 <- c("p4","p2","p4")
c8 <- c("p4","p1","p3")
df <- data.frame(c1,c2,c3,c4,c5,c6,c7,c8)
colnames(df)[c(1:8)] <- c("CodeToMatch","Code.1","Code.2","Code.3","pCode.1","pCode.2","pCode.3","expected_output")

3 Answers:

Answer 0 (score: 3)

A data.table solution

Sample data

df <- structure(list(CodeToMatch = structure(1:3, .Label = c("1", "2", 
"3"), class = "factor"), Code.1 = structure(c(3L, 1L, 2L), .Label = c("1", 
"3", "8"), class = "factor"), Code.2 = structure(c(2L, 1L, 2L
), .Label = c("2", "4"), class = "factor"), Code.3 = structure(1:3, .Label = c("1", 
"3", "5"), class = "factor"), pCode.1 = structure(1:3, .Label = c("p1", 
"p2", "p3"), class = "factor"), pCode.2 = structure(c(3L, 1L, 
2L), .Label = c("p1", "p3", "p8"), class = "factor"), pCode.3 = structure(c(2L, 
1L, 2L), .Label = c("p2", "p4"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

Code

library(data.table)
#first, melt wide table to long format
df.melt <- melt( setDT(df), id.vars="CodeToMatch", measure.vars = patterns(Code="^Code\\..*", pCode="^pCode.*"))
#now finding everything is easy...
df.melt[ Code == CodeToMatch, .(CodeToMatch, pCode)]

Output

#    CodeToMatch pCode
# 1:           3    p3
# 2:           2    p1
# 3:           1    p4
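Note that the rows in this output come back in melt order (grouped by Code column), not in the original row order. If the result needs to be written back as a column, one possible sketch is to carry a row index through the melt and then use an update join. The `row_id` column and the names `dt`, `long` and `matched` are assumptions introduced here for illustration and are not part of the answer above; the data is retyped from the question as character columns.

```r
library(data.table)

# Question data, retyped as character columns
dt <- data.table(
  CodeToMatch = c("1", "2", "3"),
  Code.1  = c("8", "1", "3"),
  Code.2  = c("4", "2", "4"),
  Code.3  = c("1", "3", "5"),
  pCode.1 = c("p1", "p2", "p3"),
  pCode.2 = c("p8", "p1", "p3"),
  pCode.3 = c("p4", "p2", "p4")
)

# Hypothetical helper column: remember each row's original position
dt[, row_id := .I]

# Melt the Code.* and pCode.* columns in parallel into two value columns
long <- melt(dt,
             id.vars      = c("row_id", "CodeToMatch"),
             measure.vars = patterns("^Code\\.", "^pCode\\."),
             value.name   = c("Code", "pCode"))

# Keep only the rows where the code matches, then join back by row_id
matched <- long[Code == CodeToMatch, .(row_id, pCode)]
dt[matched, on = "row_id", expected_output := i.pCode]

dt$expected_output
# [1] "p4" "p1" "p3"
```

Rows without any matching code would be left as NA in this sketch. If a row could match more than one Code column, you would need to decide which match to keep before joining.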

Answer 1 (score: 0)

I don't know how well this generalizes, but here is one option:

nCode <- 3
df$expected_output <- apply(df, 1, function(x) x[nCode + 1 + which(x[2:(nCode + 1)] == x[1])])
df$expected_output
#[1] "p4" "p1" "p3"

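Note that the number of "Code" columns is hard-coded. In your case you have 3 "Code" columns with matching "pCode" columns; adjust as needed. This also assumes that the first column always contains the code to match, and that the "pCode" columns immediately follow the "Code" columns in the same order.

Answer 2 (score: 0)

Separate the Code and pCode columns based on the pattern in their names. For each row, find the position in code_columns whose value equals CodeToMatch, and use mapply to extract the corresponding value from pcode_columns.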
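One way to avoid hard-coding the column count is to derive the column sets from their names and look up each row's match with matrix indexing. This is only a sketch under assumptions: it takes for granted that every `Code.N` column has a `pCode.N` partner, and the names `code_cols`, `pcode_cols`, `hits`, `pos` and `result` are made up for illustration.

```r
# Question data, retyped as character columns
df <- data.frame(
  CodeToMatch = c("1", "2", "3"),
  Code.1  = c("8", "1", "3"),
  Code.2  = c("4", "2", "4"),
  Code.3  = c("1", "3", "5"),
  pCode.1 = c("p1", "p2", "p3"),
  pCode.2 = c("p8", "p1", "p3"),
  pCode.3 = c("p4", "p2", "p4"),
  stringsAsFactors = FALSE
)

# Find the Code.* columns by name and build the partner pCode.* names
code_cols  <- grep("^Code\\.", names(df), value = TRUE)
pcode_cols <- paste0("p", code_cols)
stopifnot(all(pcode_cols %in% names(df)))  # assumption: partners exist

codes  <- as.matrix(df[code_cols])
pcodes <- as.matrix(df[pcode_cols])

# Compare each row's codes with that row's CodeToMatch
# (the vector is recycled down the columns, so it lines up with rows)
hits <- codes == as.character(df$CodeToMatch)

# Position of the first match in each row; NA when nothing matches
pos <- apply(hits, 1, function(r) match(TRUE, r))

# Pick one cell per row with a (row, column) index matrix
df$result <- pcodes[cbind(seq_len(nrow(df)), pos)]
df$result
# [1] "p4" "p1" "p3"
```

With this approach, adding a `Code.4`/`pCode.4` pair requires no code change, and rows with no match come out as NA rather than raising an error.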

code_columns <- grep("^Code\\.[0-9]+", names(df))
pcode_columns <- grep("^pCode", names(df))

mapply(function(x, y) df[x, pcode_columns][df[x, code_columns]==y],
                       1:nrow(df), df$CodeToMatch)

#[1] "p4" "p1" "p3"

Beforehand, I ran

df[1:4] <- lapply(df[1:4], function(x) as.numeric(as.character(x)))

to store the numeric columns as numbers rather than factors.
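The detour through as.character() matters because calling as.numeric() directly on a factor returns its internal level codes, not the numbers shown in the labels. A small illustration, using the same values as the Code.1 column (the name `f` is just for this example):

```r
f <- factor(c("8", "1", "3"))   # levels are sorted: "1", "3", "8"

as.numeric(f)                   # internal level codes
# [1] 3 1 2

as.numeric(as.character(f))     # the values the labels represent
# [1] 8 1 3
```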