Matching rows across two columns

Time: 2016-01-21 11:37:18

Tags: r duplicates match

Given the data frame

df <- data.frame(
  E = c(1, 1, 2, 1, 3, 2, 2),
  N = c(4, 4, 10, 4, 3, 2, 2)
)

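I want to create a third column that flags matches: whenever a row has the same value in E as another row and the two rows also have the same value in N, they count as a match. Each distinct match gets its own letter, and rows without a match get NA.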

dfx <- data.frame(
  E = c(1, 1, 2, 1, 3, 2, 2, 3, 2),
  N = c(4, 4, 10, 4, 3, 2, 2, 6, 10),
  matched = c("A", "A", "B", "A", NA, "C", "C", NA, "B")
)

Thanks!

1 answer:

Answer 0: (score: 1)

Here df is the nine-row data frame from the expected output (columns E and N only):

df <- structure(list(E = c(1, 1, 2, 1, 3, 2, 2, 3, 2), N = c(4, 4, 
        10, 4, 3, 2, 2, 6, 10)), .Names = c("E", "N"), row.names = c(NA, 
        -9L), class = "data.frame")

You can do this:

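dfx <- transform(df, matched = {
  # One key per row, e.g. "1.4" for E = 1, N = 4
  i <- as.character(interaction(df[c("E", "N")]))
  # Counts per key, reordered by first appearance (table() sorts its names)
  tab <- table(i)[unique(i)]
  # Keys seen more than once get a letter; all other rows get NA
  LETTERS[match(i, names(tab)[tab > 1])]
})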

#   E  N matched
# 1 1  4       A
# 2 1  4       A
# 3 2 10       B
# 4 1  4       A
# 5 3  3    <NA>
# 6 2  2       C
# 7 2  2       C
# 8 3  6    <NA>
# 9 2 10       B
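
As a possible alternative, here is a minimal base R sketch that finds the repeated (E, N) pairs with duplicated() instead of table(). The helper name label_matches and its cols and labels arguments are made up for illustration and are not part of the answer above. It assumes the key columns never contain the "|" separator, and it stops with an error when there are more repeated pairs than labels (LETTERS has only 26).

```r
# Hypothetical helper, not from the original answer.
label_matches <- function(df, cols = c("E", "N"), labels = LETTERS) {
  # Build one key per row, e.g. "1|4" for E = 1, N = 4.
  key <- do.call(paste, c(df[cols], sep = "|"))
  # A row is matched if its key also occurs in some other row.
  dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  # Distinct repeated keys, in order of first appearance.
  groups <- unique(key[dup])
  if (length(groups) > length(labels)) {
    stop("more matching groups than available labels")
  }
  # Unmatched rows are not in `groups`, so match() returns NA for them.
  labels[match(key, groups)]
}

df <- data.frame(
  E = c(1, 1, 2, 1, 3, 2, 2, 3, 2),
  N = c(4, 4, 10, 4, 3, 2, 2, 6, 10)
)
df$matched <- label_matches(df)
print(df)
```

Because the groups come from unique(key[dup]), letters are handed out in the order the repeated pairs first appear in the data, which is the order used in the expected output in the question.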