R data frames: matching codes across multiple columns

Time: 2018-07-06 09:45:43

Tags: r

I have two data frames that I want to join.

The first one is:

V1 <- c("AB1", "AB2", "AB3" ,"AB4" ,"AB5" ,"AB6" ,"AB7","AB6","AB9" ,"AB10")
df1 <- data.frame(V1)

The second one is:

V5 <- c("AB1","","","", "AB3", "AB4", "AB5", "AB6")
V6 <- c("AB","AB2","","AB", "", "AB", "", "AB")
V7 <- c("AB","AB","AB","", "AB", "", "AB", "AB")
V8 <- c(1,2,2,2,3,4,5,6)

df2 <- data.frame(V5,V6, V7, V8)

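I am trying to look up each value of df1$V1 in columns V5, V6 and V7 of df2, return the corresponding V8 from df2, and add a yes/no flag (1 when the df1$V1 value occurs in df2, 0 otherwise).

The desired result is: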

df1$V1     res (V8)     Yes/no
AB1        1            1
AB2        2            1
AB3        3            1
AB4        4            1
AB5        5            1
AB6        6            1
AB7                     0
AB6                     0
AB9                     0
AB10                    0

I have the following code, but I cannot get it to work on all three columns of df2 at once:

df1$res[match(df2$V5,df1$V1, nomatch=0)] <- df2$V6[match(df2$V5,df1$V1, nomatch = 0)]

1 answer:

Answer 0: (score: 1)

V1 <- c("AB1", "AB2", "AB3" ,"AB4" ,"AB5" ,"AB6" ,"AB7","AB6","AB9" ,"AB10")
df1 <- data.frame(V1, stringsAsFactors = FALSE)

V5 <- c("AB1","","","", "AB3", "AB4", "AB5", "AB6")
V6 <- c("AB","AB2","","AB", "", "AB", "", "AB")
V7 <- c("AB","AB","AB","", "AB", "", "AB", "AB")
V8 <- c(1,2,2,2,3,4,5,6)
df2 <- data.frame(V5, V6, V7, V8, stringsAsFactors = FALSE)

library(tidyverse)

df2 %>%
  gather(v, V1, -V8) %>%           # reshape dataset
  select(-v) %>%                   # remove unnecessary variable
  right_join(df1, by="V1") %>%     # join df1
  mutate(YesNo = ifelse(is.na(V8), 0, 1)) %>%   # create Yes/No variable
  distinct() %>%                   # select distinct rows
  select(V1, V8, YesNo)            # arrange columns

#     V1 V8 YesNo
# 1  AB1  1     1
# 2  AB2  2     1
# 3  AB3  3     1
# 4  AB4  4     1
# 5  AB5  5     1
# 6  AB6  6     1
# 7  AB7 NA     0
# 8  AB9 NA     0
# 9 AB10 NA     0
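
In current tidyr, gather() is superseded by pivot_longer(). The following is a sketch of the same pipeline with that substitution, assuming tidyr 1.0.0 or later and dplyr are installed. The name `res` is only illustrative, and the row order of the result may differ from the output above depending on the dplyr version.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the answer
V1 <- c("AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB6", "AB9", "AB10")
df1 <- data.frame(V1, stringsAsFactors = FALSE)
df2 <- data.frame(
  V5 = c("AB1", "", "", "", "AB3", "AB4", "AB5", "AB6"),
  V6 = c("AB", "AB2", "", "AB", "", "AB", "", "AB"),
  V7 = c("AB", "AB", "AB", "", "AB", "", "AB", "AB"),
  V8 = c(1, 2, 2, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  # stack V5, V6 and V7 into a single column named V1, keeping V8 alongside
  pivot_longer(cols = -V8, names_to = "v", values_to = "V1") %>%
  select(-v) %>%                                 # drop the source-column name
  right_join(df1, by = "V1") %>%                 # keep every value of df1$V1
  mutate(YesNo = ifelse(is.na(V8), 0, 1)) %>%    # 1 if found in df2, else 0
  distinct() %>%                                 # collapse the duplicate AB6
  select(V1, V8, YesNo)
```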

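If you remove distinct() from the code, you get one row for every row of df1, including the duplicated AB6, rather than only the distinct rows.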
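
For comparison, here is a minimal base R sketch of the same lookup using match(), closer to the attempt in the question. It is an alternative, not the approach taken in the answer, and the names `lookup`, `code` and `idx` are made up for illustration. It stacks the three code columns into one vector, so a single match() call covers V5, V6 and V7. Unlike the pipeline with distinct(), it keeps all ten rows of df1, and the second AB6 is flagged as found because it does occur in df2.

```r
# Same sample data as in the answer
V1 <- c("AB1", "AB2", "AB3", "AB4", "AB5", "AB6", "AB7", "AB6", "AB9", "AB10")
df1 <- data.frame(V1, stringsAsFactors = FALSE)
df2 <- data.frame(
  V5 = c("AB1", "", "", "", "AB3", "AB4", "AB5", "AB6"),
  V6 = c("AB", "AB2", "", "AB", "", "AB", "", "AB"),
  V7 = c("AB", "AB", "AB", "", "AB", "", "AB", "AB"),
  V8 = c(1, 2, 2, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Stack the three code columns into one long lookup table (hypothetical name),
# repeating V8 so that each code stays paired with its own row's value
lookup <- data.frame(
  code = c(df2$V5, df2$V6, df2$V7),
  V8   = rep(df2$V8, times = 3),
  stringsAsFactors = FALSE
)

# match() gives the position of the first hit in lookup$code, or NA if none
idx <- match(df1$V1, lookup$code)

df1$res   <- lookup$V8[idx]          # V8 of the matching row, NA when absent
df1$YesNo <- as.integer(!is.na(idx)) # 1 when the code occurs in df2, else 0
```

Because match() returns only the first hit, a code that appeared in df2 with two different V8 values would silently take the first one; the sample data has no such case.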