How to join two data frames using a contains or regexp condition

Asked: 2016-01-14 10:13:34

Tags: r join

I have two data.frames:

id            country
 1    United State
 2  United Kingdom
 3          Russia
 4      Belorussia

group_condition   group
         United  group1
           Russ  group2

I want to get this:

id         country  group
 1    United State  group1
 2  United Kingdom  group1
 3          Russia  group2
 4      Belorussia  group2

How can I do this?

2 answers:

Answer 0: (score: 1)

A bit ugly, but it works:

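# country, group_condition and group are the columns of the two data frames, as vectors
temp <- sapply(group_condition, grepl, x = country, ignore.case = TRUE)
# first matching pattern per row; ties.method = "first" avoids max.col's default random tie-breaking
new_group_col <- group[max.col(temp, ties.method = "first")]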
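
For reference, here is a self-contained sketch of the same idea using the data from the question. The data frame names id_c and gc_g and the helper variables hits and first_hit are assumptions made for illustration. The sketch also sets the group to NA when no pattern matches, which max.col alone does not do.

```r
# Example data copied from the question (data frame names are assumed)
id_c <- data.frame(id = 1:4,
                   country = c("United State", "United Kingdom", "Russia", "Belorussia"),
                   stringsAsFactors = FALSE)
gc_g <- data.frame(group_condition = c("United", "Russ"),
                   group = c("group1", "group2"),
                   stringsAsFactors = FALSE)

# Logical matrix: one row per country, one column per pattern
hits <- vapply(gc_g$group_condition, grepl, logical(nrow(id_c)),
               x = id_c$country, ignore.case = TRUE)
hits <- matrix(hits, nrow = nrow(id_c))  # keep matrix shape even with a single row

# Index of the first matching pattern in each row
first_hit <- max.col(hits, ties.method = "first")
first_hit[rowSums(hits) == 0] <- NA      # rows where no pattern matched

id_c$group <- gc_g$group[first_hit]
print(id_c)
```

If a country matches several patterns, this version keeps the group of the first matching row in gc_g; whether that is the right rule depends on the data.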

Answer 1: (score: 0)

I did it this way, although a more elegant solution may exist:

id_c <- data.frame(id = 1:4, country = c("United State", "United Kingdom", "Russia", "Belorussia"), stringsAsFactors = FALSE)
gc_g <- data.frame(group_condition = c("United", "Russ"), group = c("group1", "group2"), stringsAsFactors = FALSE)

# first find the rows that match each expression
matches <- lapply(gc_g$group_condition, grep, id_c$country, ignore.case = TRUE)

# then, for each expression, write its group into the matching rows of the first table
id_c$group <- NA_character_
for (i in seq_len(nrow(gc_g))) {
  id_c$group[matches[[i]]] <- gc_g$group[i]
}
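
Note that grep and grepl treat each pattern as a regular expression. If a plain "contains" test is wanted instead, one option is fixed = TRUE. The small sketch below shows the difference; the strings in x and the variable names are hypothetical and not taken from the question. Because ignore.case is not used with fixed = TRUE, the case-insensitive variant lowercases both sides first.

```r
# Hypothetical country names, chosen only to show the difference
x <- c("U.S. Virgin Islands", "UKSSR")

# Regular expression: "." matches any character, so "UKS" also matches
regex_hits <- grepl("U.S", x)

# Literal substring: only strings that really contain "U.S" match
literal_hits <- grepl("U.S", x, fixed = TRUE)

# Case-insensitive literal match: lowercase the pattern and the text
literal_ci_hits <- grepl(tolower("u.s"), tolower(x), fixed = TRUE)

print(regex_hits)
print(literal_hits)
print(literal_ci_hits)
```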