Question

I have data similar to this:

x1 <- data.frame(state = c("FL","FL","TX","TX"), county = c("Duval","Columbia","Dallam","Dimmit"))
x2 <- data.frame(state = c("FL","FL","FL","TX","TX","TX"), county = c("Duval","Columbia","Pinellas","Dallam","Dimmit","Duval"), UR = c(4,5,7,4,6,3))

x3 <- subset(x2, county %in% x1$county & state %in% x1$state)

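The result I want is the 4 counties in x1, each matched with its UR value from x2. My approach does not exclude a county of the same name that appears in a different state (here, Duval in TX). Is there a way to filter only on rows where the combination of state and county matches?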

Answer 1

What you are looking for is a left join:

> library(dplyr)
> left_join(x1, x2, by = c('state', 'county'))
  state   county UR
1    FL    Duval  4
2    FL Columbia  5
3    TX   Dallam  4
4    TX   Dimmit  6

Or use merge from base R:

> merge(x1, x2, by = c("state", "county"), all.x = TRUE)
  state   county UR
1    FL Columbia  5
2    FL    Duval  4
3    TX   Dallam  4
4    TX   Dimmit  6
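
If you would rather keep the original "filter x2" form instead of joining, one option is to compare a combined state/county key. The following is only a sketch in base R: the names key1 and key2 are made up for illustration, and it assumes the separator "|" never occurs in a state or county name. The join-based counterpart in dplyr would be semi_join(x2, x1, by = c("state", "county")).

```r
x1 <- data.frame(state = c("FL","FL","TX","TX"),
                 county = c("Duval","Columbia","Dallam","Dimmit"))
x2 <- data.frame(state = c("FL","FL","FL","TX","TX","TX"),
                 county = c("Duval","Columbia","Pinellas","Dallam","Dimmit","Duval"),
                 UR = c(4,5,7,4,6,3))

# The original filter tests state and county independently,
# so the TX/Duval row slips through.
loose <- subset(x2, county %in% x1$county & state %in% x1$state)
nrow(loose)  # 5

# Build one key per row (assumption: "|" does not appear in the data)
key1 <- paste(x1$state, x1$county, sep = "|")
key2 <- paste(x2$state, x2$county, sep = "|")

# Keep only rows of x2 whose state/county pair occurs in x1
x3 <- x2[key2 %in% key1, ]
nrow(x3)  # 4
```

Unlike merge, this keeps the rows in the order they have in x2 and adds no columns from x1.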

Answer 2

Using data.table:

library(data.table)
setDT(x2)[x1, on = .(state, county)]
#   state   county UR
#1:    FL    Duval  4
#2:    FL Columbia  5
#3:    TX   Dallam  4
#4:    TX   Dimmit  6
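
Note that setDT() converts x2 to a data.table by reference, so x2 itself is changed. If that is not wanted, a variant along the following lines may help. It is a sketch rather than part of the answer above: dt2 and res are illustrative names, and nomatch = NULL (which drops rows of x1 that have no partner in x2) assumes a reasonably recent data.table version.

```r
library(data.table)

x1 <- data.frame(state = c("FL","FL","TX","TX"),
                 county = c("Duval","Columbia","Dallam","Dimmit"))
x2 <- data.frame(state = c("FL","FL","FL","TX","TX","TX"),
                 county = c("Duval","Columbia","Pinellas","Dallam","Dimmit","Duval"),
                 UR = c(4,5,7,4,6,3))

# as.data.table() returns a copy, so x2 stays a plain data.frame
dt2 <- as.data.table(x2)

# Join on both columns; nomatch = NULL turns this into an inner join
res <- dt2[x1, on = .(state, county), nomatch = NULL]
res
```

With the sample data every row of x1 has a match, so the result has the same four rows as the join above.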

Subset a data frame if two columns combined exactly match two other columns