I want to filter one data frame ('data') based on multiple values in another data frame ('key').
My 'key' looks like this:
library(dplyr)  # provides tibble() and the %>% pipe
exhibit.name <- c("lions", "otters", "penguins")
exhibit.start <- c(as.Date("2016-04-01"), as.Date("2016-05-01"), as.Date("2016-06-01"))
exhibit.end <- c(as.Date("2016-04-30"), as.Date("2016-05-31"), as.Date("2016-06-30"))
# data_frame() is deprecated; tibble() is its replacement
key <- tibble(exhibit.name, exhibit.start, exhibit.end)
My 'data' looks like this:
exhibit.name <- c("lions", "lions", "otters",
                  "otters", "penguins", "penguins")
exhibit.date <- c(as.Date("2016-04-15"), as.Date("2016-12-15"), as.Date("2016-05-15"),
                  as.Date("2016-02-15"), as.Date("2016-06-15"), as.Date("2016-10-15"))
# data_frame() is deprecated; tibble() is its replacement
data <- tibble(exhibit.name, exhibit.date)
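I need to filter 'data' to return the rows where data$exhibit.name matches key$exhibit.name and data$exhibit.date falls between the corresponding key$exhibit.start and key$exhibit.end dates. The resulting data frame would look like this: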
> valid.exhibits
1|lions |2016-04-15
2|otters |2016-05-15
3|penguins|2016-06-15
Thanks!
Answer 0 (score: 4)
We can left_join the two data frames and then filter:
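data %>%
  left_join(key, by = "exhibit.name") %>%  # attach each exhibit's start/end dates
  # strict inequalities exclude the boundary dates; use >= and <= to include them
  filter(exhibit.start < exhibit.date, exhibit.end > exhibit.date) %>%
  select(exhibit.name, exhibit.date)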
# exhibit.name exhibit.date
# <chr> <date>
#1 lions 2016-04-15
#2 otters 2016-05-15
#3 penguins 2016-06-15
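If a visit on the first or last day of an exhibit should also count as valid, the same join-then-filter idea can be written with inclusive bounds. The following is only a sketch under that assumption: it rebuilds the question's sample data with tibble(), uses inner_join so that exhibits missing from 'key' are dropped before filtering, and stores the result in valid.exhibits (the name used in the question).

```r
library(dplyr)

# Sample data from the question, rebuilt so the block runs on its own
key <- tibble(
  exhibit.name  = c("lions", "otters", "penguins"),
  exhibit.start = as.Date(c("2016-04-01", "2016-05-01", "2016-06-01")),
  exhibit.end   = as.Date(c("2016-04-30", "2016-05-31", "2016-06-30"))
)
data <- tibble(
  exhibit.name = c("lions", "lions", "otters",
                   "otters", "penguins", "penguins"),
  exhibit.date = as.Date(c("2016-04-15", "2016-12-15", "2016-05-15",
                           "2016-02-15", "2016-06-15", "2016-10-15"))
)

valid.exhibits <- data %>%
  # keep only exhibits that appear in key, and attach their date window
  inner_join(key, by = "exhibit.name") %>%
  # inclusive bounds (an assumption): the start and end dates themselves count
  filter(exhibit.date >= exhibit.start, exhibit.date <= exhibit.end) %>%
  select(exhibit.name, exhibit.date)

print(valid.exhibits)
```

On the sample data the inclusive and strict versions return the same three rows, because no visit falls exactly on a boundary date.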
We can also use a non-equi (conditional) join, available in the development version of data.table, i.e. v1.9.7+:
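library(data.table)
setDT(key)
# non-equi update join: flag rows of data whose date lies strictly inside the key window
setDT(data)[key, on = .(exhibit.name, exhibit.date > exhibit.start,
                        exhibit.date < exhibit.end), new := 1]
# unmatched rows have new = NA; drop them, then remove the helper column
na.omit(data)[, new := NULL][]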
# exhibit.name exhibit.date
#1: lions 2016-04-15
#2: otters 2016-05-15
#3: penguins 2016-06-15
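For comparison, the same join-then-filter logic can be sketched in base R with merge(), without any packages. This is an illustrative alternative rather than part of the answer above; it assumes inclusive date bounds and rebuilds the sample data as plain data frames.

```r
# Sample data from the question as base data frames
key <- data.frame(
  exhibit.name  = c("lions", "otters", "penguins"),
  exhibit.start = as.Date(c("2016-04-01", "2016-05-01", "2016-06-01")),
  exhibit.end   = as.Date(c("2016-04-30", "2016-05-31", "2016-06-30")),
  stringsAsFactors = FALSE
)
data <- data.frame(
  exhibit.name = c("lions", "lions", "otters",
                   "otters", "penguins", "penguins"),
  exhibit.date = as.Date(c("2016-04-15", "2016-12-15", "2016-05-15",
                           "2016-02-15", "2016-06-15", "2016-10-15")),
  stringsAsFactors = FALSE
)

# Inner join on the exhibit name (merge() sorts by the key column by default)
merged <- merge(data, key, by = "exhibit.name")

# Keep rows whose date lies inside the window (inclusive bounds are an assumption)
in_window <- merged$exhibit.date >= merged$exhibit.start &
  merged$exhibit.date <= merged$exhibit.end
valid.exhibits <- merged[in_window, c("exhibit.name", "exhibit.date")]
rownames(valid.exhibits) <- NULL  # reset row names after subsetting

print(valid.exhibits)
```

Note that merge() materialises every name match before the date filter runs, so for large tables the data.table non-equi join is likely to use less memory.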