library(tibble)

# tibble() replaces the deprecated data_frame() constructor.
df1 <- tibble(time1 = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
              time2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
              id = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"))
df2 <- tibble(time = sort(runif(100, 0, 10)),
              C = rbinom(100, 1, 0.5))
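For each row in df1, I want to find the rows in df2 whose time falls within that row's interval (between time1 and time2), and then assign the median C value of that set of df2 rows to a new column in df1. I'm sure there is a simple way to do this with dplyr's between function, but I am new to R and have not been able to figure it out. Thanks!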
Answer 0: (score: 0)
Here is one approach: use the merge function, which here essentially performs a SQL-style cross join, and then the between function:
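library(tidyverse)

# df1 and df2 share no column names, so merge() returns every df1/df2 pairing.
merge(df1, df2, all = TRUE) %>%
  rowwise() %>%
  # Flag pairings whose time lies in [time1, time2]; between() is inclusive.
  mutate(time_between = between(time, time1, time2)) %>%
  filter(time_between) %>%
  group_by(time1, time2, id) %>%
  summarise(med_C = median(C))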
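To see why merge behaves as a cross join in this answer, here is a minimal base R sketch. The tables small_x and small_y are made-up illustrations, not data from the question; the only assumption that matters is that they have no column names in common.

```r
# Two toy data frames with no column names in common (hypothetical data).
small_x <- data.frame(id = c("a", "b"), time1 = c(0, 1), time2 = c(1, 2))
small_y <- data.frame(time = c(0.5, 1.5, 2.5), C = c(1, 0, 1))

# With no shared columns, merge() has nothing to match on and returns
# every pairing of a row from small_x with a row from small_y.
crossed <- merge(small_x, small_y, all = TRUE)

nrow(crossed)  # 2 rows * 3 rows = 6 pairings
```

Because the result grows as nrow(df1) * nrow(df2), this is fine for the 10 x 100 example in the question but may become expensive for large tables.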
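Using the filter function may cause some rows of df1 to be dropped (any interval that contains no df2 times), so an alternative approach is: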
merge(df1, df2, all = TRUE) %>%
  rowwise() %>%
  mutate(time_between = between(time, time1, time2)) %>%
  group_by(time1, time2, id) %>%
  # Keep every df1 row: values outside the interval become NA and are ignored.
  summarise(med_C = median(ifelse(time_between, C, NA), na.rm = TRUE))
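The masking trick inside summarise can be checked in isolation. The following base R sketch uses invented values for a single interval (the vectors C, inside and inside_none are hypothetical) to show that an interval with no matching rows still yields a result, namely NA, rather than disappearing.

```r
# Hypothetical values of C for the candidate rows paired with one interval.
C <- c(1, 0, 1, 1)

# Case 1: some rows fall inside the interval; the others are masked as NA.
inside <- c(TRUE, FALSE, TRUE, TRUE)
med_some <- median(ifelse(inside, C, NA), na.rm = TRUE)  # median of 1, 1, 1

# Case 2: no row falls inside the interval. Every value becomes NA, so the
# median is NA, but a value is still produced for this interval.
inside_none <- c(FALSE, FALSE, FALSE, FALSE)
med_none <- median(ifelse(inside_none, C, NA), na.rm = TRUE)
```

Here med_some is 1 and med_none is NA, which is why the second pipeline keeps all rows of df1 while the filter version can lose some.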
Answer 1: (score: 0)
You can do this in base R using sapply:
df1$median_c <- sapply(seq_along(df1$id), function(i) {
  # Median of C over the df2 rows strictly inside (time1, time2) for row i.
  median(df2$C[df2$time > df1$time1[i] & df2$time < df1$time2[i]])
})
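Since the data in the question is random, the result is hard to verify by eye. As a rough check, the same sapply logic can be wrapped in a small helper and run on fixed data. The helper name interval_median and the tables toy1 and toy2 are hypothetical, introduced only for this sketch.

```r
# Hypothetical helper wrapping the sapply() approach; the name and
# arguments are illustrative only.
interval_median <- function(starts, ends, times, values) {
  sapply(seq_along(starts), function(i) {
    # Strict inequalities: times exactly on a boundary are excluded.
    median(values[times > starts[i] & times < ends[i]])
  })
}

# Small fixed data set so the result can be checked by hand.
toy1 <- data.frame(time1 = c(0, 1, 2), time2 = c(1, 2, 3))
toy2 <- data.frame(time = c(0.2, 0.5, 0.8, 1.5), C = c(1, 0, 1, 1))

toy1$median_c <- interval_median(toy1$time1, toy1$time2, toy2$time, toy2$C)
toy1$median_c  # 1, 1, NA: the third interval contains no times
```

Note that this version uses strict comparisons (> and <), whereas dplyr's between includes both endpoints. With continuous random times the difference is unlikely to matter, but it can for data that lands exactly on a boundary.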