I want to calculate the average burning hours (burning_hours$hours_burned) between ds$date_fixed and ds$date_broken. I know I can do that with the code below:
ds$average_burninghours <- sapply(interval(ds$date_fixed, ds$date_broken), function(i)
  mean(burning_hours$hours_burned[burning_hours$date %within% i]))
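But I want to calculate the burning hours per location and position. So I would like to add something like 'group_by = c(location, position)', but I can't get that to work. Does anyone have an idea?
Sample code: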
ds <- data.frame( date_fixed= c("16-3-2015", "19-3-2015", "21-3-2015"),
date_broken = c("18-3-2015", "22-3-2015", "24-3-2015"),
location = c("A", "B", "B"), position = c("1", "2", "2"))
burning_hours <- data.frame(date = c("16-3-2015", "16-3-2015", "17-3-2015", "17-3-2015",
"18-3-2015", "18-3-2015", "19-3-2015", "19-3-2015", "20-3-2015",
"20-3-2015", "21-3-2015", "21-3-2015", "22-3-2015", "22-3-2015",
"23-3-2015", "23-3-2015", "24-3-2015", "24-3-2015"),
hours_burned= c("10", "11"), location = c("A", "B"),
position = c("1", "2"))
Desired result:
date_fixed date_broken location position avg_burninghours
16-3-2015 18-3-2015 A 1 10
19-3-2015 22-3-2015 B 2 11
21-3-2015 24-3-2015 B 2 11
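Answer 0 (score: 0)
Merge the two data frames on location and position, then keep only the rows whose date falls between date_fixed and date_broken and take the mean per group.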
library(dplyr)
library(lubridate)
# Cleaning: parse the day-month-year strings as dates and make hours_burned numeric
ds$date_fixed <- dmy(ds$date_fixed)
ds$date_broken <- dmy(ds$date_broken)
burning_hours$date <- dmy(burning_hours$date)
burning_hours$hours_burned <- as.numeric(as.character(burning_hours$hours_burned))
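# Pair every burning_hours row with each ds row that shares its location and position
df <- merge(burning_hours, ds, by = c('location', 'position'))
df %>%
  group_by(date_fixed, date_broken, location, position) %>%
  # Keep only the measurements inside each fixed/broken window (inclusive)
  filter(date >= date_fixed, date <= date_broken) %>%
  summarise(avg_burninghours = mean(hours_burned))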
This gives:
date_fixed date_broken location position avg_burninghours
(date) (date) (fctr) (fctr) (dbl)
1 2015-03-16 2015-03-18 A 1 10
2 2015-03-19 2015-03-22 B 2 11
3 2015-03-21 2015-03-24 B 2 11
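If you would rather stay close to the sapply approach from the question and avoid the merge, the same per-row filtering could be sketched in base R. This is only an illustration under a few assumptions: the dates are always in day-month-year form, parse_dmy is a hypothetical helper name (not a function from any package), and the data frames are built with stringsAsFactors = FALSE so the columns are plain character vectors.

```r
# Sample data from the question (the date vector is written with rep()
# but contains the same 18 values).
ds <- data.frame(
  date_fixed  = c("16-3-2015", "19-3-2015", "21-3-2015"),
  date_broken = c("18-3-2015", "22-3-2015", "24-3-2015"),
  location    = c("A", "B", "B"),
  position    = c("1", "2", "2"),
  stringsAsFactors = FALSE
)
burning_hours <- data.frame(
  date         = rep(sprintf("%d-3-2015", 16:24), each = 2),
  hours_burned = c("10", "11"),
  location     = c("A", "B"),
  position     = c("1", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse day-month-year strings with base R only.
parse_dmy <- function(x) as.Date(x, format = "%d-%m-%Y")

ds$date_fixed  <- parse_dmy(ds$date_fixed)
ds$date_broken <- parse_dmy(ds$date_broken)
burning_hours$date <- parse_dmy(burning_hours$date)
burning_hours$hours_burned <- as.numeric(burning_hours$hours_burned)

# For each row of ds, average the measurements that share its location and
# position and whose date lies inside the fixed/broken window (inclusive).
ds$avg_burninghours <- vapply(seq_len(nrow(ds)), function(i) {
  keep <- burning_hours$location == ds$location[i] &
    burning_hours$position == ds$position[i] &
    burning_hours$date >= ds$date_fixed[i] &
    burning_hours$date <= ds$date_broken[i]
  mean(burning_hours$hours_burned[keep])
}, numeric(1))

print(ds)
```

With the sample data this should reproduce the averages 10, 11 and 11. Note that a row of ds with no matching measurements would get NaN, because mean() of an empty vector is NaN; the merge-based version above would drop such a row instead.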