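I am trying to calculate an average based on the dates in a column. The number of previous days should be selectable, for example 4 days. The goal is to take the average of the 4 records before the StartDate and carry that average down the rows until there is an EndDate.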
I tried
tapply(df$Boe, df$ShutinDate, function(x) mean(tail(sort(x), 5)))
but I don't get the correct average.
Output
Name   DATE       Values  StartDate  EndDate    Average
TestA  3/3/2017   50
TestA  3/4/2017   75
TestA  3/5/2017   25
TestA  3/6/2017   100
TestA  3/7/2017   100
TestA  3/8/2017   50
TestA  3/9/2017   80
TestA  3/10/2017  90
TestA  3/11/2017  25      3/11/2017
TestA  3/12/2017  0                             80
TestA  3/13/2017  0                             80
TestA  3/14/2017  0                             80
TestA  3/15/2017  0                             80
TestA  3/16/2017  50                 3/16/2017
Answer 0 (score: 1)
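1) We group by Name (assuming the rollapply should be done separately for each Name) and then use width = list(-seq(4)) with rollapply, so that each application of mean uses the offsets -1, -2, -3, -4. (Offset 0 would be the current point, but we want the 4 points before it.)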
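It is not clear what you mean regarding the start date, so that part has been left out. I have also assumed that the data is sorted, as it is in the question. You may want to convert the dates to "Date" class as well, but that is not needed to answer the question if the rows are already sorted.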
library(zoo)
roll <- function(x) rollapply(x, list(-seq(4)), mean, fill = NA)
transform(DF, Average = ave(Values, Name, FUN = roll))
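To make the offsets concrete, here is a small base R sketch that computes the same "mean of the previous 4 points" by hand. The helper name prev_mean is made up for illustration and is not part of zoo; it only mirrors what the offsets -1 to -4 select.

```r
# prev_mean is a hypothetical helper, used only to illustrate the offsets.
# For position i it averages x[i-k], ..., x[i-1] and leaves the first k
# positions as NA because they do not have k earlier points.
prev_mean <- function(x, k = 4) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n > k) {
    for (i in (k + 1):n) {
      out[i] <- mean(x[(i - k):(i - 1)])
    }
  }
  out
}

# First six Values from the question
first_six <- prev_mean(c(50, 75, 25, 100, 100, 50))
print(first_six)
# NA NA NA NA 62.5 75.0
```

Position 5 averages 50, 75, 25 and 100, which gives 62.5; position 6 averages 75, 25, 100 and 100, which gives 75.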
2) Or, if you prefer dplyr, use roll from above:
library(dplyr)
library(zoo)
DF %>%
group_by(Name) %>%
mutate(Average = roll(Values)) %>%
ungroup()
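The start/end part that was left out could look roughly like the base R sketch below. It is only one reading of the question: it assumes that the average of the 4 records before the StartDate row is carried down the rows after it and stops at the EndDate row, which matches the question's table (80 on 3/12 to 3/15). The function name fill_period_average and the logical vectors is_start and is_end are hypothetical; in a real data frame they would be derived from the StartDate and EndDate columns.

```r
# Hypothetical helper: carries the "previous k" mean from a StartDate row
# down to (but not including) the next EndDate row.
fill_period_average <- function(values, is_start, is_end, k = 4) {
  n <- length(values)
  out <- rep(NA_real_, n)
  current <- NA_real_            # average being carried; NA outside a period
  for (i in seq_len(n)) {
    if (is_end[i]) {
      current <- NA_real_        # the period closes on the EndDate row
    } else if (is_start[i]) {
      # mean of the k records before the StartDate row, if there are enough
      current <- if (i > k) mean(values[(i - k):(i - 1)]) else NA_real_
    } else {
      out[i] <- current
    }
  }
  out
}

# Values from the question; row 9 is 3/11/2017 (StartDate),
# row 14 is 3/16/2017 (EndDate)
values   <- c(50, 75, 25, 100, 100, 50, 80, 90, 25, 0, 0, 0, 0, 50)
is_start <- seq_along(values) == 9
is_end   <- seq_along(values) == 14

period_avg <- fill_period_average(values, is_start, is_end)
print(period_avg)
# rows 10 to 13 are 80, every other row is NA
```

With more than one Name, the same function would have to be applied to each group separately.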
Answer 1 (score: 0)
One option is to use zoo::rollapply together with dplyr::lag:
library(dplyr)
library(lubridate)
library(zoo)
df %>% mutate(DATE = mdy(DATE)) %>% #Convert to Date
arrange(Name, DATE) %>% #Order on Name and DATE
mutate(Avg = rollapply(Values, 4, mean, fill= NA, align = "right")) %>%
mutate(Average = lag(Avg)) %>% # This shows mean for previous 4 rows
select(-Avg)
# Name DATE Values Average
# 1 TestA 2017-03-03 50 NA
# 2 TestA 2017-03-04 75 NA
# 3 TestA 2017-03-05 25 NA
# 4 TestA 2017-03-06 100 NA
# 5 TestA 2017-03-07 100 62.50
# 6 TestA 2017-03-08 50 75.00
# 7 TestA 2017-03-09 80 68.75
# 8 TestA 2017-03-10 90 82.50
# 9 TestA 2017-03-11 25 80.00
# 10 TestA 2017-03-12 0 61.25
# 11 TestA 2017-03-13 0 48.75
# 12 TestA 2017-03-14 0 28.75
# 13 TestA 2017-03-15 0 6.25
# 14 TestA 2017-03-16 50 0.00
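Note that the pipeline above sorts by Name but does not group by it, so with more than one Name the 4-row window would run across the boundary between names. A possible variant, sketched under the assumption that each Name should be averaged on its own, adds group_by(Name). The two_names data frame is made up purely for illustration.

```r
library(dplyr)
library(zoo)

# Made-up illustration data with two names (not from the question)
two_names <- data.frame(
  Name   = rep(c("TestA", "TestB"), each = 6),
  Values = c(50, 75, 25, 100, 100, 50, 10, 20, 30, 40, 50, 60)
)

result <- two_names %>%
  group_by(Name) %>%   # windows and lag restart for every Name
  mutate(Average = lag(rollapply(Values, 4, mean, fill = NA, align = "right"))) %>%
  ungroup()

print(result$Average)
# TestA: NA NA NA NA 62.5 75   TestB: NA NA NA NA 25 35
```

Without group_by, the first rows of TestB would get averages that include TestA values.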
Data:
df <- read.table(text =
"Name DATE Values
TestA '3/3/2017' 50
TestA '3/4/2017' 75
TestA '3/5/2017' 25
TestA '3/6/2017' 100
TestA '3/7/2017' 100
TestA '3/8/2017' 50
TestA '3/9/2017' 80
TestA '3/10/2017' 90
TestA '3/11/2017' 25
TestA '3/12/2017' 0
TestA '3/13/2017' 0
TestA '3/14/2017' 0
TestA '3/15/2017' 0
TestA '3/16/2017' 50",
header = TRUE, stringsAsFactors = FALSE)