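I have a working example that computes a weighted sum of transactions over the past 12 hours. I have now added an account column and would like to compute this weighted sum separately for each group. The code runs as shown below. Uncomment the line starting with # account to add the account column to df. How can I modify the second-to-last line of code so that rollapplyr is computed separately for each account?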
library(zoo)
library(tidyverse)
Create example data:
set.seed(123)
randomDates <- function(N, st="2017-01-01 00:00:00", et="2017-02-01 23:59:59") {
  st <- as.POSIXct(st, tz = "UTC")
  et <- as.POSIXct(et, tz = "UTC")
  dt <- as.numeric(difftime(et,st,units="sec", tz="UTC"))
  ev <- sort(runif(N, 0, dt))
  rt <- st + ev
  rt
}
df <- data.frame(date = randomDates(100) ,
                 data = round( abs(rnorm(100)) * 100 ) # ,
                 # account = sample(c("A", "B", "C"), 100, replace=TRUE )
                 )
df <- df %>% arrange(date)
Define helper functions:
tau <- 0.00005
decay = function(tau, day){
  exp(-tau * day)
}
weighted <- function(x, tau) {
  tx <- as.numeric(time(x))
  seconds <- tail(tx, 1) - tx
  w <- (seconds < 43200) * decay(tau, seconds) # 12 hours in seconds
  sum(w * coredata(x))
}
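To make the weighting concrete, here is a small sketch of the arithmetic that weighted performs on a single window. The ages and values are made-up illustration numbers, not taken from the example data: an observation older than 12 hours gets weight 0, and the newest one gets weight 1.

```r
tau <- 0.00005

# Hypothetical window: age of each observation in seconds, relative to the newest one
seconds <- c(50000, 7200, 0)
values  <- c(100, 10, 30)

# Weight is zero beyond 12 hours (43200 s), otherwise an exponential decay in age
w <- (seconds < 43200) * exp(-tau * seconds)

# Weighted sum for this window: 0 * 100 + exp(-0.36) * 10 + 1 * 30
weighted_total <- sum(w * values)
```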
Compute the rolling sum:
# Would like to modify this block to group by account
newData <- df %>%
  read.zoo %>%
  rollapplyr(43200, weighted, tau = tau, partial = TRUE, coredata = FALSE)
dfNew <- df %>% mutate( weighted_sum = newData )
                 date data weighted_sum
1 2017-01-01 00:21:26   38    38.000000
2 2017-01-01 21:29:53   56    56.000000
3 2017-01-02 14:02:43   34    34.000000
4 2017-01-02 20:41:28    9    19.279179
5 2017-01-03 06:08:07  160   161.644215
I have researched this but have not found an answer in the following:
Apply a rolling sum by group in R
use rollapply and zoo to calculate rolling average of a column of variables
https://www.rdocumentation.org/packages/zoo/versions/1.8-1/topics/rollapply
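Based on feedback on this question and the linked possible-duplicate answer, I also tried the solution below. However, applying the same pattern produces an error I cannot resolve: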
newData <- df %>%
  group_by(account) %>%
  mutate(weighted_sum = rollapplyr(., width=43200, FUN = weighted,
                                   tau = tau, partial = TRUE, coredata = FALSE) ) %>%
  ungroup()
This raises the following error:
# Error in mutate_impl(.data, dots) :
Evaluation error: non-numeric argument to binary operator.
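One direction that might work, sketched under assumptions: the error may come from . referring to the whole grouped data frame (including the non-numeric date and account columns) rather than to one account's series. The sketch below instead builds a zoo series per account inside a helper and rolls over that. The helper roll_weighted, its argument names, and the five-row toy data are hypothetical and only for illustration; it also assumes timestamps are unique within each account.

```r
library(zoo)
library(dplyr)

tau <- 0.00005

# Same weighting as above: exponential decay, cut off at 12 hours
weighted <- function(x, tau) {
  tx <- as.numeric(time(x))
  seconds <- tail(tx, 1) - tx
  w <- (seconds < 43200) * exp(-tau * seconds)
  sum(w * coredata(x))
}

# Hypothetical helper: rolling weighted sum over one account's rows
roll_weighted <- function(values, times, tau) {
  z <- zoo(values, order.by = times)
  as.numeric(rollapplyr(z, 43200, weighted, tau = tau,
                        partial = TRUE, coredata = FALSE))
}

# Toy data (illustration only): two accounts with interleaved timestamps
df <- data.frame(
  date    = as.POSIXct("2017-01-01", tz = "UTC") + c(0, 3600, 7200, 10800, 90000),
  data    = c(10, 20, 30, 40, 50),
  account = c("A", "B", "A", "B", "A")
)

# Each group passes only its own data and date columns to the helper
dfNew <- df %>%
  arrange(date) %>%
  group_by(account) %>%
  mutate(weighted_sum = roll_weighted(data, date, tau)) %>%
  ungroup()
```

Passing the data and date columns explicitly means each call sees a purely numeric series indexed by time, so the multiplication inside weighted never touches the account column.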