I want to calculate weekly returns from daily data (using Wednesday dates). The raw data looks like this (excerpt):
...
4003 1985-05-06 200.764
4004 1985-05-07 202.502
4005 1985-05-08 202.683
4006 1985-05-09 204.642
4007 1985-05-10 206.051
4008 1985-05-13 207.702
4009 1985-05-14 207.630
4010 1985-05-15 207.585
4011 1985-05-16 207.843
4012 1985-05-17 209.723
4013 1985-05-20 212.843
...
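To calculate weekly returns, I want to extract the data for each Wednesday. If a week has no Wednesday observation, I want to take the next working day instead.
To extract the Wednesday data, I used the following code: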
wednesday = as.POSIXlt(time(data))$wday == 3
indx <- c(0, which(wednesday))
datanew<-period.apply(data, INDEX=indx, FUN=last)
But with this code, if a week has no data for Wednesday (meaning Wednesday was a holiday), then nothing is extracted for that week.
Can anyone help?
Answer 0 (score: 0)
Here is my suggestion for computing a vector idx that can be used as the INDEX argument of period.apply:
#-------------------------------------------------------------------
# Example data:
set.seed(1)
N <- 1000
data <- data.frame( t = as.Date("1985-01-01") + (0:(N-1))*as.difftime(1,units="days"),
x = sample(100:300,N,replace=TRUE))
weekDay <- function(t=data$t){as.POSIXlt(t)$wday}
# Remove Saturdays, Sundays, and 20% of the rest:
data <- data[-which(weekDay() %in% c(0,6)),]
data <- data[-sample(1:nrow(data),ceiling(0.2*nrow(data))),]
#--------------------------------------------------------------------------
# Count the number of weeks:
dt <- data$t[nrow(data)] - data$t[1]
units(dt) <- "days"
weekCnt <- ceiling(as.numeric(dt)/7)
#--------------------------------------------------------------------------
# Find all Wednesdays in the interval from data$t[1] to data$t[nrow(data)]:
wed.0 <- data$t[1] + ((3-weekDay(data$t[1])) %% 7)*as.difftime(1,units="days")
wed <- wed.0 + (0:(weekCnt-1))*as.difftime(7,units="days")
#--------------------------------------------------------------------------
# For each week find the smallest index i such that data$t[i] is not
# earlier than the Wednesday of that week:
idx <- rep(NA,weekCnt)
for ( i in 1:weekCnt ) { idx[i] <- which.max( wed[i] <= data$t ) }
#--------------------------------------------------------------------------
# Check the result:
X <- cbind( data, weekDay(), week=strftime(data$t,format="%W"), selected=rep(FALSE,nrow(data)))
X$selected[idx] <- TRUE
(I apologize for the for loop. But it does not grow any object, so it is a harmless for loop.)
The result is summarized in X.
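Once the rows are selected, the weekly returns follow from consecutive selected values. The sketch below is only an illustration on the eleven rows excerpted in the question, not part of the code above; the names d, picked, and ret are made up for it. It uses which(...)[1] rather than which.max, because which.max returns 1 when no element is TRUE, so a Wednesday later than the last observation would otherwise point at the first row.

```r
# Daily observations copied from the excerpt in the question
d <- data.frame(
  t = as.Date(c("1985-05-06", "1985-05-07", "1985-05-08", "1985-05-09",
                "1985-05-10", "1985-05-13", "1985-05-14", "1985-05-15",
                "1985-05-16", "1985-05-17", "1985-05-20")),
  x = c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.630,
        207.585, 207.843, 209.723, 212.843))

# First Wednesday on or after the first observation, then three weekly steps.
# The third Wednesday (1985-05-22) deliberately lies after the last observation.
wed.0 <- d$t[1] + (3 - as.POSIXlt(d$t[1])$wday) %% 7
wed   <- seq(wed.0, by = "week", length.out = 3)

# For each Wednesday: first row dated on or after it, NA if there is none
idx <- vapply(seq_along(wed),
              function(i) which(d$t >= wed[i])[1],
              integer(1))

# Drop weeks without a usable row, then compute simple returns
picked <- d[idx[!is.na(idx)], ]
ret    <- diff(picked$x) / head(picked$x, -1)
```

Here idx holds one entry per Wednesday, and the NA marks the week for which no observation exists yet. Whether such a week should be dropped or filled in some other way depends on the application.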
Answer 1 (score: 0)
You can do this with the function below. The comments should explain how it works (please ask if any of them are unclear).
WedOrNext <- function(w) {
# find all the weekdays after Tuesday and before Saturday
iwd <- .indexwday(w)
i <- iwd > 2 & iwd < 6
# determine which row to return
if (any(i)) {
# return the first weekday after Tuesday
w[i][1L]
} else {
# return NA if there are no weekdays after Tuesday
NA
}
}
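Now you can use split to break the data into weekly chunks, lapply to apply the WedOrNext function to each chunk, and do.call(rbind, ...) to put the chunks back together.
Using the data you provided, the code below demonstrates what happens when a week has no data after Tuesday.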
# run on your original object
x <- structure(c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.63,
207.585, 207.843, 209.723, 212.843), .Dim = c(11L, 1L),
index = structure(c(484185600, 484272000, 484358400, 484444800, 484531200,
484790400, 484876800, 484963200, 485049600, 485136000, 485395200),
tzone = "UTC", tclass = "Date"), class = c("xts", "zoo"),
.indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC")
# The last week has no data after Tuesday, so it's not represented
do.call(rbind, lapply(split(x, "weeks"), WedOrNext))
# [,1]
# 1985-05-08 202.683
# 1985-05-15 207.585
And here is how it behaves when a week has no data on Wednesday.
y <- structure(c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.63,
207.585, 207.843, 209.723, 212.843, 200, 201, 202), class = c("xts", "zoo"),
.Dim = c(14L, 1L), index = structure(c(484185600, 484272000, 484358400,
484444800, 484531200, 484790400, 484876800, 484963200, 485049600, 485136000,
485395200, 485481600, 485654400, 485740800), tzone = "UTC", tclass = "Date"),
.indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC")
do.call(rbind, lapply(split(y, "weeks"), WedOrNext))
# [,1]
# 1985-05-08 202.683
# 1985-05-15 207.585
# 1985-05-23 201.000
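To go from the selected rows to weekly returns, one option is to difference the logs of the selected values. This is a minimal sketch, assuming the xts package is installed; it rebuilds the same observations as y with the xts() constructor so that it runs on its own. The names wk and ret are made up here, and using log returns is my assumption, since the question does not say which kind of return is wanted.

```r
library(xts)

# Same observations as y above, built with the xts() constructor
y <- xts(c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.63,
           207.585, 207.843, 209.723, 212.843, 200, 201, 202),
         order.by = as.Date(c("1985-05-06", "1985-05-07", "1985-05-08",
                              "1985-05-09", "1985-05-10", "1985-05-13",
                              "1985-05-14", "1985-05-15", "1985-05-16",
                              "1985-05-17", "1985-05-20", "1985-05-21",
                              "1985-05-23", "1985-05-24")))

# Same function as above: first observation on Wednesday through Friday
WedOrNext <- function(w) {
  iwd <- .indexwday(w)
  i <- iwd > 2 & iwd < 6
  if (any(i)) w[i][1L] else NA
}

# One row per week, then log returns between consecutive selected rows
wk  <- do.call(rbind, lapply(split(y, "weeks"), WedOrNext))
ret <- diff(log(wk))
```

The first element of ret is NA because the first week has no earlier value to compare against.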