Question

I want to calculate weekly returns from daily data (using Wednesday dates). The raw data looks like this (excerpt):

...
4003  1985-05-06       200.764
4004  1985-05-07       202.502
4005  1985-05-08       202.683
4006  1985-05-09       204.642
4007  1985-05-10       206.051
4008  1985-05-13       207.702
4009  1985-05-14       207.630
4010  1985-05-15       207.585
4011  1985-05-16       207.843
4012  1985-05-17       209.723
4013  1985-05-20       212.843
...

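To calculate weekly returns, I want to extract the observation for each Wednesday. If a week does not include a Wednesday, I want to take the next business day instead.

To extract the Wednesday data, I used the following code: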

wednesday = as.POSIXlt(time(data))$wday == 3
indx <- c(0, which(wednesday))
datanew<-period.apply(data, INDEX=indx, FUN=last)

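With this code, however, if a week has no data for Wednesday (that is, Wednesday was a holiday), that Wednesday is obviously not picked up.

Can anyone help?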

Answer 1

Here is my suggestion for calculating a vector idx that can be used as the INDEX argument of period.apply:

#--------------------------------------------------------------------------
# Example data:

set.seed(1)
N <- 1000

data <- data.frame(
  t = as.Date("1985-01-01") + (0:(N-1))*as.difftime(1,units="days"),
  x = sample(100:300,N,replace=TRUE))

weekDay <- function(t=data$t){as.POSIXlt(t)$wday}

# Remove Saturdays, Sundays, and 20% of the rest:

data <- data[-which(weekDay() %in% c(0,6)),]
data <- data[-sample(1:nrow(data),ceiling(0.2*nrow(data))),]

#--------------------------------------------------------------------------
# Count the number of weeks:

dt <- data$t[nrow(data)] - data$t[1]
units(dt) <- "days"
weekCnt <- ceiling(as.numeric(dt)/7)

#--------------------------------------------------------------------------
# Find all Wednesdays in the interval from data$t[1] to data$t[nrow(data)]:

wed.0 <- data$t[1] + ((3-weekDay(data$t[1])) %% 7)*as.difftime(1,units="days")
wed <- wed.0 + (0:(weekCnt-1))*as.difftime(7,units="days")

#--------------------------------------------------------------------------
# For each week find the smallest index i such that data$t[i] is not
# earlier than the Wednesday of that week.
# (Note: which.max() returns 1 when no date satisfies the condition.)

idx <- rep(NA,weekCnt)

for ( i in 1:weekCnt )
{
  idx[i] <- which.max( wed[i] <= data$t )
}

#--------------------------------------------------------------------------
# Check the result:

X <- cbind( data,
            weekDay(),
            week=strftime(data$t,format="%W"),
            selected=rep(FALSE,nrow(data)))

X$selected[idx] <- TRUE

(I apologize for the for-loop. But nothing grows inside it, so it is a harmless for-loop.) The result is summarized in X.
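The lookup done by the for-loop above can also be sketched without a loop. This is only an illustration in base R, assuming the dates are sorted in increasing order; the toy vectors dates and wed below are hypothetical and not taken from the answer. findInterval with left.open = TRUE counts the dates strictly before each Wednesday, so adding one gives the first date on or after it.

```r
# Hypothetical toy data: sorted trading dates, Wednesday 1985-05-22 is missing.
dates <- as.Date(c("1985-05-13", "1985-05-14", "1985-05-15", "1985-05-16",
                   "1985-05-17", "1985-05-20", "1985-05-21", "1985-05-23",
                   "1985-05-24"))

# The Wednesdays to look up (one per week).
wed <- as.Date(c("1985-05-15", "1985-05-22"))

# Count the dates strictly before each Wednesday, then add one to get the
# index of the first date that is not earlier than that Wednesday.
idx <- findInterval(as.numeric(wed), as.numeric(dates), left.open = TRUE) + 1L

# If no date on or after a Wednesday exists, the index runs past the end;
# mark such weeks as NA instead of silently picking a wrong row.
idx[idx > length(dates)] <- NA

picked <- dates[idx]
print(picked)
```

Unlike which.max, which returns 1 when no date qualifies, this variant yields NA for a Wednesday that lies after the last available date.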

Answer 2

You can do this with the function below. The comments should explain how it works (please comment if anything is unclear).

WedOrNext <- function(w) {
  # find all the weekdays after Tuesday and before Saturday
  iwd <- .indexwday(w)
  i <- iwd > 2 & iwd < 6

  # determine which row to return
  if (any(i)) {
    # return the first weekday after Tuesday
    w[i][1L]
  } else {
    # return NA if there are no weekdays after Tuesday
    NA
  }
}

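Now you can use split to break your data into weekly chunks, lapply to apply the WedOrNext function to each chunk, and do.call(rbind, ...) to put all the chunks back together. Using the data you provided, the code below demonstrates what happens if there is no data after Tuesday.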

# run on your original object
x <- structure(c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.63,
  207.585, 207.843, 209.723, 212.843), .Dim = c(11L, 1L),
  index = structure(c(484185600, 484272000, 484358400, 484444800, 484531200,
  484790400, 484876800, 484963200, 485049600, 485136000, 485395200),
  tzone = "UTC", tclass = "Date"), class = c("xts", "zoo"),
  .indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC")
# The last week has no data after Tuesday, so it's not represented
do.call(rbind, lapply(split(x, "weeks"), WedOrNext))
#               [,1]
# 1985-05-08 202.683
# 1985-05-15 207.585

And here is how it behaves if there is no data for a Wednesday.

y <- structure(c(200.764, 202.502, 202.683, 204.642, 206.051, 207.702, 207.63,
  207.585, 207.843, 209.723, 212.843, 200, 201, 202), class = c("xts", "zoo"),
  .Dim = c(14L, 1L), index = structure(c(484185600, 484272000, 484358400,
  484444800, 484531200, 484790400, 484876800, 484963200, 485049600, 485136000,
  485395200, 485481600, 485654400, 485740800), tzone = "UTC", tclass = "Date"),
  .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC")
do.call(rbind, lapply(split(y, "weeks"), WedOrNext))
#               [,1]
# 1985-05-08 202.683
# 1985-05-15 207.585
# 1985-05-23 201.000
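Once one observation per week has been selected, the weekly returns follow from consecutive prices. A minimal sketch in base R, assuming simple (arithmetic) returns are wanted; the vector name p is hypothetical, and its values are copied from the output above. Log returns are shown as an alternative.

```r
# Selected weekly prices (Wednesday, or the next available weekday),
# copied from the example output above.
p <- c("1985-05-08" = 202.683,
       "1985-05-15" = 207.585,
       "1985-05-23" = 201.000)

# Simple weekly return: (p_t - p_{t-1}) / p_{t-1}
simple_ret <- diff(p) / head(p, -1)

# Log weekly return: log(p_t) - log(p_{t-1})
log_ret <- diff(log(p))

print(simple_ret)
print(log_ret)
```

Each return is labelled with the date of the later observation, since diff keeps the names of the elements it subtracts from.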

Calculating weekly returns from daily prices
