tseries - 阻止bootstrap两个系列相同的重采样顺序

时间:2017-12-28 14:16:03

标签: r time-series bootstrapping

例如

    require(tseries)
    series1 <- c(100,140,150,200,150,260,267,280,300,350)
    series2 <- c(500,600,250,300,350,500,100,130,50,60)
    data <- data.frame("series1" = series1, "series2" = series2)
ts  = tsbootstrap(data$series1, m=1, b=2, type="block", nb=10)
ts <- as.data.frame(ts)
head(ts)

> head(ts)
   V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
1 280 280 150 200 100 300 150 140 100 260
2 300 300 200 150 140 350 260 150 140 267
3 140 260 140 260 267 200 150 150 260 300
4 150 267 150 267 280 150 200 200 267 350
5 260 100 260 150 300 100 150 267 100 200
6 267 140 267 200 350 140 260 280 140 150

我们现在有两个块,并以不同的顺序拼凑在一起。我的问题是,如何通过块boostrap'重新'洗牌'series1和series2,同时保持两个系列的块在同一顺序?

例如..如果我们将块设置为2,则抓取2个块,让它说出其中5,6的位置。它抓住元素5,6并将其移动到位置1,2 ...这是针对series1 ,对于系列2,它抓住元素5,6并移动到位置1,2。这样我保持两个系列的顺序,这可能吗?

到目前为止,我尝试合并series1和series2来创建一个新列。这样当使用bootstrap时它会将两个系列移动到相同的位置:

    data <- transform(data, ts.merge=paste(series1, series2, sep=","))
head(data)
  series1 series2 ts.merge
1     100     500  100,500
2     140     600  140,600
3     150     250  150,250
4     200     300  200,300
5     150     350  150,350
6     260     500  260,500

但是,分隔符与tseries不兼容......

Error in FUN(newX[, i], ...) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning messages:
1: In as.vector(x, mode = "double") : NAs introduced by coercion
2: In as.vector(x, mode = "double") : NAs introduced by coercion

我也尝试使用分隔符“”但是,不知道我之后如何区分两个数值以便将它们分开(注意我的真实例子不仅仅是如上所示的三位数值,否则我可以将它们分成两半事后)

3 个答案:

答案 0 :(得分:0)

如果目标是使series1和series2行保持同步,则可以在创建&#39;数据时添加索引。如下:

data <- data.frame("series1" = series1, "series2" = series2, index = 
seq(1:length(series1))) 

然后将数据字段更改为bootstrap to&#39; index&#39;如下:

ts  = tsbootstrap(data$index, m=1, b=2, type="block", nb=10)

答案 1 :(得分:0)

尝试:

ts.index    = tsbootstrap(index(series1), m=1, b=2, type="block", nb=10)
series1[ts.index[,1]]
series2[ts.index[,1]]

接下来,您可以根据需要管理最终的数据框。

答案 2 :(得分:0)

整天采访我,但这是一个手动解决方案,每行将重新采样:

    # Random Data
    data=matrix(rnorm(20*100), ncol = 2)
    data=as.data.frame(data)
    # Set block size
    reps <- NROW(data)/5 # Set group number
    data$id <- rep(1:reps,each=5) # each = 5 corresponds to number of blocks to bootstrap by (5 in this case)
    # Id data
    IDs<-unique(data$id)
    runs <- 1:1000
    temp <- list()
    # Function for bootstrap 1x data frame
    # subsets data by id number
    # Resamples the subsets
    bootSTRAP = function(x){
      for (i in 1:length(IDs)){ 
        temp[i] <- list(data[data$id==IDs[i],])
      }
      out <- sample(temp,replace=TRUE)
      df <- do.call(rbind, out)
    }

    # Loop for running it a 1000 times
    runs <- 1:1000
    run.output <- list()
    i=1
    for (i in 1:length(runs)){    # Length of optimization
      tryCatch({
        temp.1 <- bootSTRAP(runs[i])
        #cum_ret <- rbind.data.frame(cum_ret, temp)
        run.output[[i]] <- cbind.data.frame(temp.1)
        ptm0 <- proc.time()
        Sys.sleep(0.1)  
        ptm1=proc.time() - ptm0
        time=as.numeric(ptm1[3])
        cat('\n','Iteration',i,'took', time, "seconds to complete")
      }, error = function(e) { print(paste("i =", i, "failed:")) })
    }

# cbind outputs
master <- do.call(cbind, run.output)
# Rename columns 
col.ids <- rep(1:1000,each=3)
cnames   <- paste(col.ids)
colnames(master) <- cnames