R: improving sorting speed

Time: 2016-10-10 15:45:04

Tags: r algorithm performance sorting

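I am simulating ~30 correlated random variables, say 500K times. I then sum all the random variables to get the result I am after. I want to find the 99.5% percentile of this sum, which is easy to do. In addition, I want to get all the scenarios adjacent to the 99.5% percentile at the level of the individual random variables. That is, for the 99.499% percentile, for example, I want to know the realization of each random variable. Below is a stylized example of the code that does this. However, it turns out that the `order` function takes quite a long time. Any suggestions on how to minimize the run time of the code below? In this particular example it does not seem to be a problem, but I need to do the above about 500 times.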

library(MASS) #provides mvrnorm()

  #Parameters to run
  n=30
  N_simulations=500000
  Percentile=0.995
  window_size=101

  #Preparing multivariate normal parameters
  cov_matrix<-matrix(0.5,n,n)
  diag(cov_matrix)<-1
  mu<-rep(0,n)


  #running the simulation
  simulation_matrix<-t(mvrnorm(n=N_simulations, mu=mu, Sigma= cov_matrix)) #RVs as rows and simulations as columns
  losses<-colSums(simulation_matrix)
  VaR<-quantile(losses,probs=Percentile)

  #Method 1 - using the order function to get adjacent scenarios
  unsorted_losses<-cbind(seq_len(N_simulations),losses) #column 1: scenario index, column 2: loss
  sorted_losses<-unsorted_losses[order(unsorted_losses[,2],decreasing=FALSE),]

  sorted_scenario<-round(N_simulations*Percentile) #round() guards against a non-integer index from floating-point arithmetic
  window<-seq(-(window_size-1)/2,(window_size-1)/2,1)
  window<-sorted_scenario+window 
  scenarios<-as.vector(sorted_losses[window,1])  # gives all the adjacent scenarios


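  #Method 2 - using partial sort function (does not work properly when duplicate values)
  unsorted_losses<-losses #start again from the plain loss vector; Method 1 left a two-column matrix under this name
  names(unsorted_losses)<-seq_len(N_simulations)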
  z=2*(1-Percentile)*N_simulations
  a<-sort(-unsorted_losses,partial=1:z)
  sorted_losses<- - a[1:z] #highest loss to smallest loss

  u<-names(unsorted_losses[match(sorted_losses,unsorted_losses)]) #getting the rownames in unsorted losses
  names(sorted_losses)<-u

  sorted_scenario<-N_simulations-N_simulations*Percentile
  window<-seq(-(window_size-1)/2,(window_size-1)/2,1)
  window<-sorted_scenario+window #gives the r
  scenarios<-as.numeric(names(sorted_losses[window]))  # gives all the adjacent scenarios from the rownames

# code proposed by Stackoverflow users
  window_size<-1001
  percentile_step<-1/N_simulations
  window<-seq(-(window_size-1)/2,(window_size-1)/2,1)
  percentiles<-window*percentile_step*Percentile+Percentile
  window_losses<-quantile(losses,probs=percentiles,type=3)

  a<-losses[losses %in% window_losses] # returns only 997 TRUE, not 1001!!
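
One possible way to avoid the full `order` call, sketched here under some assumptions rather than offered as a definitive answer: use `sort(..., partial = ...)` only to find the two loss values that bound the window, pick the scenarios between them with `which`, and fully order just that small set. The helper name `adjacent_scenarios` and the `rnorm` stand-in data are hypothetical and not part of the original code. The sketch assumes an odd `window_size` and no `NA` losses; if losses are tied at a window boundary it can return more than `window_size` scenarios.

```r
# Hypothetical helper (not from the original post): returns the scenario
# indices whose losses lie in a window centred on the given percentile,
# ordered from smallest to largest loss.
adjacent_scenarios <- function(losses, percentile, window_size) {
  half   <- as.integer((window_size - 1) / 2)   # assumes window_size is odd
  center <- as.integer(round(length(losses) * percentile))
  lo <- center - half
  hi <- center + half
  stopifnot(lo >= 1L, hi <= length(losses))

  # Partial sort: only positions lo and hi are guaranteed to hold the
  # values they would have after a full sort.
  bounds <- sort(losses, partial = c(lo, hi))[c(lo, hi)]

  # Scenarios whose loss falls between the two boundary order statistics.
  # With tied losses at a boundary this can exceed window_size.
  candidates <- which(losses >= bounds[1] & losses <= bounds[2])

  # Fully order only the small candidate set.
  candidates[order(losses[candidates])]
}

set.seed(42)
losses <- rnorm(100000)   # stand-in for colSums(simulation_matrix)
scenarios <- adjacent_scenarios(losses, percentile = 0.995, window_size = 101)
```

With the matrix layout used in the question (random variables as rows, simulations as columns), `simulation_matrix[, scenarios]` would then give the per-variable realizations for the window. Whether this is actually faster than `order` on the full vector should be checked by timing both on the real data.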

0 answers:

No answers