Question

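In a fairly large data frame, I have to pick random rows and apply a function to them. In my example, the first function I use is the variance, followed by a function close to the real one I use in my script, called f below. I won't go into the purpose of f, but it involves a truncated Gaussian distribution and maximum likelihood estimation.

My problem is that my code is far too slow for the second function, and I suspect that some optimization of the for loop or of the sample function could help.

Here is the code: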

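# Example data: 2 million rows, V1 uniform on [0, 1], V2 a group label from 1 to 10
df <- as.data.frame(matrix(0,2e+6,2))
df$V1 <- runif(nrow(df),0,1)
df$V2 <- sample(c(1:10),nrow(df), replace=TRUE)

nb.perm <- 100 # number of permutations
res <- c()
# Each iteration shuffles V1 and computes the variance within each V2 group
for(i in 1:nb.perm) res <- rbind(res,tapply(df[sample(1:nrow(df)),"V1"],df$V2,var))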

library(truncnorm)
f <- function(d) # d is a vector
{
  # Negative log-likelihood of a normal truncated to [0, 1]; x = c(mean, sd)
  f2 <- function(x) -sum(log(dtruncnorm(d, a=0, b=1, mean = x[1], sd = x[2])))
  res <- optim(par=c(mean(d),sd(d)),fn=f2)
  if(res$convergence!=0) warning("Optimization has not converged")
  return(list(res1=res$par[1],res2=res$par[2]^2))
}

res2 <- c()
for(i in 1:nb.perm) res2 <- rbind(res2,tapply(df[sample(1:nrow(df)),"V1"],df$V2,function(x) f(x)$res2))
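
For reference, one part of the loops above that can be tightened without changing the permutation scheme is the bookkeeping around each iteration: growing the result with rbind, indexing the data frame, and re-deriving the groups from df$V2 every time. The following is only a minimal sketch under some assumptions: it uses a much smaller data frame so that it runs quickly, it uses var as the statistic, and the helper name perm_stat and the variable grp are hypothetical names introduced for illustration. It does not reduce the cost of optim inside f, which may well dominate for the second function.

```r
# Smaller data than in the question so the sketch runs quickly (assumption).
set.seed(1)
n <- 1e4
df <- data.frame(V1 = runif(n, 0, 1),
                 V2 = sample(1:10, n, replace = TRUE))

nb.perm <- 20  # number of permutations (reduced for illustration)

# Hypothetical helper: shuffle the values once, then apply `fun`
# within each group. Works on plain vectors, not on the data frame.
perm_stat <- function(values, groups, fun) {
  tapply(values[sample.int(length(values))], groups, fun)
}

# Build the grouping factor once instead of on every iteration.
grp <- factor(df$V2)

# Preallocate one row per permutation instead of growing with rbind().
res <- matrix(NA_real_, nrow = nb.perm, ncol = nlevels(grp),
              dimnames = list(NULL, levels(grp)))
for (i in seq_len(nb.perm)) {
  res[i, ] <- perm_stat(df$V1, grp, var)
}
```

The same helper could be called with function(x) f(x)$res2 in place of var; whether that gives a noticeable speed-up would need to be measured, for example with system.time.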

I hope I have been clear enough.

Speeding up the sample function in R

0 answers: