Question

I am trying to fit millions of models and then measure their out-of-sample prediction performance (selecting the one with the lowest RMSE).

I initially used lapply() over several lm() models and then predicted the out-of-sample observations with predict.lm(), but that was too slow.

I read in other threads that lm.fit() is faster, and it is. But I am now unsure of the best way to make predictions using the coefficients.

I also tried the speedglm package, but it did not work for me.

Please see the code below.

library(dplyr)   # %>%, select(), mutate()
library(purrr)   # map()
library(tibble)  # tibble()

# findcombo gets every possible combination of variables and places them in a list of character vectors
findcombo <- function(x){
  do.call("c", lapply(seq_along(x), function(i) combn(x, i, FUN = list)))
}
# store the result so that `combos` exists for the tibble() call below
combos <- data %>% select(2:ncol(data)) %>% colnames() %>% findcombo()

modelframe2 <- tibble(combos) #puts list as a column in a tibble

# the dependent variable is labeled `dependent`; the function below creates data frames with NAs omitted, to be used for lm.fit
modelframe2 <- modelframe2 %>%
  mutate(indep = map(combos, function(x) {
    data %>% select(dependent, one_of(x)) %>% na.omit()
  }))

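# reg performs lm.fit on the data frames produced by the function above and returns the vector of coefficients
reg <- function(mat) {
  # column 1 is the response; the remaining columns are predictors, with an intercept column prepended
  # mat[[1]] extracts the response as a plain vector for both data frames and tibbles
  lm.fit(cbind(1, as.matrix(mat[, 2:ncol(mat)])), mat[[1]])$coefficients
}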

#applies reg across each possible combination of variables
modelframe2 <- modelframe2 %>% mutate(models = map(indep, reg))
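
For reference, here is a minimal sketch of one way the coefficients from lm.fit could be used for out-of-sample prediction: build the same design matrix for the held-out rows and multiply it by the coefficient vector. The helper names (`fit_coef`, `predict_coef`, `rmse`) and the toy data are hypothetical and not part of the code above, and the sketch assumes the held-out rows have no NAs in the selected columns.

```r
# Hypothetical helpers, for illustration only (not from the original post).

# Fit on the training rows with lm.fit and return the coefficient vector.
fit_coef <- function(train, vars, response = "dependent") {
  X <- cbind(1, as.matrix(train[, vars, drop = FALSE]))  # prepend intercept column
  lm.fit(X, train[[response]])$coefficients
}

# Predict on new rows: design matrix times coefficients.
predict_coef <- function(coefs, test, vars) {
  X_new <- cbind(1, as.matrix(test[, vars, drop = FALSE]))
  coefs[is.na(coefs)] <- 0  # lm.fit returns NA for aliased columns; treat them as dropped
  drop(X_new %*% coefs)
}

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Toy data, invented purely for this example.
set.seed(1)
toy <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
toy$dependent <- 2 + 3 * toy$x1 - toy$x2 + rnorm(100, sd = 0.1)
train <- toy[1:80, ]
test  <- toy[81:100, ]

vars  <- c("x1", "x2")
coefs <- fit_coef(train, vars)
preds <- predict_coef(coefs, test, vars)
rmse(test$dependent, preds)
```

Because the prediction step is a single matrix product, it avoids the formula and model-frame overhead of predict.lm(), which is where much of the time tends to go when the call is repeated across many models.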

R: lm() and predict.lm() are too slow. Looking for fast prediction using the regression coefficients produced by lm.fit

0 answers: