Retrieving coefficients from caret's train function in R when using forward regression and/or LARS

Time: 2017-04-08 00:01:50

Tags: r linear-regression r-caret coefficients lars

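I'm working in R and exploring variable selection and weighting with caret, using several methods. Here I'm looking at forward stepwise regression and least angle regression (LARS), tuning each method's own parameter. In the code below I arbitrarily picked a dependent variable (y) and a subset of predictors (the x's), and ran them through the training algorithms on a 70% subset of the data, with repeated 10-fold cross-validation.

What I'm struggling with is finding the command that returns the final model parameters (e.g. intercept, beta weights) produced by the train function. When I call object$finalModel, they aren't readily visible. Is there a way to recover them in R for the methods listed (forward stepwise regression and LARS)? I feel like this has to exist...

Thanks!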

library (caret)
library(AppliedPredictiveModeling)
data(abalone)
str(abalone)

set.seed(18)
# draw a random 70% of the row indices for training
inTrain <- sample(nrow(abalone), round(nrow(abalone) * 0.7))

train_df <- abalone [inTrain,]
test_df <- abalone [-inTrain,]

#predicting Diameter using several of the dataset's variables#
train_df_x <- train_df [,4:8]
test_df_x <- test_df [,4:8]
y_train <- train_df [,3]
y_test <- test_df  [,3]

set.seed(18)
fold.ids <- createMultiFolds(y_train,k=10,times=3)
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 3,
                           returnResamp = "final",
                           index = fold.ids,
                           summaryFunction = defaultSummary,
                           selectionFunction = "oneSE")

### Forward regression ###
library(leaps)
forwardLmGrid <- expand.grid (.nvmax=seq(2,5))
set.seed(18)
F_OLS_fit <- train(train_df_x, y_train,"leapForward",trControl = fitControl,metric="RMSE", tuneGrid=forwardLmGrid)

### LARS ###
larGrid <- expand.grid(.fraction=seq(.01,.99,length=50))
library(lars)
Lar_fit <- train(train_df_x, y_train,"lars",trControl = fitControl,metric="RMSE", tuneGrid=larGrid)

1 answer:

Answer 0 (score: 0)

I'll show you how I did it with an example:

library(data.table)
n <- 1000
x1 <- runif(n,min=-10,max=10)
x2 <- runif(n,min=-10,max=10)
x3 <- runif(n,min=-10,max=10)
x4 <- runif(n,min=-10,max=10)
x5 <- runif(n,min=-10,max=10)
y1 <- 30 + x1 + 4*x2 + x3
synthetic <- data.table(x1=x1,x2=x2,x3=x3,x4=x4,x5=x5,y=y1)
library(caret)
library(lars)
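ctrl <- trainControl(method = "cv", savePredictions = TRUE, number = 3)
# one grid point per predictor, plus 0 (ncol(synthetic) - 1 = number of predictors)
fractionGrid <- expand.grid(fraction = seq(0, 1, 1/(ncol(synthetic) - 1)))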
cvresult <- train(y~.,
                  data=synthetic,
                  method = "lars",
                  trControl = ctrl,
                  metric="RMSE",
                  tuneGrid=fractionGrid,
                  use.Gram=FALSE)
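# With no `s`, predict.lars returns the coefficients at every step of the path
coeffs <- predict.lars(cvresult$finalModel,type="coefficients")
models <- as.data.table(coeffs$coefficients)
# Pick the row whose fraction is closest to the tuned value (safer than `==` on doubles)
winnermodelscoeffs <- models[which.min(abs(coeffs$fraction - cvresult$bestTune$fraction))]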
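
For what it's worth, here is a minimal sketch of a more direct route that covers both methods from the question. It rests on two assumptions: caret's "lars" model predicts with `mode = "fraction"`, so coefficients are requested at the tuned fraction in that same mode, and the `finalModel` of "leapForward" is a `regsubsets` object, so `coef()` with `id` set to the tuned `nvmax` applies. The data are synthetic (with noise added), and the names `lars_fit`, `fwd_fit`, `lars_beta`, `lars_intercept` and `fwd_coef` are illustrative only.

```r
library(caret)
library(lars)
library(leaps)

# Synthetic data (an assumption for illustration): 5 predictors, 3 of them active
set.seed(18)
n <- 300
x <- as.data.frame(matrix(runif(n * 5, -10, 10), ncol = 5,
                          dimnames = list(NULL, paste0("x", 1:5))))
y <- 30 + x$x1 + 4 * x$x2 + x$x3 + rnorm(n)

ctrl <- trainControl(method = "cv", number = 3)

## LARS: ask for the coefficients at the tuned fraction
lars_fit <- train(x, y, method = "lars", trControl = ctrl,
                  tuneGrid = expand.grid(fraction = seq(0.1, 1, by = 0.1)))
fm <- lars_fit$finalModel
lars_beta <- predict(fm, type = "coefficients", mode = "fraction",
                     s = lars_fit$bestTune$fraction)$coefficients
# lars centers x and y internally, so the intercept is rebuilt from the stored means
lars_intercept <- fm$mu - sum(fm$meanx * lars_beta)

## Forward stepwise: coef() on the regsubsets object at the tuned subset size
fwd_fit <- train(x, y, method = "leapForward", trControl = ctrl,
                 tuneGrid = expand.grid(nvmax = 2:5))
fwd_coef <- coef(fwd_fit$finalModel, id = fwd_fit$bestTune$nvmax)

print(c("(Intercept)" = lars_intercept, lars_beta))
print(fwd_coef)
```

A quick sanity check is to rebuild the fitted values by hand from these coefficients and compare them with `predict(lars_fit, x)` and `predict(fwd_fit, x)`. If they agree, the extracted parameters are the ones caret actually uses for prediction.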