Question

Does anyone know how I can remove the intercept from my model while performing cross-validation? I have already tried using "-1" or "+0" in my formula, but neither has any effect. As you can see from the output of my script, my R2 is too low, and that is probably because of the intercept. In my case I am allowed to remove it, since I am modeling log removal of pathogens: if there is no input (on my x axis), there cannot be an output (on my y axis). Thank you in advance for any help!

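## Repeated K-fold cross-validation approach ##
library(readxl)  # provides read_excel()
library(caret)   # provides trainControl() and train()
Data_1 <- read_excel("Data.xlsx", sheet = "3_factors_FWS")
Model_1 <- lm(Y1 ~ X1 + X2 + X3, data = Data_1)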

## Define training control

set.seed(123)
train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

## Train the model

Data_1_CV <- train(Y1 ~ X1 + X2 + X3, data = Data_1, method = "lm", trControl = train.control)

## Summarize the results

print(Data_1_CV)

> Linear Regression 
107 samples
  3 predictor
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times) 
Summary of sample sizes: 96, 96, 95, 95, 95, 97, ... 
Resampling results:

  RMSE       Rsquared   MAE     
  0.9669935  0.3805181  0.769322

Tuning parameter 'intercept' was held constant at a value of TRUE

summary(Data_1_CV)

> Call:
lm(formula = .outcome ~ ., data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.0995 -0.6317 -0.1058  0.5785  2.4192 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.28819    0.26532   1.086  0.27992    
X1           0.24990    0.04306   5.803 7.24e-08 ***
X2          -0.26891    0.07976  -3.371  0.00105 ** 
X3           0.16527    0.06283   2.630  0.00984 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9694 on 103 degrees of freedom
Multiple R-squared:  0.3525,    Adjusted R-squared:  0.3337 
F-statistic: 18.69 on 3 and 103 DF,  p-value: 9.334e-10
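
The line "Tuning parameter 'intercept' was held constant at a value of TRUE" in the output above suggests one thing to try: caret's "lm" method treats the intercept as a tuning parameter, so it can probably be switched off through tuneGrid rather than through the formula. That would also explain why "-1" and "+0" appear to do nothing, since the final model is refit as .outcome ~ . (see the Call: line). The sketch below is a minimal, unverified-on-your-data example; Data_sim is a simulated, hypothetical stand-in for Data_1 (the real Data.xlsx is not available here), and fit_no_int is just an illustrative name.

```r
library(caret)

# Hypothetical stand-in for Data_1: simulated values, NOT the real data
set.seed(123)
n <- 107
Data_sim <- data.frame(X1 = runif(n, 0, 10),
                       X2 = runif(n, 0, 5),
                       X3 = runif(n, 0, 5))
Data_sim$Y1 <- 0.25 * Data_sim$X1 - 0.27 * Data_sim$X2 +
  0.17 * Data_sim$X3 + rnorm(n)

# Same resampling scheme as in the question
train.control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# Pass intercept = FALSE via tuneGrid instead of editing the formula
fit_no_int <- train(Y1 ~ X1 + X2 + X3, data = Data_sim, method = "lm",
                    trControl = train.control,
                    tuneGrid = expand.grid(intercept = FALSE))

print(fit_no_int)                 # last line should now report intercept = FALSE
coef(fit_no_int$finalModel)       # coefficients for X1, X2, X3 only
```

With the real data, replacing Data_sim by Data_1 should be the only change needed. Two caveats: the Rsquared that caret reports is the squared correlation between held-out predictions and observations, so dropping the intercept will not necessarily raise it; and the "Multiple R-squared" printed by summary() for a no-intercept lm is computed against an uncentered total sum of squares, so it is not directly comparable with the 0.3525 shown above.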

How to exclude the intercept from cross-validation
