R中基于Log-Cosh损失函数的线性回归模型

时间:2019-11-28 12:48:50

标签: r machine-learning regression

我已经从理论上了解了损失函数,以及如何在R中基于损失函数建立回归模型。

除Log-Cosh损失函数外,我可以在R编程中基于不同的损失函数应用所有回归模型。

例如,我想在DATA的5倍子集上建立线性回归模型,然后提取系数并按如下方式计算个体和汇总方差。

data = read.csv("train.csv") # "critical_temp" is the dependent variable. 
data_nom_df=as.data.frame(scale(data))#Normalization   
#Cross Validation
set.seed(12345)
k = 5 
folds <- createFolds(data_nom_df$critical_temp, k = k, list = TRUE, returnTrain = TRUE)
## Ordinary Least Square regression

#block A
lm = list()
for (i in 1:k) {
  lm[[i]] = lm(critical_temp~ ., 
               data = data_nom_df[folds[[i]],])
}

#block B
lm_coef = list()
lm_coef_var = list()
for(j in 1:(lm[[1]]$coefficients %>% length())){
  for(i in 1:k){
    lm_coef[[i]] = lm[[i]]$coefficients[j] 
    lm_coef_var[[j]] = lm_coef %>% unlist() %>% var()
  } 
}

#block C
lm_var = unlist(lm_coef_var)
lm_df = cbind(coefficients = lm[[1]]$coefficients %>% names() %>% as.data.frame()
              , variance = lm_var %>% as.data.frame()) 
colnames(lm_df) = c("coefficients", "variance_lm")
lm_df

#block D
lm_var_sum = sum(lm_var)
lm_var_sum

其余回归模型相同。但是,我找不到任何代码或程序包来应用基于R中的Log-Cosh损失函数的回归模型。

您能否指导我找到任何可以帮助我解决此问题的资源。

1 个答案:

答案 0 :(得分:3)

这可以从第一原则开始。另外请注意,如果您遇到数字难题,那么limma软件包中将存在logcosh函数,该函数可以代替log(cosh(.))

f <- function(b) with(cars, sum(log(cosh(dist - b[1] - b[2] * speed))))
fm0 <- lm(dist ~ speed, cars)
res <- optim(coef(fm0), f, method = "BFGS")
str(res)
## List of 5
##  $ par        : Named num [1:2] -12.82 3.47
##   ..- attr(*, "names")= chr [1:2] "(Intercept)" "speed"
##  $ value      : num 532
##  $ counts     : Named int [1:2] 28 10
##   ..- attr(*, "names")= chr [1:2] "function" "gradient"
##  $ convergence: int 0
##  $ message    : NULL

图形

# the black line is the ordinary least squares regression line and 
# the red line is the log cosh regression line
plot(cars)
abline(fm0)
yfit <- res$par[1] + res$par[2] * cars$speed
lines(cars$speed, yfit, col = "red")

screenshot

添加

请注意,优化也可以这样编写,如果您有许多独立变量,这可能会很有用。

fm0 <- lm(dist ~ speed, cars)
X <- model.matrix(fm0)
f <- function(b) with(cars, sum(log(cosh(dist - X %*% b))))
res <- optim(coef(fm0), f, method = "BFGS")
res

给予:

$par
(Intercept)       speed 
 -12.816190    3.469536 

$value
[1] 531.5872

$counts
function gradient 
      28       10 

$convergence
[1] 0

$message
NULL