如何保存每次优化迭代的系数?

时间:2018-09-28 09:20:38

标签: r

此问题与Applying Cost Functions in R

有关

我想知道如何保存为optim的每次迭代生成的系数。 trace=TRUE使我能够打印出每次迭代的系数,但是如何保存它们

示例代码:

set.seed(1)
X <- matrix(rnorm(1000), ncol=10) # some random data
Y <- sample(0:1, 100, replace=TRUE)

# Implement Sigmoid function
sigmoid <- function(z) {
  g <- 1/(1+exp(-z))
  return(g)
}

cost.glm <- function(theta,X) {
  m <- nrow(X)
  g <- sigmoid(X%*%theta)
  (1/m)*sum((-Y*log(g)) - ((1-Y)*log(1-g)))
}

X1 <- cbind(1, X)

df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
            X=X1, control=list(trace=TRUE))

哪个输出:

 Conjugate gradients function minimizer
Method: Fletcher Reeves
tolerance used in gradient test=2.00089e-11
0 1 0.693147
parameters    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000 
   0.00000    0.00000    0.00000    0.00000 
 i> 1 3 0.662066
parameters   -0.01000   -0.01601   -0.06087    0.14891    0.04123    0.03835   -0.01898 
   0.00637    0.02954   -0.01423   -0.07544 
 i> 2 5 0.638548
parameters   -0.02366   -0.03733   -0.13803    0.32782    0.09034    0.08082   -0.03978 
   0.01226    0.07120   -0.02925   -0.16042 
 i> 3 7 0.630501
parameters   -0.03478   -0.05371   -0.19149    0.43890    0.11960    0.10236   -0.04935 
   0.01319    0.10648   -0.03565   -0.20408 
 i> 4 9 0.627570.......

df不包含有关系数的任何信息,而仅显示最终系数和最终成本:

 str(df)
List of 5
 $ par        : num [1:11] -0.0679 -0.1024 -0.2951 0.6162 0.124 ...
 $ value      : num 0.626
 $ counts     : Named int [1:2] 53 28
  ..- attr(*, "names")= chr [1:2] "function" "gradient"
 $ convergence: int 0
 $ message    : NULL

2 个答案:

答案 0 :(得分:5)

## use `capture.output` to get raw output
out <- capture.output(df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
                                  X=X1, control=list(trace=TRUE)))
## lines that contain parameters
start <- grep("parameters", out)
param_line <- outer(seq_len(start[2] - start[1] - 1) - 1, start, "+")
## parameter message
param_msg <- gsub("parameters", "", out[param_line])
## parameter matrix (a row per iteration)
param <- matrix(scan(text = param_msg), ncol = length(df$par), byrow = TRUE)

## inspect output (rounded to 2-digits for compact display)

head(round(param, 2))
#       [,1]  [,2]  [,3] [,4] [,5] [,6]  [,7]  [,8] [,9] [,10] [,11]
# [1,]  0.00  0.00  0.00 0.00 0.00 0.00  0.00  0.00 0.00  0.00  0.00
# [2,] -0.01 -0.02 -0.06 0.15 0.04 0.04 -0.02  0.01 0.03 -0.01 -0.08
# [3,] -0.02 -0.04 -0.14 0.33 0.09 0.08 -0.04  0.01 0.07 -0.03 -0.16
# [4,] -0.03 -0.05 -0.19 0.44 0.12 0.10 -0.05  0.01 0.11 -0.04 -0.20
# [5,] -0.04 -0.07 -0.23 0.51 0.14 0.11 -0.05  0.01 0.14 -0.04 -0.22
# [6,] -0.05 -0.08 -0.25 0.55 0.14 0.12 -0.05  0.01 0.16 -0.04 -0.23

tail(round(param, 2))
#[23,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[24,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[25,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[26,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[27,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21
#[28,] -0.07 -0.10 -0.30 0.62 0.12 0.13 -0.03 -0.01 0.21 -0.04 -0.21

## one way to visualize the search steps
matplot(param, type = "l", lty = 1, xlab = "iterations")

答案 1 :(得分:2)

因此,其他解决方案也可以...但是涉及解析跟踪内部的响应。 这是一种使您可以直接访问对象的方法。 (这在不允许您轻松显示文本跟踪的任何其他优化功能中很常见):

(之所以起作用,是因为您可以在函数内部分配环境内部)

编辑:每次运行cost.glm时,不仅在每次评估跟踪时,都会添加另一行。

还将转换添加到其他解决方案使用的矩阵格式。

set.seed(1)
X <- matrix(rnorm(1000), ncol=10) # some random data
Y <- sample(0:1, 100, replace=TRUE)


# Implement Sigmoid function
sigmoid <- function(z) {
  g <- 1/(1+exp(-z))
  return(g)
}

# Create environment to store output
# We could also use .GlobalEnv
params_env <- new.env()

# Initialize parameters object
params_env$optim_run <- list()

cost.glm <- function(theta,X) {
  # Extend the list by 1 and insert theta inside the given environment
  # This can be done more efficiently by
  # extending several at a time, but that's easy to add.
  n <- length(params_env[['optim_run']])
  params_env[['optim_run']][[n + 1]] <- theta

  m <- nrow(X)
  g <- sigmoid(X%*%theta)
  (1/m)*sum((-Y*log(g)) - ((1-Y)*log(1-g)))
}

X1 <- cbind(1, X)

df <- optim(par=rep(0,ncol(X1)), fn = cost.glm, method='CG',
            X=X1, control=list(trace=TRUE))

# View list of all param values
print(params_env$optim_run)

# Return as same format as other solution
param <- do.call(rbind, params_env[['optim_run']])

matplot(param, type = "l", lty = 1, xlab = "iterations")

As we can see, there are more jumps in this set as cost.glm explores the space