I am having some trouble setting up an optimization procedure in R. My dataset looks like this:
set.seed(123)
library(lpSolve)
num_data <- 1000
bal_max <- .2/100
ind_max <- 10.5/100
data <- data.frame(id = 1:num_data,
                   balance = pmax(0, runif(num_data, 0, 1000)),
                   industry = rep(1:10, num_data / 10))
data$risk <- pmax(0, data$balance + rnorm(num_data,100,10))
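As you can see, there are 1000 ids and 10 different industries. The goal is to maximize the sum of the column "risk", while making sure that the share of each loan and of each industry is no higher than 0.2% (bal_max) and 10.5% (ind_max) respectively.
In the current dataset these conditions are not met: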
max(data$balance) / sum(data$balance)
#[1] 0.002009751
industry <- aggregate(balance ~ industry, FUN=sum,data=data)
max(industry$balance) / sum(industry$balance)
#[1] 0.1093997
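Since the same two checks are repeated after the optimization, they can be wrapped in a small function. This is only a sketch: `cap_shares` is a hypothetical helper that is not part of my original code, shown here on a toy vector instead of the real data.

```r
# Hypothetical helper (not in the original code): largest single-loan share
# and largest industry share of the total balance.
cap_shares <- function(balance, industry) {
  total <- sum(balance)
  c(loan = max(balance) / total,
    industry = max(tapply(balance, industry, sum)) / total)
}

# toy example: 4 loans in 2 industries
shares <- cap_shares(c(10, 10, 10, 70), c(1, 1, 2, 2))
shares
```

On the real data the call would be `cap_shares(data$balance, data$industry)`, and the two values can then be compared against bal_max and ind_max.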
So both conditions need to be satisfied while the risk column is maximized. The rest of my code is as follows:
# set up linear prog problem
num_x <- nrow(data)
num_ind <- length(unique(data$industry))
# define quantity to be maximized
objective.in <- data$risk
# construct right-hand-side of constraint vector
# - sum of balances = 1
# - each balance <= bal_max
# - sum of balances for each industry <= ind_max
# - lp solver function imposes constraint that each balance >= 0
const.rhs <- c( 1, rep(bal_max, num_x), rep(ind_max, num_ind))
# construct constraint matrix for same constraints
mat_ind <- matrix(0,nrow=num_ind, ncol=num_x)
for( i in 1:num_ind) mat_ind[i,which(data$industry == i)] <- 1
const.mat <- rbind( matrix(1, nrow=1,ncol=num_x), diag(num_x), mat_ind )
# define directions for each constraint equation
const.dir <- c("=", rep("<=",num_x), rep("<=", num_ind))
# find balances for max risk
max_risk <- lp(direction="max", objective.in=objective.in, const.mat=const.mat,
               const.dir=const.dir, const.rhs=const.rhs)
max_risk
# add data balances with optimum solution
data$balance <- max_risk$solution
# each balance should be smaller than bal_max
max(data$balance)
# industry should be smaller than 10.5% each
industry <- aggregate(balance ~ industry, FUN=sum,data=data)
industry
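As you can see, no industry and no individual id exceeds its boundary any more (10.5% and 0.2%). The problem is that this code fills in either 0.2% or 0% for every loan id (so that the sum is 1). However, the absolute value of the initial balance should not go up either. In this example the original balances often go up (they are filled in at 0.2%).
In short, I want to optimize the column "risk" such that, after the optimization, each individual id is capped at 0.2% of the total balance and each industry is capped at 10.5% of the total balance. The sum of all fractions should be 1, and the absolute value of a balance may not increase.
The idea is to reduce balances so that all conditions are met and "risk" is optimized.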
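To make the requirement concrete, here is a minimal sketch of one way it could be written as a linear program. It rests on assumptions that are not in my code above: the decision variables are the new absolute balances x_i rather than fractions, "risk" is treated as a per-unit weight so the objective is sum(risk * x), and `reduce_balances` is a hypothetical helper name. Because the caps are written relative to sum(x), they stay linear, and each x_i is bounded above by its original balance.

```r
library(lpSolve)

# Hypothetical helper, not from the original code.
# Assumption: the variables are the new absolute balances x_i, and "risk" is a
# per-unit weight, so the objective is sum(risk * x).
reduce_balances <- function(balance, industry, risk, bal_max, ind_max) {
  n <- length(balance)
  inds <- sort(unique(industry))
  # loan cap: x_i - bal_max * sum(x) <= 0
  loan_cap <- diag(n) - bal_max
  # industry cap: sum(x in industry k) - ind_max * sum(x) <= 0
  ind_cap <- outer(inds, industry, "==") * 1 - ind_max
  # no increase: x_i <= original balance (x_i >= 0 is imposed by lp itself)
  const.mat <- rbind(loan_cap, ind_cap, diag(n))
  const.rhs <- c(rep(0, n + length(inds)), balance)
  const.dir <- rep("<=", nrow(const.mat))
  sol <- lp(direction = "max", objective.in = risk, const.mat = const.mat,
            const.dir = const.dir, const.rhs = const.rhs)
  stopifnot(sol$status == 0)  # 0 means lp() reported success
  sol$solution
}

# toy data: one loan holds 70% of the total balance
toy_balance <- c(10, 10, 10, 70)
toy_industry <- c(1, 1, 2, 2)
toy_risk <- toy_balance + 100
new_balance <- reduce_balances(toy_balance, toy_industry, toy_risk,
                               bal_max = 0.4, ind_max = 0.6)
new_balance / sum(new_balance)  # shares after the reduction
```

Dividing the result by its sum gives fractions that add up to 1. On the data above the call would be `reduce_balances(data$balance, data$industry, data$risk, bal_max, ind_max)`, using the original balances, i.e. before `data$balance` is overwritten with `max_risk$solution`.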