我有一个很大的两步优化问题,我试图对此问题进行简化。第一步是选择10个元素,以在某些约束条件下最大化效用。我需要这些集合中的200个,但是由于我要执行的操作的性质,需要生成600个集合,以便可以显示正确的组合。
围绕这些最小优化问题是一个较大的约束,其中每个单独的元素只能在一定范围内使用。第一个优化调整了每个元素的效用,以使每个元素都相对接近边界,但是不可能所有元素都在其边界内。因此,第二步是从600套中选择200套,以便满足每个单独元素的最小/最大使用量。这就是我需要的帮助。
我使用lpSolve
制作了一个功能,但是在超过80%的时间里它冻结了RStudio,这变得太麻烦了-我需要改进当前的方法,或者需要全新的方法方法。我不知道lpSolve
是否真的是最好的开始。虽然我确实有一个可以最大化的整体设定得分,但我真正关心的是将每个元素都包含在范围之内。我做了一个简化的例子来弄清问题的本质。
我负责用80种不同的水果制作200顿饭。每顿饭使用10个水果,同一水果不能超过1个。我的水果数量有限(老板让我最少使用每种水果,否则它们会变质),因此它们必须在一定范围内。我已经列出了600顿饭(Meals
),每顿饭都有自己独特的Health-Score。理想情况下,我希望最大程度地提高“健康评分”,但是最重要的一点是,每个水果的使用次数均正确,否则,一开始就无法进餐。
这是我的代码:1)设置600餐(随机)2)设置每个水果必须使用的最小/最大时间(随机)3)运行线性优化以从600餐中选择200餐,以便单个水果约束得到满足。该程序尝试从600中选择200,但是如果约束不允许,则放松约束(例如,如果求解器第一次不工作,那么我将减少Apple的最小次数可以使用,并增加可以使用的最大次数)。它一次完成一个结果,而不是一次完成。最终,应该放宽约束,以至于600个中的任何200个都可以工作(即,当所有水果minPercent小于0且所有水果maxPercent大于100时),但这并不重要,因为R冻结了。
library(stringr)
library(dplyr)
library(lpSolve)
# Inputs
MealsNeeded <- 200
Buffer <- 3
# Setup the meals (this is the output of another optimizer in my actual program. Considered "Step 1" as I mentioned above)
Meals <- data.frame()
for(i in 1:(MealsNeeded*Buffer)){
run <- i
meal <- sample(fruit, 10)
healthFactor <- round(runif(1, 10, 30), 0) #(Health factor for the entire meal)
df <- data.frame(Run = run, Fruit = meal, healthFactor = healthFactor, stringsAsFactors = FALSE)
Meals <- rbind(Meals, df)
}
# The minimum/maximum number of times each fruit must be used across all 200 meals (these would be inputs in my program)
set.seed(11)
fruitDF <- data.frame(Name = fruit, minSelectPct = round(runif(length(fruit), .05, .1)*100, 0), stringsAsFactors = FALSE) %>%
mutate(maxSelectPct = round(minSelectPct/2 + runif(length(fruit), .05, .1)*100, 0))
#### Actual Program Start
# Get objective
obj <- Meals %>%
distinct(Run, healthFactor) %>%
ungroup() %>%
select(healthFactor) %>%
pull()
# Dummy LU - for each fruit give 1/0 whether or not they were in the meal
dummyLUInd <- data.frame(FruitName = fruitDF$Name, stringsAsFactors = FALSE)
for(i in unique(Meals$Run)){
selectedFruit <- Meals %>%
filter(Run == i) %>%
select(Fruit) %>%
mutate(Indicator = 1)
dummyLUIndTemp <- fruitDF %>%
left_join(selectedFruit, by = c('Name' = 'Fruit')) %>%
mutate(Indicator = ifelse(is.na(Indicator), 0, Indicator)) %>%
select(Indicator)
dummyLUInd <- cbind(dummyLUInd, dummyLUIndTemp)
}
## Table create
dummyLUInd <- rbind(dummyLUInd, dummyLUInd)[,-1]
dummyLUInd <- as.data.frame(t(dummyLUInd))
dummyLUInd$Total = 1
## Directions
dirLT <- c(rep('<=', (ncol(dummyLUInd)-1)/2))
dirGT <- c(rep('>=', (ncol(dummyLUInd)-1)/2))
## Multiply percentages by total Meals
MinExp = round(fruitDF$minSelectPct/100 * MealsNeeded - 0.499, 0)
MaxExp = round(fruitDF$maxSelectPct/100 * MealsNeeded + 0.499, 0)
# Setup constraints like # of tries
CounterMax <- 10000
LPSum = 0
Counter = 0
# Create DF to make it easier to change constraints for each run
MinExpDF <- data.frame(Place = 1:length(MinExp), MinExp = MinExp)
MaxExpDF <- data.frame(Place = 1:length(MaxExp), MaxExp = MaxExp)
cat('\nStarting\n')
Sys.sleep(2)
# Try to get the 200 of 600 Meals that satisfy the constraints for the individual Fruit.
# If the solution doesn't exist, loosen the constraints for each fruit (one at a time) until it does work
while (LPSum == 0 & Counter <= CounterMax) {
rowUse <- Counter %% length(MaxExp)
# Knock one of minimum, starting with highest exposure, one at a time
MinExpDF <- MinExpDF %>%
mutate(Rank = rank(-MinExp, na.last = FALSE, ties.method = "first"),
MinExp = ifelse(Rank == rowUse, MinExp - 1, MinExp)
)
MinExp <- MinExpDF$MinExp
# Add one of maximum, starting with highest exposure, one at a time
MaxExpDF <- MaxExpDF %>%
mutate(Rank = rank(-MaxExp, na.last = FALSE, ties.method = "first"),
MaxExp = ifelse(Rank == rowUse, MaxExp + 1, MaxExp))
MaxExp <- MaxExpDF$MaxExp
# Solve
dir <- 'max'
f.obj <- obj
f.mat <- t(dummyLUInd)
f.dir <- c(dirGT, dirLT, '==')
f.rhs <- c(MinExp, MaxExp, MealsNeeded)
Sol <- lp(dir, f.obj, f.mat, f.dir, f.rhs, all.bin = T)$solution
LPSum <- sum(Sol)
Counter = Counter + 1
if(Counter %% 10 == 0) cat(Counter, ', ', sep = '')
}
# Get the Run #'s from the lpSolve
if(Counter >= CounterMax){
cat("Unable to find right exposure, returning all Meals\n")
MealsSolved <- Meals
} else {
MealsSolved <- data.frame(Run = unique(Meals$Run))
MealsSolved$selected <- Sol
MealsSolved <- MealsSolved[MealsSolved$selected == 1,]
}
# Final Meals
FinalMeals <- Meals %>%
filter(Run %in% MealsSolved$Run)
如果您多次运行此代码,最终RStudio将冻结您(至少对我有用,如果对您不有用,我想增加用餐次数)。它是在实际的lp
期间发生的,因此,由于它实际上是C代码,因此您实际上无能为力。这是我迷路的地方。
我的一部分认为这并不是lpSolve
的问题,因为我并没有真正尝试最大化任何东西(Health-Factor并不是太重要)。我真正的“损失函数”是每个水果高于/低于其最小/最大暴露量的次数,但我想不起来如何设置这样的值。我目前的方法可以奏效,还是需要做一些完全不同的事情?