RStudio没有响应大型线性优化

时间:2018-10-28 00:20:37

标签: r linear-programming lpsolve

我有一个很大的两步优化问题,我试图对此问题进行简化。第一步是选择10个元素,以在某些约束条件下最大化效用。我需要这些集合中的200个,但是由于我要执行的操作的性质,需要生成600个集合,以便可以显示正确的组合。

围绕这些最小优化问题是一个较大的约束,其中每个单独的元素只能在一定范围内使用。第一个优化调整了每个元素的效用,以使每个元素都相对接近边界,但是不可能所有元素都在其边界内。因此,第二步是从600套中选择200套,以便满足每个单独元素的最小/最大使用量。这就是我需要的帮助。

我使用lpSolve制作了一个功能,但是在超过80%的时间里它冻结了RStudio,这变得太麻烦了-我需要改进当前的方法,或者需要全新的方法方法。我不知道lpSolve是否真的是最好的开始。虽然我确实有一个可以最大化的整体设定得分,但我真正关心的是将每个元素都包含在范围之内。我做了一个简化的例子来弄清问题的本质。


我负责用80种不同的水果制作200顿饭。每顿饭使用10个水果,同一水果不能超过1个。我的水果数量有限(老板让我最少使用每种水果,否则它们会变质),因此它们必须在一定范围内。我已经列出了600顿饭(Meals),每顿饭都有自己独特的Health-Score。理想情况下,我希望最大程度地提高“健康评分”,但是最重要的一点是,每个水果的使用次数均正确,否则,一开始就无法进餐。

这是我的代码:1)设置600餐(随机)2)设置每个水果必须使用的最小/最大时间(随机)3)运行线性优化以从600餐中选择200餐,以便单个水果约束得到满足。该程序尝试从600中选择200,但是如果约束不允许,则放松约束(例如,如果求解器第一次不工作,那么我将减少Apple的最小次数可以使用,并增加可以使用的最大次数)。它一次完成一个结果,而不是一次完成。最终,应该放宽约束,以至于600个中的任何200个都可以工作(即,当所有水果minPercent小于0且所有水果maxPercent大于100时),但这并不重要,因为R冻结了。

library(stringr)
library(dplyr)
library(lpSolve)

# Inputs
MealsNeeded <- 200
Buffer <- 3

# Setup the meals (this is the output of another optimizer in my actual program. Considered "Step 1" as I mentioned above)
Meals <- data.frame()
for(i in 1:(MealsNeeded*Buffer)){

  run <- i
  meal <- sample(fruit, 10)
  healthFactor <- round(runif(1, 10, 30), 0) #(Health factor for the entire meal)

  df <- data.frame(Run = run, Fruit = meal, healthFactor = healthFactor, stringsAsFactors = FALSE)

  Meals <- rbind(Meals, df)

}

# The minimum/maximum number of times each fruit must be used across all 200 meals (these would be inputs in my program)
set.seed(11)
fruitDF <- data.frame(Name = fruit, minSelectPct = round(runif(length(fruit), .05, .1)*100, 0), stringsAsFactors = FALSE) %>% 
  mutate(maxSelectPct = round(minSelectPct/2 + runif(length(fruit), .05, .1)*100, 0))

#### Actual Program Start

# Get objective
obj <- Meals %>% 
  distinct(Run, healthFactor) %>% 
  ungroup() %>% 
  select(healthFactor) %>% 
  pull()

# Dummy LU - for each fruit give 1/0 whether or not they were in the meal
dummyLUInd <- data.frame(FruitName = fruitDF$Name, stringsAsFactors = FALSE)
for(i in unique(Meals$Run)){

  selectedFruit <- Meals %>%
    filter(Run == i) %>% 
    select(Fruit) %>% 
    mutate(Indicator = 1)

  dummyLUIndTemp <- fruitDF %>% 
    left_join(selectedFruit, by = c('Name' = 'Fruit')) %>% 
    mutate(Indicator = ifelse(is.na(Indicator), 0, Indicator)) %>% 
    select(Indicator)

  dummyLUInd <- cbind(dummyLUInd, dummyLUIndTemp)
}

## Table create
dummyLUInd <- rbind(dummyLUInd, dummyLUInd)[,-1]
dummyLUInd <- as.data.frame(t(dummyLUInd))
dummyLUInd$Total = 1

## Directions
dirLT <- c(rep('<=', (ncol(dummyLUInd)-1)/2))
dirGT <- c(rep('>=', (ncol(dummyLUInd)-1)/2))
## Multiply percentages by total Meals
MinExp = round(fruitDF$minSelectPct/100 * MealsNeeded - 0.499, 0) 
MaxExp = round(fruitDF$maxSelectPct/100 * MealsNeeded + 0.499, 0)

# Setup constraints like # of tries
CounterMax <- 10000
LPSum = 0
Counter = 0

# Create DF to make it easier to change constraints for each run
MinExpDF <- data.frame(Place = 1:length(MinExp), MinExp = MinExp)
MaxExpDF <- data.frame(Place = 1:length(MaxExp), MaxExp = MaxExp)
cat('\nStarting\n')
Sys.sleep(2)

# Try to get the 200 of 600 Meals that satisfy the constraints for the individual Fruit.
# If the solution doesn't exist, loosen the constraints for each fruit (one at a time) until it does work
while (LPSum == 0 & Counter <= CounterMax) {
  rowUse <- Counter %% length(MaxExp)

  # Knock one of minimum, starting with highest exposure, one at a time
  MinExpDF <- MinExpDF %>%
    mutate(Rank = rank(-MinExp, na.last = FALSE, ties.method = "first"),
           MinExp = ifelse(Rank == rowUse, MinExp - 1, MinExp)
    )
  MinExp <- MinExpDF$MinExp

  # Add one of maximum, starting with highest exposure, one at a time
  MaxExpDF <- MaxExpDF %>%
    mutate(Rank = rank(-MaxExp, na.last = FALSE, ties.method = "first"),
           MaxExp = ifelse(Rank == rowUse, MaxExp + 1, MaxExp))
  MaxExp <- MaxExpDF$MaxExp


  # Solve
  dir <- 'max'
  f.obj <- obj
  f.mat <- t(dummyLUInd)
  f.dir <- c(dirGT, dirLT, '==')
  f.rhs <- c(MinExp, MaxExp, MealsNeeded)
  Sol <- lp(dir, f.obj, f.mat, f.dir, f.rhs, all.bin = T)$solution
  LPSum <- sum(Sol)

  Counter = Counter + 1
  if(Counter %% 10 == 0) cat(Counter, ', ', sep = '')
}

# Get the Run #'s from the lpSolve
if(Counter >= CounterMax){
  cat("Unable to find right exposure, returning all Meals\n")
  MealsSolved <- Meals
} else {
  MealsSolved <- data.frame(Run = unique(Meals$Run))
  MealsSolved$selected <- Sol
  MealsSolved <- MealsSolved[MealsSolved$selected == 1,]
}
# Final Meals
FinalMeals <- Meals %>% 
  filter(Run %in% MealsSolved$Run)

如果您多次运行此代码,最终RStudio将冻结您(至少对我有用,如果对您不有用,我想增加用餐次数)。它是在实际的lp期间发生的,因此,由于它实际上是C代码,因此您实际上无能为力。这是我迷路的地方。

我的一部分认为这并不是lpSolve的问题,因为我并没有真正尝试最大化任何东西(Health-Factor并不是太重要)。我真正的“损失函数”是每个水果高于/低于其最小/最大暴露量的次数,但我想不起来如何设置这样的值。我目前的方法可以奏效,还是需要做一些完全不同的事情?

0 个答案:

没有答案