Iteratively computing combinations of a variable number of elements

Time: 2016-09-25 21:26:41

Tags: algorithm dynamic-programming

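I am writing a program that uses dynamic programming to decide how to distribute some files (movies) across DVDs so that it uses the smallest number of DVDs.

After a lot of thought, I decided that a good approach would be to look at every possible combination of movies that is smaller than 4.38 GB (the actual size of a DVD), pick the largest one (i.e. the one that wastes the least space), remove the movies in that most efficient combination from the list, and repeat until I run out of movies.

The problem is that I don't know how to write the loop so that I can find every possible combination. Because the movies differ in size, the number of movies per combination varies, so a fixed number of nested loops cannot be used.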

Pseudocode:

Some kind of loop:
    best_combination=[]
    best_combination_size=0
    if current_combination_size<4.38 and current_combination_size>best_combination_size:
        best_combination=current_combination
        best_combination_size=current_combination_size
print(best_combination)
delete best_combination from list_of_movies
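
A minimal sketch of the loop described above, assuming Python and `itertools.combinations`, which enumerates subsets of every size without nested loops. The function names `best_combination` and `pack_by_best_subsets` are made up for this illustration, sizes are assumed to be positive numbers, and the search is brute force (it visits all 2^n subsets), so it is only practical for a small number of movies.

```python
from itertools import combinations

def best_combination(sizes, capacity):
    """Return the indices of the subset with the largest total <= capacity."""
    best, best_size = (), 0
    for r in range(1, len(sizes) + 1):               # subsets of 1, 2, ..., n movies
        for combo in combinations(range(len(sizes)), r):
            total = sum(sizes[i] for i in combo)
            if best_size < total <= capacity:        # fits and wastes less space
                best, best_size = combo, total
    return best

def pack_by_best_subsets(sizes, capacity):
    """Repeatedly fill one DVD with the best-fitting subset of the remaining movies."""
    remaining = list(sizes)
    dvds = []
    while remaining:
        chosen = best_combination(remaining, capacity)
        if not chosen:
            raise ValueError("a movie is larger than the DVD capacity")
        dvds.append([remaining[i] for i in chosen])
        remaining = [s for i, s in enumerate(remaining) if i not in chosen]
    return dvds

# Sizes in MB, DVD capacity of 4380 MB (4.38 GB)
print(pack_by_best_subsets([2000, 1500, 1000, 3000, 800], 4380))
```

Filling each DVD as tightly as possible in turn is itself a greedy strategy, so this does not necessarily give the minimum number of DVDs.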

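This is my first time posting a question, so go easy on me! Thanks in advance.

P.S. I figured out a way to do this using Dijkstra's algorithm, which I think would be fast but not memory-friendly. If anyone is interested, I would be happy to discuss it.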

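2 Answers:

Answer 0: (score: 1)

You should stick to common bin-packing heuristics. The Wikipedia article gives a good overview of the approaches, including links to exact approaches tailored to the problem. But always keep in mind: this is an NP-complete problem!

I will show you some examples that support my hint that you should stick to heuristics.

The following Python code:

  • creates parameterized random problems (normal distributions with several means/standard deviations; acceptance sampling to make sure that no movie is larger than a DVD)
  • uses some random binpacking library that implements some greedy heuristic (I had not tried or tested this lib before, so no guarantees! I have no idea which heuristic it uses)
  • uses a naive mixed-integer programming implementation (solved by a commercial solver; open-source solvers such as cbc struggle, but might still be usable for good approximate solutions)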

Code

import numpy as np
from cvxpy import *  # written against the pre-1.0 cvxpy API (Bool, sum_entries, mul_elemwise)
from time import time

""" Generate some test-data """
np.random.seed(1)
N = 150  # movies
means = [700, 1400, 4300]
stds = [100, 300, 500]
DVD_SIZE = 4400

movies = []
for movie in range(N):
    while True:
        random_mean_index = np.random.randint(low=0, high=len(means))
        random_size = np.random.randn() * stds[random_mean_index] + means[random_mean_index]
        if random_size <= DVD_SIZE:
            movies.append(random_size)
            break

""" HEURISTIC SOLUTION """
import binpacking
start = time()
bins = binpacking.to_constant_volume(movies, DVD_SIZE)
end = time()
print('Heuristic solution: ')
for b in bins:
    print(b)
print('used bins: ', len(bins))
print('used time (seconds): ', end-start)

""" Preprocessing """
movie_sizes_sorted = sorted(movies)
max_movies_per_dvd = 0
occupied = 0
for i in range(N):
    if occupied + movie_sizes_sorted[i] <= DVD_SIZE:
        max_movies_per_dvd += 1
        occupied += movie_sizes_sorted[i]
    else:
        break

""" Solve problem """
# Variables
X = Bool(N, N)  # X[movie, dvd] = 1 if movie is placed on dvd (at most N DVDs are needed)
I = Bool(N)     # indicator: I[dvd] = 1 if dvd is used

# Constraints
constraints = []
# (1) DVDs not overfilled
for dvd in range(N):
    constraints.append(sum_entries(mul_elemwise(movies, X[:, dvd])) <= DVD_SIZE)
# (2) All movies distributed exactly once
for movie in range(N):
    constraints.append(sum_entries(X[movie, :]) == 1)
# (3) Indicators
for dvd in range(N):
    constraints.append(sum_entries(X[:, dvd]) <= I[dvd] * (max_movies_per_dvd + 1))

# Objective
objective = Minimize(sum_entries(I))

# Problem
problem = Problem(objective, constraints)
start = time()
problem.solve(solver=GUROBI, MIPFocus=1, verbose=True)
#problem.solve(solver=CBC, CliqueCuts=True)#, GomoryCuts=True, KnapsackCuts=True, verbose=True)#, GomoryCuts=True, MIRCuts=True, ProbingCuts=True,
              #CliqueCuts=True, FlowCoverCuts=True, LiftProjectCuts=True,
              #verbose=True)
end = time()

""" Print solution """
for dvd in range(N):
    movies_ = []
    for movie in range(N):
        if np.isclose(X.value[movie, dvd], 1):
            movies_.append(movies[movie])
    if movies_:
        print('DVD')
        for movie in movies_:
            print('     movie with size: ', movie)

print('Distributed ', N, ' movies to ', int(objective.value), ' dvds')
print('Optimization took (seconds): ', end-start)
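
For reference, the model built by the code above can be written out as follows. The symbols are notation introduced here, not taken from the code: s_i stands for `movies[i]`, C for `DVD_SIZE`, x_{ij} for `X[i, j]`, y_j for `I[j]`, and M for `max_movies_per_dvd`.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{j=1}^{N} y_j \\
\text{s.t.}\quad
  & \sum_{i=1}^{N} s_i\, x_{ij} \le C, && j = 1,\dots,N && \text{(1) DVDs not overfilled} \\
  & \sum_{j=1}^{N} x_{ij} = 1, && i = 1,\dots,N && \text{(2) each movie placed exactly once} \\
  & \sum_{i=1}^{N} x_{ij} \le (M+1)\, y_j, && j = 1,\dots,N && \text{(3) indicators} \\
  & x_{ij} \in \{0,1\}, \quad y_j \in \{0,1\}
\end{aligned}
```

Constraint (3) forces y_j = 1 as soon as any movie is placed on DVD j; the factor M + 1 is just a bound large enough that it never restricts how many movies a used DVD can hold.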

Partial output

Heuristic solution:
-------------------
('used bins: ', 60)
('used time (seconds): ', 0.0045168399810791016)

MIP-approach:
-------------
Root relaxation: objective 2.142857e+01, 1921 iterations, 0.10 seconds

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0   21.42857    0  120  106.00000   21.42857  79.8%     -    0s
H    0     0                      68.0000000   21.42857  68.5%     -    0s
H    0     0                      63.0000000   21.42857  66.0%     -    0s
     0     0   21.42857    0  250   63.00000   21.42857  66.0%     -    1s
H    0     0                      62.0000000   21.42857  65.4%     -    2s
     0     0   21.42857    0  256   62.00000   21.42857  65.4%     -    2s
     0     0   21.42857    0  304   62.00000   21.42857  65.4%     -    2s
     0     0   21.42857    0  109   62.00000   21.42857  65.4%     -    3s
     0     2   21.42857    0  108   62.00000   21.42857  65.4%     -    4s
    40     2   27.61568   20   93   62.00000   27.61568  55.5%   110    5s
H  156    10                      61.0000000   58.00000  4.92%  55.3    8s
   262     4   59.00000   84   61   61.00000   59.00000  3.28%  44.2   10s
   413    81 infeasible  110        61.00000   59.00000  3.28%  37.2   15s
H  417    78                      60.0000000   59.00000  1.67%  36.9   15s
  1834  1212   59.00000  232   40   60.00000   59.00000  1.67%  25.7   20s
...
...
 57011 44660 infeasible  520        60.00000   59.00000  1.67%  27.1  456s
 57337 44972   59.00000  527   34   60.00000   59.00000  1.67%  27.1  460s
 58445 45817   59.00000   80   94   60.00000   59.00000  1.67%  26.9  466s
 59387 46592   59.00000  340   65   60.00000   59.00000  1.67%  26.8  472s

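Analysis

Some observations about the example above:

  • the heuristic obtains a solution of value 60 instantly
  • the commercial solver needs more time, but also finds a solution of value 60 (after 15 seconds)
    • it then keeps trying to find a better solution or to prove that none exists (MIP solvers are complete = they find the optimal solution or prove that there is none, given infinite time!)
    • no progress for quite some time!
    • but: we did get a proof that the best possible solution uses at least 59 DVDs
    • = maybe you could save one DVD by solving the problem optimally; but such a solution is hard to find, and we don't (yet) know whether it exists at all!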
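
As a worked example, the Gap column in the log is consistent with the relative difference between the incumbent (best solution found so far) and the best proven bound. For the last lines of the log this gives:

```latex
\text{gap}
  = \frac{\lvert \text{incumbent} - \text{best bound} \rvert}{\lvert \text{incumbent} \rvert}
  = \frac{60 - 59}{60} \approx 1.67\%
```

The same formula reproduces the earlier lines, e.g. (61 - 58)/61 ≈ 4.92%. Because the number of DVDs is an integer, a remaining gap of 1.67% here means exactly one DVD is still in question.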

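Remarks

  • the observations above depend heavily on the statistics of the data
  • it is easy to try other problems (maybe smaller ones) where the commercial MIP solver finds a solution that uses one DVD less (e.g. 49 vs. 50)
    • it's not worth it (remember: open-source solvers struggle even more)
    • the formulation is very simple and not tuned at all (don't blame only the solvers)
  • there are exact algorithms (which may be much more complex to implement) that could be appropriate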

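Conclusion

Heuristics are very easy to implement and generally provide very good solutions. Most of them also come with very good theoretical guarantees (e.g. the first-fit-decreasing heuristic uses at most 11/9 · OPT + 1 DVDs, where OPT is the number of DVDs in an optimal solution). Despite being keen on optimization in general, I would probably use the heuristic approach here.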
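
To illustrate how little code such a heuristic needs, here is a minimal sketch of first-fit decreasing in Python. It is not the algorithm used by the binpacking library above (which heuristic that library uses is unknown), and the function name `first_fit_decreasing` is chosen just for this example.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins: largest item first, into the first bin with room."""
    bins = []   # items placed in each bin
    free = []   # remaining capacity of each bin
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError("item does not fit into an empty bin")
        for i, room in enumerate(free):
            if size <= room:            # first bin that still has room
                bins[i].append(size)
                free[i] -= size
                break
        else:                           # no existing bin fits: open a new one
            bins.append([size])
            free.append(capacity - size)
    return bins

movie_sizes = [700, 1400, 4300, 2000, 2400, 100, 3000]
for dvd in first_fit_decreasing(movie_sizes, 4400):
    print(dvd, sum(dvd))
```

This straightforward version scans all open bins for every item, so it runs in O(n²) time, which is unproblematic for a few hundred movies.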

The general problem is also very popular, so good libraries for it should exist in many programming languages!

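Answer 1: (score: 0)

Without claiming that the solution presented in this answer is optimized, optimal, or has any other remarkable qualities, here is a greedy approach to the DVD packing problem.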

import System.Random
import Data.List
import Data.Ord

-- F# programmers are so used to this operator, I just cannot live without it ...yet.
(|>) a b = b a 
data Dvd = Dvd { duration :: Int, movies :: [Int] } deriving (Show,Eq)

dvdCapacity = 1000 :: Int -- a dvd has capacity for 1000 time units - arbitrary unit
-- the duration of a movie is between 1 and 1000 time units
r = randomRs (1,1000) (mkStdGen 42) :: [Int] 
-- our test data set of 1000 movies, random movie durations drawn from r
allMovies = zip [1..1000] (take 1000 r)     
allMoviesSorted = reverse $ sortBy (comparing snd) allMovies
remainingCapacity dvd = dvdCapacity - duration dvd

emptyDvd = Dvd { duration = 0, movies = [] }

-- from the remaining movies, pick the largest one with at most maxDuration length.
pickLargest remaining maxDuration =
    let (left,right) = remaining |> break (\ (a,b) -> b <= maxDuration)
        (h,t) = case right of 
            [] -> (Nothing,[]) 
            (x:xs) -> (Just x, xs)
        in 
            (h,[left, t] |> concat)

-- add a track (movie) to a dvd
addTrack dvd track = 
    Dvd {duration = (duration dvd) + snd track, 
        movies = fst track : (movies dvd) }

-- pick dvd from dvds with largest remaining capacity 
-- and add the largest remaining fitting track
greedyPack movies dvds 
    | movies == [] = (dvds,[])
    | otherwise =

        let dvds' = reverse $ sortBy (comparing remainingCapacity) dvds
            (picked,movies') =
                case dvds' of
                    [] -> (Nothing, movies)
                    (x:xs) -> pickLargest movies (remainingCapacity x)
            in
                case picked of
                    Nothing ->
                        -- None of the current dvds had enough capacity remaining
                        -- to pick another movie and add it. -> Add new empty dvd
                        -- and run greedyPack again. 
                        greedyPack movies' (emptyDvd : dvds')
                    Just p ->
                        -- The best fitting movie could be added to the 
                        -- dvd with the largest remaining capacity.
                        greedyPack movies' (addTrack (head dvds') p : tail dvds')   

(result,residual) = greedyPack allMoviesSorted [emptyDvd] 
usedDvdsCount = length result
totalPlayTime = allMovies |> foldl (\ s (i,d) -> s + d) 0
-- lower bound on the dvd count: total play time divided by capacity, rounded up
optimalDvdsCount = ceiling (fromIntegral totalPlayTime / fromIntegral dvdCapacity :: Double)
solutionQuality = length result - optimalDvdsCount

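Compared with the theoretical minimum DVD count (total play time divided by DVD capacity, rounded up), it uses 4 extra DVDs on the given data set.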
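
The bound used for this comparison is easy to reproduce in any language. Below is a minimal Python sketch, assuming integer sizes; the helper names `lower_bound` and `extra_dvds` are hypothetical. Note that this is only a lower bound: no packing can use fewer DVDs, but a packing that reaches it does not always exist.

```python
def lower_bound(sizes, capacity):
    """Smallest DVD count that could hold the total size, ignoring how items fit."""
    total = sum(sizes)
    return -(-total // capacity)   # ceiling division on integers

def extra_dvds(used_dvds, sizes, capacity):
    """How many DVDs a packing uses beyond the lower bound."""
    return used_dvds - lower_bound(sizes, capacity)

# Three movies of 600 units need three DVDs of capacity 1000,
# although the lower bound is only ceil(1800 / 1000) = 2.
print(lower_bound([600, 600, 600], 1000), extra_dvds(3, [600, 600, 600], 1000))
```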