Question

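I have a problem and could not find much help online. I need to find the minimum-cost combination of numbers from multiple vectors of numbers. All vectors have the same size. For example, consider the following: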

row [0]:  a  b  c  d   
row [1]:  e  f  g  h  
row [2]:  i  j  k  l

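Now I need to form a combination of numbers by taking one element from each row (i.e. each vector), for example: aei.
After this, I need to find three other combinations that do not intersect with one another, for example: bfj, cgk, dhl. I calculate the cost based on the four combinations chosen. The goal is to find the set of combinations that gives the minimum cost. Another possible set is: afj, bei, chk, dgl. If the number of columns is d and the number of rows is k, the total number of possible combinations is d^k. The rows are stored as vectors. I am stuck here and am finding it hard to write an algorithm for the process above. I would really appreciate any help. Thanks.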

// I am still working on the algorithm. I just have the vectors and the cost function.  

//Cost Function  , it also depends on the path chosen
float cost(int a, int b, PATH to_a) {  
float costValue;  
...  
...  
return costValue;  
}  

vector< vector < int > > row;  
//populate row  
...   
...
//Suppose  

//    row [0]:  a  b  c  d   
//    row [1]:  e  f  g  h  
//    row [2]:  i  j  k  l   

// If a is chosen from row[0] and e is chosen from row[1] then,  
float subCost1 = cost(a,e, path_to_a);  

// If i is chosen from row[2] ,  
float subCost2 = cost(e,i,path_to_e);  

// Cost for selecting aei combination is  
float cost1 = subCost1 + subCost2;  

//similarly other three costs need to be calculated by selecting other remaining elements  
//The elements should not intersect with each other eg. combinations aei and bej cannot exist on the same set.  

//Suppose the other combinations chosen are bfj with cost cost2, cgk with cost cost3   and dhl with cost cost4  
float totalCost = cost1 + cost2 + cost3 + cost4;   

//This is the cost got from one combination. All the other possible combinations should be enumerated to get the minimum cost combination.
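
To make the setup above concrete, here is a minimal Python sketch of the total-cost computation and the "no intersection" check. The names `example_cost`, `combination_cost`, `is_disjoint` and `total_cost` are hypothetical, and `example_cost` is only a stand-in, since the real `cost(a, b, path_to_a)` is not shown in the question.

```python
from typing import Callable, List, Sequence

Combination = Sequence[int]  # one element taken from each row, top to bottom


def example_cost(a: int, b: int, path_to_a: Sequence[int]) -> float:
    """Hypothetical stand-in for cost(a, b, path_to_a); the real one is not given."""
    return abs(a - b) + 0.1 * len(path_to_a)


def combination_cost(combo: Combination, cost: Callable = example_cost) -> float:
    """Sum the step costs along one combination, e.g. a -> e -> i."""
    total = 0.0
    for k in range(len(combo) - 1):
        # combo[:k] is the path that led to combo[k]
        total += cost(combo[k], combo[k + 1], combo[:k])
    return total


def is_disjoint(combos: List[Combination], rows: List[List[int]]) -> bool:
    """True if the combinations use every element of every row exactly once."""
    return all(
        sorted(c[r] for c in combos) == sorted(rows[r]) for r in range(len(rows))
    )


def total_cost(combos: List[Combination], cost: Callable = example_cost) -> float:
    """Cost of one complete set of combinations (cost1 + cost2 + ...)."""
    return sum(combination_cost(c, cost) for c in combos)
```

Under these assumptions, a valid candidate solution is any list of combinations for which `is_disjoint` is true, and the task is to minimize `total_cost` over all such lists.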

Answer 1

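Posting more utility code

See it on GitHub: https://gist.github.com/1233012#file_new.cpp

This is basically a much better way of generating all possible permutations, based on a much simpler approach (which is why I had no real reason to post it before: as it stands right now, it does not do anything more than the Python code).

I decided to share it anyway, since you might be able to profit from it as the basis for an eventual solution.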

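Pro:

much faster
smarter algorithm (leverages the STL and maths :))
instruction optimization
storage optimization

generic problem model
the model and algorithmic ideas can be used as the basis for a proper algorithm
basis for a good OpenMP parallelization (n-way, for n rows) designed in (but not fleshed out)

Contra:

The code is much more efficient at the cost of flexibility: adapting the code to build in logic about the constraints and cost heuristics would be much easier with the more step-by-step Python approach.

All in all, I think my C++ code could be a big win IFF it turns out that simulated annealing is appropriate given the cost function(s); the approach taken in the code would give

a highly efficient storage model
a highly efficient way to generate random / closely related new grid configurations
convenient display functions

Obligatory (arbitrary...) benchmark data point (for comparison with the Python version):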
  a  b  c  d e
  f  g  h  i j
  k  l  m  n o
  p  q  r  s t

Result: 207360000

real  0m13.016s
user  0m13.000s
sys   0m0.010s

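Here is what we have so far:

From the description, I gather that you have a basic graph, with the following rules:
A path has to be constructed that visits all nodes in the grid (Hamiltonian cycle).
The extra constraint is that subsequent nodes have to be taken from the next rank (a-d, e-h and i-l being the three ranks; once a node from the last rank has been visited, the path has to continue with a node from the first rank).
The edges are weighted, in that they have an associated cost. However, the weight function is not the usual one for graph algorithms, because the cost depends on the full path, not just on the end points of each edge.

Given this, I believe we are in the realm of "exact cover" problems (solved with Algorithm X, best known from Knuth's Dancing Links paper).

Specifically, without further information (equivalence of paths, specific properties of the cost function), the best known algorithm for getting the "cheapest" Hamiltonian path that satisfies the constraints is to

generate all possible such paths
calculate the actual cost function for each
choose the minimum-cost path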
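
The three steps above can be sketched compactly with `itertools`. This is a minimal sketch under stated assumptions, not the generator from this answer: it treats column j of the permuted grid as one combination (as in the question), and `example_cost` is a hypothetical cost that only looks at one combination at a time.

```python
from itertools import permutations, product


def example_cost(path):
    """Hypothetical cost: sum of absolute differences between consecutive nodes."""
    return sum(abs(a - b) for a, b in zip(path, path[1:]))


def brute_force(rows, cost=example_cost):
    """Enumerate every permutation of every row and keep the cheapest grid.

    After permuting, column j of the grid is combination j, so each grid is
    one complete, non-intersecting set of combinations.
    """
    best_cost, best_paths = float("inf"), None
    for grid in product(*(permutations(r) for r in rows)):   # step 1: generate
        paths = list(zip(*grid))
        total = sum(cost(p) for p in paths)                   # step 2: evaluate
        if total < best_cost:                                 # step 3: keep minimum
            best_cost, best_paths = total, paths
    return best_cost, best_paths
```

The loop visits (columns!)^rows grids, so it is only usable for very small inputs.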

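Which is why I set off and wrote a really dumb brute-force generator that generates all possible unique paths in a generic N×M grid.

The End of the Universe

The output for the 3×4 sample grid is 4!³ = 13824 possible paths... Extrapolating that to 6×48 columns leads to 6!⁴⁸ = 1.4×10¹³⁷ possibilities. It is very clear that without further optimization the problem is intractable (NP-hard or something; I never quite remember the subtle definitions).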
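
These counts are easy to reproduce. The sketch below assumes the pattern from the 3×4 sample, (columns!)^rows; `path_count` and `scientific` are hypothetical helper names. Under that pattern, 6!⁴⁸ is the count for 48 rows of 6 columns, while 6 rows of 48 columns would give 48!⁶, which is far larger.

```python
from math import factorial, log10


def path_count(rows, cols):
    """Number of ways to permute each of `rows` rows of `cols` elements."""
    return factorial(cols) ** rows


def scientific(n):
    """Return (mantissa, exponent) of a large positive integer."""
    exponent = int(log10(n))
    mantissa = 10 ** (log10(n) - exponent)
    return round(mantissa, 1), exponent


print(path_count(3, 4))                 # 3 rows, 4 columns: 4!^3
print(scientific(path_count(48, 6)))    # 48 rows, 6 columns: 6!^48
print(scientific(path_count(6, 48)))    # 6 rows, 48 columns: 48!^6
```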

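The runtime explosion is deafening:

3×4 (measured) takes about 0.175s
4×5 (measured) took about 6m5s (running without output, under PyPy 1.6 on a fast machine)
5×6 would take about 10 years and 9 months...

At 48x6 we would be looking at... what... 8.3x10¹⁰⁷ years (read that closely)

See it live: http://ideone.com/YsVRE

Anyway, here is the Python code (all preset for a 2×3 grid)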

#!/usr/bin/python
ROWS = 2
COLS = 3

## different cell representations
def cell(r,c): 
    ## exercise for the reader: _guess_ which of the following is the fastest
    ## ...
    ## then profile it :)
    index = COLS*(r) + c
    # return [ r,c ]
    # return ( r,c )
    # return index
    # return "(%i,%i)" % (r,c)

    def baseN(num,b,numerals="abcdefghijklmnopqrstuvwxyz"):
        return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])

    return baseN(index, 26)

ORIGIN = cell(0,0)

def debug(t): pass; #print t
def dump(grid): print("\n".join(map(str, grid)))

def print_path(path):
    ## Note: to 'normalize' to start at (1,1) node:
    # while ORIGIN != path[0]: path = path[1:] + path[:1] 
    print(" -> ".join(map(str, path)))

def bruteforce_hamiltonians(grid, whenfound):
    def inner(grid, whenfound, partial):

        cols = len(grid[-1]) # number of columns remaining in last rank
        if cols<1:
            # assert 1 == len(set([ len(r) for r in grid ])) # for debug only
            whenfound(partial)                             # disable when benchmarking
            pass
        else:
            #debug(" ------ cols: %i ------- " % cols)

            for i,rank in enumerate(grid):
                if len(rank)<cols: continue
                #debug("debug: %i, %s (partial: %s%s)" % (i,rank, "... " if len(partial)>3 else "", partial[-3:]))
                for ci,cell in enumerate(rank):
                    partial.append(cell)
                    grid[i] = rank[:ci]+rank[ci+1:] # modify grid in-place, keeps rank

                    inner(grid, whenfound, partial)

                    grid[i] = rank # restore in-place
                    partial.pop()
                break
        pass
    # start of recursion
    inner(grid, whenfound, [])

grid = [ [ cell(r,c) for c in range(COLS) ] for r in range(ROWS) ]

dump(grid)

bruteforce_hamiltonians(grid, print_path)

Answer 2

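First, one helpful observation.

I think the 4!^3 result does not capture the fact that { aei, bfj, cgk, dhl } and (for example) { bfj, aei, cgk, dhl } have the same cost.

This means we only need to consider sequences of the form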

{ a??, b??, c??, d?? }

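This equivalence cuts the number of distinct cases by a factor of 4!

On the other hand, @sehe has 3x4 giving 4!^3 (which I agree with), so by the same reasoning 6x48 requires 48!^6. Of these, "only" 48!^5 are distinct. That is now 2.95×10^305.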
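
The reduction can be checked on the 3x4 example. In this minimal sketch (function names are hypothetical), the first row is kept in its given order, so every set of combinations has the form { a??, b??, c??, d?? } and is generated exactly once.

```python
from itertools import permutations, product
from math import factorial, log10


def all_configs(rows):
    """Every ordered way of permuting every row: (cols!)^rows results."""
    for perms in product(*(permutations(r) for r in rows)):
        yield tuple(zip(*perms))


def canonical_configs(rows):
    """Keep the first row fixed, so each set of combinations appears once."""
    for perms in product(*(permutations(r) for r in rows[1:])):
        yield tuple(zip(rows[0], *perms))


rows = ["abcd", "efgh", "ijkl"]
ordered = sum(1 for _ in all_configs(rows))
distinct = len({frozenset(c) for c in all_configs(rows)})
canonical = sum(1 for _ in canonical_configs(rows))
print(ordered, distinct, canonical)      # 13824 576 576

# Size of the reduced 6x48 search space, 48!^5, as a power of ten
print(5 * log10(factorial(48)))
```

The 13824 ordered configurations collapse to 4!^2 = 576 distinct sets, which is exactly what fixing the first row enumerates.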

Using the 3x4 example, here is the start of an algorithm that gives some sort of answer.

Enumerate all the triplets and their costs. 
Pick the lowest cost triplet.
Remove all remaining triplets containing a letter from that triplet.
Now find the lowest cost triplet remaining.
And so on.

Note that this is not a complete exhaustive search.
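
The greedy steps above can be sketched as follows. This is a minimal sketch that assumes each combination has its own cost (`example_cost` is hypothetical) and that elements within a row are distinct. Sorting all d^k tuples once and skipping conflicting ones is equivalent to repeatedly picking the cheapest remaining tuple.

```python
from itertools import product


def example_cost(combo):
    """Hypothetical cost: sum of absolute differences along the combination."""
    return sum(abs(a - b) for a, b in zip(combo, combo[1:]))


def greedy_combinations(rows, cost=example_cost):
    """Repeatedly take the cheapest combination that reuses no element."""
    candidates = sorted(product(*rows), key=cost)   # all d^k tuples, cheapest first
    used = [set() for _ in rows]                    # elements already taken, per row
    chosen = []
    for combo in candidates:
        if all(x not in used[r] for r, x in enumerate(combo)):
            chosen.append(combo)
            for r, x in enumerate(combo):
                used[r].add(x)
    return chosen
```

For the rows [[0, 3], [2, 5]] this picks (3, 2) and then is forced to take (0, 5), for a total of 6, while the pairing (0, 2), (3, 5) costs only 4. That illustrates why the result is "some sort of answer" rather than a guaranteed optimum.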

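I can also see from this that it is still a lot of computation. The first pass still requires computing 48^6 (12,230,590,464) costs. I guess that can be done, but it will take a lot of effort. The subsequent passes will be cheap by comparison.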

Answer 3

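EDIT: added a complete solution

As the other answers have already pointed out, your problem is too hard to tackle with brute force. The starting point for this kind of problem is always simulated annealing. I have created a small application that implements the algorithm.

Another way to look at your problem is as the minimization of a complex function, with extra constraints on the possible solutions. I start from a random valid configuration (one that satisfies your constraints), then I modify that random solution by changing one element at a time. I force the application to perform only valid transitions. The code is pretty clear.

I have created a template function, so you only need to provide the necessary function objects and structures.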

#include <iostream>
#include <cmath>
#include <cstdlib>   // rand
#include <ctime>
#include <utility>   // std::pair, std::make_pair
#include <vector>
#include <algorithm>
#include <functional>

//row [0]:  c00  c01  c02  c03   
//row [1]:  c10  c11  c12  c13  
//row [2]:  c20  c21  c22  c23 


typedef std::pair<int,int> RowColIndex;
// the deeper vector has size 3 (aei for example)
// the outer vector has size 4
typedef std::vector<std::vector<RowColIndex> > Matrix;

size_t getRandomNumber(size_t up)
{
    return rand() % up;
}

struct Configuration
{
    Configuration(const Matrix& matrix) : matrix_(matrix){}
    Matrix matrix_;
};

std::ostream& operator<<(std::ostream& os,const Configuration& toPrint)
{
    for (size_t row = 0; row < toPrint.matrix_.at(0).size(); row++)
    {
        for (size_t col = 0; col < toPrint.matrix_.size(); col++)
        {
            os << toPrint.matrix_.at(col).at(row).first  << "," 
               << toPrint.matrix_.at(col).at(row).second << '\t';
        }
        os << '\n';
    }   
    return os;
}

struct Energy 
{ 
    double operator()(const Configuration& conf)
    {
        double result = 0;
        for (size_t col = 0; col < conf.matrix_.size(); col++)
        {
            for (size_t row =0; row < conf.matrix_.at(col).size(); row++)
            {
                result += pow(static_cast<double>(row) - static_cast<double>(conf.matrix_.at(col).at(row).first),2) +
                          pow(static_cast<double>(col) - static_cast<double>(conf.matrix_.at(col).at(row).second),2);
            }
        }
        return result;
    }
};

size_t calculateNewColumn(std::vector<int>& isAlreadyUse)
{
    size_t random;
    do
    {
        random = getRandomNumber(isAlreadyUse.size());
    }
    while (isAlreadyUse.at(random) != 0);

    isAlreadyUse.at(random) = 1;
    return random;
}

Configuration createConfiguration(size_t numberOfRow,size_t numberOfColumn)
{
    //create suitable matrix
    Matrix matrix;
    //add empty column vector
    for (size_t col = 0; col < numberOfColumn; col++)
        matrix.push_back(std::vector<RowColIndex>());

    //loop over all the element
    for (size_t row = 0; row < numberOfRow; row++)
    {
        std::vector<int> isAlreadyUse(numberOfColumn);
        for (size_t col = 0; col < numberOfColumn; col++)
        {
            size_t newCol = calculateNewColumn(isAlreadyUse);
            matrix.at(newCol).push_back(std::make_pair(row,col));
        }
    }   

    return Configuration(matrix);
}


struct CreateNewConfiguration
{
    Configuration operator()(const Configuration& conf)
    {
        Configuration result(conf);

        size_t fromRow = getRandomNumber(result.matrix_.at(0).size());

        size_t fromCol = getRandomNumber(result.matrix_.size());
        size_t toCol = getRandomNumber(result.matrix_.size());

        result.matrix_.at(fromCol).at(fromRow) = conf.matrix_.at(toCol).at(fromRow);
        result.matrix_.at(toCol).at(fromRow) = conf.matrix_.at(fromCol).at(fromRow);

        return result;
    }
};

template<typename Conf,typename CalcEnergy,typename CreateRandomConf>
Conf Annealing(const Conf& start,CalcEnergy energy,CreateRandomConf createNewConfiguration,
               int maxIter = 100000,double minimumEnergy = 1.0e-005)
{
    Conf first(start);
    int iter = 0;
    while (iter < maxIter && energy(first) > minimumEnergy )
    {
        Conf newConf(createNewConfiguration(first));  // use the template parameter, not the concrete type
        if( energy(first) > energy(newConf))
        {
            first = newConf;
        }
        iter++;
    }
    return first;
}

int main(int argc,char* argv[])
{

    size_t nRows = 25;
    size_t nCols = 25;
    std::vector<Configuration> res;
    for (int i =0; i < 10; i++)
    {
        std::cout << "Configuration #" << i << std::endl;
        Configuration c = createConfiguration(nRows,nCols);
        res.push_back(Annealing(c,Energy(),CreateNewConfiguration()));
    }

    std::vector<Configuration>::iterator it = res.begin();


    std::vector<Configuration>::iterator lowest = it;
    while (++it != res.end())
    {
        if (Energy()(*it) < Energy()(*lowest))
            lowest = it;
    }

    std::cout << Energy()(*lowest) << std::endl;

    std::cout << std::endl;

    std::cout << *lowest << std::endl;


    std::cin.get();
    return 0;
}

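Of course, you have no guarantee that the solution found is the best one (it is a heuristic method). It is a good starting point, though.

You have not provided a complete cost function, so I implemented my own, which lets me check the final result easily. You only need to supply your cost function and the job is done.

You can probably make the program more efficient; there is a lot of room for improvement, but the logic is there and you can plug in your function easily enough.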
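
One thing to be aware of: the loop in `Annealing` above accepts a new configuration only when the energy drops, which is hill climbing. Textbook simulated annealing also accepts worse moves with probability exp(-ΔE/T) under a decreasing temperature T. A minimal Python sketch of that variant follows; it is not a port of the C++ code, and the names, the geometric cooling schedule and `example_energy` are all assumptions.

```python
import math
import random


def example_energy(config):
    """Hypothetical energy: sum of |difference| down each column of the grid."""
    return sum(
        abs(config[r][c] - config[r + 1][c])
        for r in range(len(config) - 1)
        for c in range(len(config[0]))
    )


def neighbour(config, rng):
    """Swap two entries inside one random row; the result is still valid."""
    new = [row[:] for row in config]
    r = rng.randrange(len(new))
    i, j = rng.sample(range(len(new[r])), 2)
    new[r][i], new[r][j] = new[r][j], new[r][i]
    return new


def anneal(start, energy=example_energy, t_start=10.0, t_end=1e-3,
           steps=20000, seed=0):
    """Metropolis acceptance with geometric cooling; returns (best, best_energy)."""
    rng = random.Random(seed)
    current, e_cur = start, energy(start)
    best, e_best = current, e_cur
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / steps)   # temperature at step k
        cand = neighbour(current, rng)
        e_cand = energy(cand)
        # Always accept improvements; accept worse moves with prob. exp(-dE/T)
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best
```

Here a configuration is a grid whose rows are permutations of the original rows, and column j is the j-th combination. Swapping within a row plays the same role as `CreateNewConfiguration` in the C++ code: every move keeps the constraints satisfied.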

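Complexity

The complexity of the algorithm is E * I * C, where I = number of iterations, C = number of random starting configurations (to avoid local minima), and E = the cost of evaluating the energy function (or cost function).

In this case, E is actually N * M, where N and M are the dimensions of the initial matrix.

If you are not satisfied with the simulated annealing result, you can try genetic algorithms.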

Answer 4

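You can solve the problem recursively.

The input to the method is the index of the first vector to process; the vectors themselves are shared outside the function.

For the case where two rows remain, the solution can be computed using backtracking. In that case you only need to find the cheapest pairing.

For the case where more than two rows remain, call the method with the next index, get the partial result, and then compute the minimum, again using backtracking.

When the flow gets back to the first vector, you can consolidate the partial results into the final result.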
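
One possible reading of this description, as a minimal sketch: recurse over the row index, try every ordering of the current row against the previous one, and backtrack. The pruning step is an addition that is only valid under the assumption that step costs are non-negative; `example_step_cost` and `solve` are hypothetical names.

```python
from itertools import permutations


def example_step_cost(a, b):
    """Hypothetical non-negative cost of linking a (row i) to b (row i + 1)."""
    return abs(a - b)


def solve(rows, step_cost=example_step_cost):
    """Return (minimum cost, grid); column j of the grid is combination j."""
    best = {"cost": float("inf"), "grid": None}

    def recurse(index, grid, cost_so_far):
        if cost_so_far >= best["cost"]:
            return                      # prune: cannot beat the best found so far
        if index == len(rows):
            best["cost"] = cost_so_far
            best["grid"] = [list(r) for r in grid]
            return
        prev = grid[-1]
        for perm in permutations(rows[index]):
            added = sum(step_cost(a, b) for a, b in zip(prev, perm))
            grid.append(perm)
            recurse(index + 1, grid, cost_so_far + added)
            grid.pop()                  # backtrack

    # The first row stays in its given order; only later rows are permuted.
    recurse(1, [tuple(rows[0])], 0)
    return best["cost"], best["grid"]
```

This is still exponential in the worst case; the pruning only helps when a good solution is found early.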

Answer 5

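It is worth noting that for some interesting choices of path cost there is a polynomial-time algorithm. For example, if the path cost is the sum of the edge costs, the optimal solution can be found by running the Hungarian algorithm on rows i and i + 1, for all i.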
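
A minimal sketch of that idea, assuming the cost really is a sum of per-edge costs between consecutive rows. With additive costs the matchings between consecutive row pairs do not interact, so each can be optimized on its own. `hungarian` below is a standard O(n³) potentials-based implementation of the assignment problem, and `layered_optimum` is a hypothetical name for the row-by-row driver.

```python
INF = float("inf")


def hungarian(cost):
    """Solve a square assignment problem; return match[i] = column for row i."""
    n = len(cost)
    u = [0] * (n + 1)        # row potentials (1-based)
    v = [0] * (n + 1)        # column potentials (1-based)
    p = [0] * (n + 1)        # p[j] = row currently matched to column j
    way = [0] * (n + 1)      # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def layered_optimum(rows, edge_cost):
    """Chain the rows pair by pair; valid only for additive edge costs."""
    combos = [[x] for x in rows[0]]
    total = 0
    for nxt in rows[1:]:
        current = [combo[-1] for combo in combos]
        matrix = [[edge_cost(a, b) for b in nxt] for a in current]
        for c, j in enumerate(hungarian(matrix)):
            total += matrix[c][j]
            combos[c].append(nxt[j])
    return total, combos
```

For k rows of d elements this runs k - 1 assignment problems of size d, so it stays practical at sizes like 6x48. It does not apply when the cost of a step depends on the whole path taken so far, as in the question's `cost(a, b, path_to_a)`.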

Finding a cost-optimized path through a grid/matrix in C++
