分析：

Question

我在一次训练中遇到了这个问题。即我们已经给出了N个不同的值（N<= 100）。让我们为这个数组命名A[N]，对于这个数组A，我们确定数组中有1，A[i]≤10⁹。其次，我们给出了S，其中S≤10⁹。

现在我们必须用这个值解决经典硬币问题。实际上，我们需要找到最小数量的元素，它将精确地加总S。来自A的每个元素都可以无限次使用。

时间限制：1秒
内存限制：256 MB

示例：

S = 1000, N = 10

A[] = {1,12,123,4,5,678,7,8,9,10}. The result is 10.

1000 = 678 + 123 + 123 + 12 + 12 + 12 + 12 + 12 + 12 + 4

我尝试了什么

我试图用经典的动态编程硬币问题技术来解决这个问题，但是它使用了太多内存并且超出了内存限制。

我无法弄清楚我们应该保留这些价值观。提前谢谢。

以下是使用经典dp硬币问题无法解决的几个测试用例。

S = 1000000000 N = 100

1 373241370 973754081 826685384 491500595 765099032 823328348 462385937 
251930295 819055757 641895809 106173894 898709067 513260292 548326059 
741996520 959257789 328409680 411542100 329874568 352458265 609729300 
389721366 313699758 383922849 104342783 224127933 99215674 37629322 
230018005 33875545 767937253 763298440 781853694 420819727 794366283 
178777428 881069368 595934934 321543015 27436140 280556657 851680043 
318369090 364177373 431592761 487380596 428235724 134037293 372264778 
267891476 218390453 550035096 220099490 71718497 860530411 175542466 
548997466 884701071 774620807 118472853 432325205 795739616 266609698 
242622150 433332316 150791955 691702017 803277687 323953978 521256141 
174108096 412366100 813501388 642963957 415051728 740653706 68239387 
982329783 619220557 861659596 303476058 85512863 72420422 645130771 
228736228 367259743 400311288 105258339 628254036 495010223 40223395 
110232856 856929227 25543992 957121494 359385967 533951841 449476607 
134830774
OUTPUT FOR THIS TEST CASE: 5

S = 999865497 N = 7

1 267062069 637323855 219276511 404376890 528753603 199747292
OUTPUT FOR THIS TEST CASE: 1129042

S = 1000000000 N = 40

1 12 123 4 5 678 7 8 9 10 400 25 23 1000 67 98 33 46 79 896 11 112 1223 412 
532 6781 17 18 19 170 1400 925 723 11000 607 983 313 486 739 896
OUTPUT FOR THIS TEST CASE: 90910

Answer 1

（注意：为了清晰起见进行了更新和编辑。最后添加了复杂性分析。）

好的，这是我的解决方案，包括我对@PeterdeRivaz发现的性能问题的修复。我已经针对问题和评论中提供的所有测试用例对此进行了测试，并且它在一秒钟内完成所有测试（在一种情况下为1.5秒），主要仅使用内存用于部分结果缓存（I＆＃39; d猜大约16MB）。

我使用Depth-First，Greedy-First组合搜索，使用当前最佳结果进行修剪，而不是使用传统的DP解决方案（速度太慢且需要太多内存）。我很惊讶（非常）它的工作原理和它一样，但我仍然怀疑你可以构建一个最坏情况指数时间的测试集。

首先，有一个主函数是调用代码需要调用的唯一函数。它处理所有设置和初始化并调用其他所有内容。（所有代码都是C＃）

// Find the min# of coins for a specified sum
int CountChange(int targetSum, int[] coins)
{
    // init the cache for (partial) memoization
    PrevResultCache = new PartialResult[1048576];

    // make sure the coins are sorted lowest to highest
    Array.Sort(coins);

    int curBest = targetSum;
    int result = CountChange_r(targetSum, coins, coins.GetLength(0)-1, 0, ref curBest);

    return result;
}

由于@PeterdeRivaz提出的测试用例存在问题，我还添加了一个部分结果缓存来处理N []中存在大量数字的情况。

以下是缓存的代码：

    // implement a very simple cache for previous results of remainder counts
    struct PartialResult
    {
        public int PartialSum;
        public int CoinVal;
        public int RemainingCount;
    }
    PartialResult[] PrevResultCache;

    // checks the partial count cache for already calculated results
    int PrevAddlCount(int currSum, int currCoinVal)
    {
        int cacheAddr = currSum & 1048575;  // AND with (2^20-1) to get only the first 20 bits
        PartialResult prev = PrevResultCache[cacheAddr];

        // use it, as long as it's actually the same partial sum 
        // and the coin value is at least as large as the current coin
        if ((prev.PartialSum == currSum) && (prev.CoinVal >= currCoinVal))
        {
            return prev.RemainingCount;
        }
        // otherwise flag as empty
        return 0;
    }

    // add or overwrite a new value to the cache
    void AddPartialCount(int currSum, int currCoinVal, int remainingCount)
    {
        int cacheAddr = currSum & 1048575;  // AND with (2^20-1) to get only the first 20 bits
        PartialResult prev = PrevResultCache[cacheAddr];

        // only add if the Sum is different or the result is better
        if ((prev.PartialSum != currSum)
            || (prev.CoinVal <= currCoinVal)
            || (prev.RemainingCount == 0)
            || (prev.RemainingCount >= remainingCount)
            )
        {
            prev.PartialSum = currSum;
            prev.CoinVal = currCoinVal;
            prev.RemainingCount = remainingCount;
            PrevResultCache[cacheAddr] = prev;
        }
    }

以下是执行实际计数的递归函数的代码：

/*
* Find the minimum number of coins required totaling to a specifuc sum
* using a list of coin denominations passed.
*
* Memory Requirements: O(N)  where N is the number of coin denominations
*                            (primarily for the stack)
* 
* CPU requirements:  O(Sqrt(S)*N) where S is the target Sum
*                           (Average, estimated.  This is very hard to figure out.)
*/
int CountChange_r(int targetSum, int[] coins, int coinIdx, int curCount, ref int curBest)
{
    int coinVal = coins[coinIdx];
    int newCount = 0;

    // check to see if we are at the end of the search tree (curIdx=0, coinVal=1)
    // or we have reached the targetSum
    if ((coinVal == 1) || (targetSum == 0))
    {
        // just use math get the final total for this path/combination 
        newCount = curCount + targetSum;
        // update, if we have a new curBest
        if (newCount < curBest) curBest = newCount;
        return newCount;
    }

    // prune this whole branch, if it cannot possibly improve the curBest
    int bestPossible = curCount + (targetSum / coinVal);
    if (bestPossible >= curBest) 
            return bestPossible; //NOTE: this is a false answer, but it shouldnt matter
                                    //  because we should never use it.

    // check the cache to see if a remainder-count for this partial sum
    // already exists (and used coins at least as large as ours)
    int prevRemCount = PrevAddlCount(targetSum, coinVal);
    if (prevRemCount > 0)
    {
        // it exists, so use it
        newCount = prevRemCount + targetSum;
        // update, if we have a new curBest
        if (newCount < curBest) curBest = newCount;
        return newCount;
    }

    // always try the largest remaining coin first, starting with the 
    // maximum possible number of that coin (greedy-first searching)
    newCount = curCount + targetSum;
    for (int cnt = targetSum / coinVal; cnt >= 0; cnt--)
    {
        int tmpCount = CountChange_r(targetSum - (cnt * coinVal), coins, coinIdx - 1, curCount + cnt, ref curBest);

        if (tmpCount < newCount) newCount = tmpCount;
    }

    // Add our new partial result to the cache
    AddPartialCount(targetSum, coinVal, newCount - curCount);

    return newCount;
}

分析：

内存：

此算法很容易确定内存使用情况。基本上只有部分结果缓存和堆栈。缓存固定在appx。 100万个条目乘以每个条目的大小（3 * 4字节），因此大约12MB。堆栈限制为O(N)，因此记忆显然不是问题。

CPU：

这个算法的运行时复杂性开始很难确定然后变得更难，所以请原谅我，因为这里有很多挥手。我试图寻找一个只有蛮力问题的分析（组合搜索N * kn基值总和为S的总和），但没有多少出现。那里的一点点往往是O(N^S)，这显然太高了。我认为更公平的估计是O(N^(S/N))或可能O(N^(S/AVG(N))甚至是O(N^(S/(Gmean(N)))，其中Gmean(N)是N []元素的几何平均数。该解决方案从蛮力组合搜索开始，然后通过两次重要优化对其进行改进。

第一个是基于对该分支的最佳可能结果的估计与已经发现的最佳结果的修剪分支。如果最佳案例估计完全准确并且分支的工作分布完美，那么这意味着如果我们发现结果优于其他可能情况的90％，那么修剪将有效地消除90％的工作。那一点。总而言之，这应该证明修剪后仍然存在的工作量随着进展而会和谐地缩小。假设应该应用某种求和/积分来获得工作总量，这在我看来可以解决原始工作的对数。因此，我们称之为O(Log(N^(S/N))或O(N*Log(S/N))，这非常好。（虽然O(N*Log(S/Gmean(N)))可能更准确）。

然而，这有两个明显的漏洞。首先，最佳情况估计器确实不完全准确，因此它们不会像上面假设的那样有效地修剪，但是，这有点与贪婪的第一顺序相反。在搜索早期找到更好的解决方案的分支机构，可以提高修剪的效率。

第二个问题是，当N的不同值相距很远时，最佳情况估计器的效果会更好。具体来说，如果|(S/n2 - S/n1)| > 1对于N中的任何2个值，那么它几乎完全有效。对于小于SQRT（S）的N的值，则即使两个相邻值（k，k + 1）也足够远以至于该规则适用。然而，为了增加SQRT（S）以上的值，窗口打开，使得该窗口内的任何数量的N值将不能有效地相互修剪。该窗口的大小约为K / SQRT（S）。因此，如果S = 10 ^ 9，当K大约为10 ^ 6时，该窗口将近30个数字宽。这意味着N []可以包含1加上1000001到1000029之间的每个数字，修剪优化几乎没有任何好处。

为了解决这个问题，我添加了部分结果缓存，它允许记忆最近的部分总和直到目标S.这利用了这样的事实：当N值靠近在一起时，它们往往会有一个其总和中的重复数量极高。尽管我可以想象，这个有效性大约是问题大小的第J个根的N倍，其中J = S/K和K是N值的平均大小的一些度量（Gmean（N）可能是最好的估计）。如果我们将此应用于强力组合搜索，假设修剪无效，我们得到O((N^(S/Gmean(N)))^(1/Gmean(N)))，我认为这也是O(N^(S/(Gmean(N)^2)))。

所以，此时请选择。我知道这是非常粗略的，即使它是正确的，它仍然对N值的分布非常敏感，因此存在很多差异。

Answer 2

[我已经取代了之前关于位操作的想法，因为它似乎太费时了]

有点疯狂的想法并且不完整但可能有用。

让我们首先介绍f(n,s)，它会返回s可以从n个硬币组成的组合数。

现在，f(n+1,s)与f(n)的关系如何？

计算它的一种可能方法是：

f(n+1,s)=sum[coin:coins]f(n,s-coin)

例如，如果我们有硬币1和3，

f(0,)=[1,0,0,0,0,0,0,0] - 零硬币我们只有零和

f(1,)=[0,1,0,1,0,0,0,0] - 我们用一枚硬币可以拥有什么

f(2,)=[0,0,1,0,2,0,1,0] - 我们可以拥有两枚硬币

我们可以稍微改写一下：

f(n+1,s)=sum[i=0..max]f(n,s-i)*a(i)

a(i)=1如果我们有硬币i而0则另有

我们这里有卷积：f(n+1,)=conv(f(n,),a)

https://en.wikipedia.org/wiki/Convolution

将其计算为定义建议给出O(n^2)

但我们可以使用傅里叶变换将其减少到O（n * log n）。

https://en.wikipedia.org/wiki/Convolution#Convolution_theorem

所以现在我们有了或多或少的廉价方式来查找n硬币可以使用的数字而不会逐步增加 - 只需计算n-th F(a)的幂并应用傅里叶逆变换。

这允许我们进行一种二进制搜索，这有助于在答案很大时处理案例。

正如我所说，这个想法是不完整的 - 现在我不知道如何将位表示与傅里叶变换（以满足内存约束）以及我们是否在任何“常规”CPU上适应1秒......

内存受限的硬币数量变化达到10亿

2 个答案:

分析：

内存：

CPU：