Question

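I want to understand how to solve the Codility ArrayRecovery challenge, but I don't even know which branch of knowledge to consult. Is it combinatorics, optimization, computer science, set theory, or something else?

Edit: The branch of knowledge to consult is constraint programming, particularly constraint propagation. You also need some combinatorics to know that if you take k numbers at a time from the range [1..n], with the restriction that no number can be bigger than the one before it, there are (n + k - 1)! / (k! (n - 1)!) possible combinations. This is the same as the number of combinations with replacement of n things taken k at a time, which has the mathematical notation $\left(\!\binom{n}{k}\!\right)$. You can read about why it works out like that here.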
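As a quick sanity check of that formula, here is a worked instance with n = 3 and k = 2 (values picked only for illustration):

```latex
\left(\!\!\binom{n}{k}\!\!\right) = \binom{n+k-1}{k} = \frac{(n+k-1)!}{k!\,(n-1)!}
\qquad\Longrightarrow\qquad
\left(\!\!\binom{3}{2}\!\!\right) = \frac{4!}{2!\,2!} = \frac{24}{4} = 6
```

The six non-increasing pairs drawn from [1..3] are (3,3), (3,2), (3,1), (2,2), (2,1) and (1,1).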

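Peter Norvig provides an excellent example of how to solve this kind of problem with his Sudoku solver.

You can read the full description of the ArrayRecovery problem via the link above. The short story is that there is an encoder that takes a sequence of integers in the range 1 up to some given limit (say 100) and, for each element of the input sequence, outputs the most recently seen integer that is smaller than the current input, or 0 if none exists.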

input 1, 2, 3, 4 => output 0, 1, 2, 3
input 2, 4, 3 => output 0, 2, 2
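
To make the encoding rule concrete, here is a minimal sketch of the encoder as described above. The function name `encode` is my own and not part of the Codility statement; it simply scans backwards for the nearest strictly smaller value.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference encoder (name and signature are assumptions):
// for each input element, scan backwards for the nearest earlier element
// that is strictly smaller and emit it; emit 0 if there is none.
std::vector<int> encode(const std::vector<int>& input) {
    std::vector<int> output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        int nearestSmaller = 0;
        for (std::size_t j = i; j-- > 0;) {  // j runs from i-1 down to 0
            if (input[j] < input[i]) {
                nearestSmaller = input[j];
                break;
            }
        }
        output.push_back(nearestSmaller);
    }
    return output;
}
```

Calling `encode({1, 2, 3, 4})` and `encode({2, 4, 3})` reproduces the two examples above. The quadratic scan is chosen for clarity, not speed.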

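Given the output (and the range of allowed input), the full task is to figure out how many possible inputs could have generated it. But before I even get to that calculation, I'm not confident about how to approach formulating the equation. That is what I am asking for help with. (Of course a full solution would be welcome, too, if it is explained.)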
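
Before looking for a formula, it can help to have a brute-force counter to check hand-derived answers against. The sketch below is only usable for tiny cases (it tries all maxValue^n inputs); the names `encodeNaive` and `countInputsBruteForce` are assumptions of mine, and it assumes maxValue >= 1.

```cpp
#include <cstddef>
#include <vector>

// Encoder as described in the question: nearest earlier strictly smaller value, else 0.
static std::vector<int> encodeNaive(const std::vector<int>& input) {
    std::vector<int> output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        int nearestSmaller = 0;
        for (std::size_t j = i; j-- > 0;) {
            if (input[j] < input[i]) { nearestSmaller = input[j]; break; }
        }
        output.push_back(nearestSmaller);
    }
    return output;
}

// Counts every input over [1..maxValue] whose encoding equals `output`.
// Exponential in the length of `output`: a checking tool, not a solution.
long long countInputsBruteForce(const std::vector<int>& output, int maxValue) {
    const std::size_t n = output.size();
    std::vector<int> candidate(n, 1);
    long long count = 0;
    while (true) {
        if (encodeNaive(candidate) == output) ++count;
        // Advance the candidate like an odometer over [1..maxValue].
        std::size_t pos = 0;
        while (pos < n && candidate[pos] == maxValue) {
            candidate[pos] = 1;
            ++pos;
        }
        if (pos == n) break;  // every digit wrapped around: enumeration finished
        ++candidate[pos];
    }
    return count;
}
```

For the output 0, 2, 2 with a limit of 4 this counts 3 inputs: 2,3,3, 2,4,3 and 2,4,4.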

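I'm just looking at some possible outputs and wondering. Here are some sample encoder outputs and the inputs I can come up with, with * meaning any valid input and something like > 4 meaning any valid input greater than 4. If needed, inputs are referred to as A1, A2, A3, ... (1-based indexing).

Edit #2

Part of the problem I was having with this challenge is that I had not manually generated the exactly correct sets of possible inputs for an output. I believe the set below is correct now. Look at this post's edit history if you want to see my earlier mistakes.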

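Compared to the first, the second input sequence is tightly constrained just by adding two more outputs. The third sequence is so constrained as to be impossible.

But the set of constraints on A5 in example #2 is hard to articulate. Of course A5 > O5; that is the basic constraint on all inputs. But any output that is greater than A4 and comes after O5 has to appear in the input after A4, so A5 has to be an element of the set of numbers that come after A5 and are also greater than A4. Since there is only one such number (A6 == 4), A5 has to be it, but it gets more complicated if a longer string of numbers follows. (Editor's note: actually, it doesn't.)

As the output set gets longer, I worry these constraints just get more complicated and harder to get right. I cannot think of any data structure for representing them efficiently in a way that leads to efficiently calculating the number of possible combinations. I also don't quite see how to algorithmically add constraint sets together.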

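Here are the constraints I see so far for any given $A_n$:

- $A_n > O_n$
- $A_n \le \min$(set of other possible numbers from $O_1$ to $O_{n-1}$ that are $> O_n$). How to define the set of possible numbers greater than $O_n$?
  1. Numbers greater than $O_n$ that came after the most recent occurrence of $O_n$ in the input.
- $A_n \ge \max$(set of other possible numbers from $O_1$ to $O_{n-1}$ that are $< O_n$). How to define the set of possible numbers less than $O_n$?
  1. Actually, this set is empty because $O_n$ is, by definition, the largest possible number from the previous input sequence. (Which is not to say it is strictly the largest number in the previous input sequence.)
  2. Any number smaller than $O_n$ that came before its last occurrence in the input is ineligible because of the "nearest" rule. No number smaller than $O_n$ could have occurred after the most recent occurrence, because of the "nearest" rule and the transitive property: if $A_i < A_n$ and $A_j < A_i$, then $A_j < A_n$.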

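Then there is the set theory:

$A_n$ must be an element of the set of unaccounted-for elements among $O_{n+1}$ to $O_m$, where m is the smallest m > n such that $O_m < O_n$. Any output after such an $O_m$ and larger than $O_m$ (which $A_n$ is) would have to appear at or after $A_m$.
An element is unaccounted-for if it is seen in the output but does not appear in the input in a position that is consistent with the rest of the output. Obviously I need a better definition than this in order to code an algorithm that calculates it.

It seems like some kind of set theory and/or combinatorics, or maybe linear algebra, would help with figuring out the number of possible sequences that account for all of the unaccounted-for outputs and fit the other constraints. (Editor's note: actually, things never get that complicated.)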

Answer 1

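The code below passes all of Codility's tests. The OP added a main function to use it on the command line.

The constraints are not as complex as the OP thinks. In particular, there is never a situation where you need to add a restriction that an input be an element of some set of specific integers seen elsewhere in the output. Every input position has a well-defined minimum and maximum.

The only complication to that rule is that sometimes the maximum is "the value of the previous input", and that input itself has a range. But even then, all the values like that are consecutive and have the same range, so the number of possibilities can be calculated with basic combinatorics. Those inputs as a group are independent of the other inputs (which only serve to set the range), so the possibilities of that group can be combined with the possibilities of the other input positions by simple multiplication.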

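Algorithm overview

The algorithm makes a single pass through the output array, updating the number of possible input arrays after every span, which is what I am calling a repetition of a number in the output. (You might say maximal subsequences of the output in which every element is identical.) For example, for the output 0,1,1,2 we have three spans: 0, 1,1 and 2. When a new span begins, the number of possibilities for the previous span is calculated.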
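
The notion of a span can be sketched as a simple run-length split. This helper is illustrative only (the name `splitIntoSpans` is an assumption; the actual solution counts span lengths on the fly rather than materializing them):

```cpp
#include <utility>
#include <vector>

// Splits the encoder output into maximal runs of equal values.
// Each entry is (value, run length).
std::vector<std::pair<int, int>> splitIntoSpans(const std::vector<int>& output) {
    std::vector<std::pair<int, int>> spans;
    for (int value : output) {
        if (!spans.empty() && spans.back().first == value) {
            ++spans.back().second;         // same value: extend the current span
        } else {
            spans.emplace_back(value, 1);  // value changed: start a new span
        }
    }
    return spans;
}
```

For the output 0,1,1,2 this yields the three spans (0,1), (1,2) and (2,1), written as (value, length).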

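This decision was based on a few observations:

For spans longer than 1, the minimum value of the input allowed in the first position is whatever the value of the input in the second position is. Calculating the number of possibilities for a span is straightforward combinatorics, but the standard formula requires knowing the range of the numbers and the length of the span.
Every time the value of the output changes (and a new span begins), that strongly constrains the values of the previous span:
1. When the output goes up, the only possible reason is that the previous input was the value of the new, higher output, and the input corresponding to the position of the new, higher output was even higher.
2. When the output goes down, new constraints are established, but those are a bit harder to articulate. The algorithm stores stairs (see below) in order to quantify the constraints imposed when the output goes down.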

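The aim here is to confine the range of possible values for every span. Once we do that accurately, calculating the number of combinations is straightforward.

Because the encoder backtracks looking to output a number that relates to the input in two ways, both smaller and closer, we know we can throw out numbers that are larger and farther away. After a small number appears in the output, no larger number from before that position can have any influence on what follows.

So, to confine these input ranges when the output sequence decreases, we need to store stairs: a list of increasingly larger possible values for the position in the original array. E.g. for 0,2,5,7,2,4 the stairs build up like this: 0, 0,2, 0,2,5, 0,2,5,7, 0,2, 0,2,4.

Using these bounds we can tell for sure that the number in the position of the second 2 (the next-to-last position in the example) must be in (2,5], because 5 is the next stair. If the input were greater than 5, a 5 would have been output in that position instead of a 2. Observe that if the last number in the encoded array were not 4 but 6, we would exit early returning 0, because we know that the previous number couldn't be bigger than 5.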
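
The two stair operations can be sketched in isolation as follows. This is a simplified illustration using a separate vector and hypothetical helper names (`nextStairAbove`, `pushStair`), and it assumes the list is only trimmed when a new, higher value is pushed, so that the next stair above a repeated smaller value is still available to query.

```cpp
#include <algorithm>
#include <vector>

// Smallest stair strictly greater than `value`, or `maxValue` if there is none.
// This is the upper bound for an input whose encoded output is `value`.
int nextStairAbove(const std::vector<int>& stairs, int value, int maxValue) {
    auto it = std::upper_bound(stairs.begin(), stairs.end(), value);
    return it != stairs.end() ? *it : maxValue;
}

// Drops every stair >= value, then appends value, keeping the list ascending.
void pushStair(std::vector<int>& stairs, int value) {
    auto it = std::lower_bound(stairs.begin(), stairs.end(), value);
    stairs.erase(it, stairs.end());
    stairs.push_back(value);
}
```

With the stairs 0,2,5,7 in place, `nextStairAbove(stairs, 2, Max)` gives 5, which is the (2,5] bound discussed above; a subsequent `pushStair(stairs, 4)` leaves 0,2,4. Both operations are binary searches, which is where the logarithmic factor in the running time comes from.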

The complexity is O(n*lg(min(n,m))).

Functions

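CombinationsWithReplacement - counts the number of combinations with replacement of size k from n numbers. E.g. for (3, 2) it counts 3,3, 3,2, 3,1, 2,2, 2,1, 1,1, so it returns 6. This is the same as choose(n - 1 + k, n - 1).

nextBigger - finds the next bigger element in a range. E.g. for 4 in the sub-array 1,2,3,4,5 it returns 5, and in the sub-array 1,3 it returns its parameter Max.

countSpan (lambda) - counts how many different combinations the span we have just passed can have. Consider the span 2,2 in 0,2,5,7,2,2,7.
1. When curr gets to the final position, curr is 7 and prev is the last 2 of the 2,2 span.
2. It computes the maximum and minimum possible values of the prev span. At this point the stairs consist of 2,5,7, so the maximum possible value is 5 (nextBigger after 2 in the stairs 2,5,7). A value greater than 5 in this span would have output a 5, not a 2.
3. It computes the minimum value for the span (which is the minimum value for every element in the span), which is curr at this point (remember that at this moment curr equals 7 and prev equals 2). We know for sure that in place of the final 2 output the original input must have had 7, so the minimum is 7. (This is a consequence of the "output goes up" rule. If we had 7,7,2 and curr were 2, then the minimum for the previous span (the 7,7) would be 8, which is prev + 1.) Note that in this particular example the minimum of 7 exceeds the maximum of 5, so solution ends up returning 0 for this output; the example only illustrates how the bounds are derived.
4. It adjusts the number of combinations. For a span of length L with a range of n possibilities (1 + max - min), there are $\left(\!\binom{n}{k}\!\right)$ possibilities (the number of combinations with replacement), with k being either L or L-1 depending on what follows the span.
   - For a span followed by a larger number, like 2,2,7, k = L-1, because the last position of the 2,2 span has to be 7 (the value of the first number after the span).
   - For a span followed by a smaller number, like 7,7,2, k = L, because the last element of 7,7 has no special constraints.
5. Finally, it calls CombinationsWithReplacement to find out the number of branches (or possibilities), computes the new partial result res (the remainder in the modular arithmetic we are doing), and returns the new res value and max for further handling.

solution - iterates over the given encoder output array. In the main loop, while inside a span it counts the span length, and at span boundaries it updates res by calling countSpan and possibly updates the stairs.
- If the current span consists of a bigger number than the previous one, then:
  1. It checks the validity of the next number. E.g. 0,2,5,2,7 is invalid input, because there can't be a 7 in the next-to-last position, only 3, 4 or 5.
  2. It updates the stairs. When we have seen only 0,2, the stairs are 0,2, but after the next 5 the stairs become 0,2,5.
- If the current span consists of a smaller number than the previous one, then:
  1. It updates the stairs. When we have seen only 0,2,5, our stairs are 0,2,5, but after we have seen 0,2,5,2 the stairs become 0,2.
- After the main loop it accounts for the last span by calling countSpan with -1, which triggers the "output goes down" branch of the calculation.

normalizeMod, extendedEuclidInternal, extendedEuclid, invMod - these auxiliary functions help to deal with the modular arithmetic.

For the stairs I use the storage of the encoded array, as the number of stairs never exceeds the current position.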

    #include <algorithm>
    #include <cassert>
    #include <tuple>
    #include <utility>
    #include <vector>

    const int Modulus = 1'000'000'007;

    int CombinationsWithReplacement(int n, int k);

    template <class It>
    auto nextBigger(It begin, It end, int value, int Max) {
        auto maxIt = std::upper_bound(begin, end, value);
        auto max = Max;
        if (maxIt != end) {
            max = *maxIt;
        }
        return max;
    }

    auto solution(std::vector<int> &B, const int Max) {
        auto res = 1;
        const auto size = (int)B.size();
        auto spanLength = 1;
        auto prev = 0;
        // Stairs is the list of numbers which could be smaller than the number in the next position
        const auto stairsBegin = B.begin();
        // This includes the first entry (zero) in the stairs.
        // We need to include 0 because we can meet another zero later in the encoded array
        // and we need to be able to find it in the stairs
        auto stairsEnd = stairsBegin + 1;

        auto countSpan = [&](int curr) {
            const auto max = nextBigger(stairsBegin, stairsEnd, prev, Max);
            // At the moment when we switch from the current span to the next span,
            // prev is the number from the previous span and curr is from the current one.
            // E.g. 1,1,7: when we move to the third position, curr = 7 and prev = 1.
            // Observe that, in this case, the minimum value possible in place of any of the 1's is at least 2=1+1=prev+1.
            // But if we consider the 7, then we have an even more stringent condition for the numbers in place of the 1's: it is 7
            const auto min = std::max(prev + 1, curr);
            const bool countLast = prev > curr;
            const auto branchesCount = CombinationsWithReplacement(max - min + 1, spanLength - (countLast ? 0 : 1));
            return std::make_pair(res * (long long)branchesCount % Modulus, max);
        };

        for (int i = 1; i < size; ++i) {
            const auto curr = B[i];
            if (curr == prev) {
                ++spanLength;
            } else {
                int max;
                std::tie(res, max) = countSpan(curr);
                if (prev < curr) {
                    if (curr > max) {
                        // 0,1,5,1,7 - invalid because the number in the fourth position lies in [2,5]
                        // and so in the fifth encoded position we can't have something bigger than 5
                        return 0;
                    }
                    // It is time to possibly shrink the stairs.
                    // E.g. if we had stairs 0,2,4,9,17 and the current value is 5,
                    // then we are no longer interested in 9 and 17, and we change the stairs to 0,2,4,5.
                    // That's because any number bigger than 9 or 17 is also bigger than 5.
                    const auto s = std::lower_bound(stairsBegin, stairsEnd, curr);
                    stairsEnd = s;
                    *stairsEnd++ = curr;
                } else {
                    assert(curr < prev);
                    auto it = std::lower_bound(stairsBegin, stairsEnd, curr);
                    if (it == stairsEnd || *it != curr) {
                        // 0,5,1 is an invalid sequence because the original sequence looks like this: 5,>5,>1
                        // and there is no 1 in either of the first two positions, so
                        // it can't appear in the third position of the encoded array
                        return 0;
                    }
                }
                spanLength = 1;
            }
            prev = curr;
        }
        res = countSpan(-1).first;
        return res;
    }

    template <class T>
    T normalizeMod(T a, T m) {
        if (a < 0) return a + m;
        return a;
    }

    template <class T>
    std::pair<T, std::pair<T, T>> extendedEuclidInternal(T a, T b) {
        T old_x = 1;
        T old_y = 0;
        T x = 0;
        T y = 1;
        while (true) {
            T q = a / b;
            T t = a - b * q;
            if (t == 0) {
                break;
            }
            a = b;
            b = t;
            t = x; x = old_x - x * q; old_x = t;
            t = y; y = old_y - y * q; old_y = t;
        }
        return std::make_pair(b, std::make_pair(x, y));
    }

    // Returns gcd and Bezout's coefficients
    template <class T>
    std::pair<T, std::pair<T, T>> extendedEuclid(T a, T b) {
        if (a > b) {
            if (b == 0) return std::make_pair(a, std::make_pair(1, 0));
            return extendedEuclidInternal(a, b);
        } else {
            if (a == 0) return std::make_pair(b, std::make_pair(0, 1));
            auto p = extendedEuclidInternal(b, a);
            std::swap(p.second.first, p.second.second);
            return p;
        }
    }

    template <class T>
    T invMod(T a, T m) {
        auto p = extendedEuclid(a, m);
        assert(p.first == 1);
        return normalizeMod(p.second.first, m);
    }

    int CombinationsWithReplacement(int n, int k) {
        int res = 1;
        for (long long i = n; i < n + k; ++i) {
            res = res * i % Modulus;
        }
        int denom = 1;
        for (long long i = k; i > 0; --i) {
            denom = denom * i % Modulus;
        }
        res = res * (long long)invMod(denom, Modulus) % Modulus;
        return res;
    }

    //////////////////////////////////////////////////////////////////////////
    //
    // Only the above is needed for the Codility challenge. Below is to run on the command line.
    //
    // Compile with: gcc -std=gnu++14 -lc++ -lstdc++ array_recovery.cpp
    //
    //////////////////////////////////////////////////////////////////////////

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // Usage: 0 1 2,3, 4 M
    // Last arg is M, the max value for an input.
    // Remaining args are B (the output of the encoder) separated by commas and/or spaces.
    // Parentheses and brackets are ignored, so you can use the same input form as Codility's tests: ([1,2,3], M)
    int main(int argc, char* argv[]) {
        int Max;
        std::vector<int> B;
        const char* delim = " ,[]()";

        for (int i = 1; i < argc; i++) {
            char* parse;
            parse = strtok(argv[i], delim);
            while (parse != NULL) {
                B.push_back(atoi(parse));
                parse = strtok(NULL, delim);
            }
        }
        // We need at least one encoder output value followed by M.
        if (B.size() < 2) {
            printf("Usage: %s 0 1 2,3, 4... M\n", argv[0]);
            return 1;
        }
        Max = B.back();
        B.pop_back();

        printf("%d\n", solution(B, Max));
        return 0;
    }

Let's see an example:

Max = 5
  The array is
    0 1 3 0 1 1 3
    1
    1 2..5
    1 3 4..5 (sorry for the cumbersome way of writing)
    1 3 4..5 1
  Now count:
    1 3 4..5 1 2..5, for a total of 1 3 4..5 1 2..5 >=..2.

Answer 2

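Here's an idea. One known method for constructing the output is to use a stack. We pop the stack while its top element is greater than or equal to the current element, then output the smaller element (if one exists), then push the current element onto the stack. Now, what if we attempted to do this backwards from the output?

First, we'll demonstrate the stack method with an example.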

[2, 5, 3, 7, 9, 6]
2: output 0, stack [2]
5: output 2, stack [2,5]
3: pop 5, output 2, stack [2,3]
7: output 3, stack [2,3,7]
... etc.

Final output: [0, 2, 2, 3, 7, 3]
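
The stack method traced above can be sketched as follows. The function name `encodeWithStack` is an assumption of mine; it is just the standard "nearest smaller value to the left" scan with a monotonic stack.

```cpp
#include <vector>

// Stack-based encoder sketch: the stack stays strictly increasing from
// bottom to top, so after popping, its top is the nearest smaller value.
std::vector<int> encodeWithStack(const std::vector<int>& input) {
    std::vector<int> stack;
    std::vector<int> output;
    output.reserve(input.size());
    for (int value : input) {
        // Pop everything that is greater than or equal to the current value.
        while (!stack.empty() && stack.back() >= value) {
            stack.pop_back();
        }
        output.push_back(stack.empty() ? 0 : stack.back());
        stack.push_back(value);
    }
    return output;
}
```

Running it on 2, 5, 3, 7, 9, 6 reproduces the final output shown above. Each element is pushed and popped at most once, so the encoder itself is linear in the input length.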

Now let's attempt a reconstruction! We'll use stack both as the imaginary stack and as the reconstituted input:

(Input: [2, 5, 3, 7, 9, 6])
Output: [0, 2, 2, 3, 7, 3]

* Something >3 that reached 3 in the stack
stack = [3, 3 < *]

* Something >7 that reached 7 in the stack
but both of those would've popped before 3
stack = [3, 7, 7 < x, 3 < * <= x]

* Something >3, 7 qualifies
stack = [3, 7, 7 < x, 3 < * <= x]

* Something >2, 3 qualifies
stack = [2, 3, 7, 7 < x, 3 < * <= x]

* Something >2 and >=3 since 3 reached 2
stack = [2, 2 < *, 3, 7, 7 < x, 3 < * <= x]

Let's attempt your examples:

Example 1:

[0, 0, 0, 2, 3, 4]

* Something >4
stack = [4, 4 < *]

* Something >3, 4 qualifies
stack = [3, 4, 4 < *]

* Something >2, 3 qualifies
stack = [2, 3, 4, 4 < *]

* The rest is non-increasing with lowerbound 2
stack = [y >= x, x >= 2, 2, 3, 4, >4]

Example 2:

[0, 0, 0, 4]

* Something >4
stack [4, 4 < *]

* Non-increasing
stack = [z >= y, y >= 4, 4, 4 < *]

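Calculating the number of combinations is achieved by multiplying together the possibilities for all the sections. A section is either a bounded single cell, or a bounded, non-increasing subarray of one or more cells. To calculate the latter we use the multi-choose binomial, (n + k - 1) choose (k - 1). Consider that we can express the differences between the cells of a bounded, non-increasing sequence of 3 cells as: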

(ub - cell_3) + (cell_3 - cell_2) + (cell_2 - cell_1) + (cell_1 - lb) = ub - lb

Then the number of ways to distribute ub - lb into (x + 1) cells is

(n + k - 1) choose (k - 1)
or
(ub - lb + x) choose x

For example, the number of non-increasing sequences between
(3,4) in two cells is (4 - 3 + 2) choose 2 = 3: [3,3] [4,3] [4,4]

And the number of non-increasing sequences between
(3,4) in three cells is (4 - 3 + 3) choose 3 = 4: [3,3,3] [4,3,3] [4,4,3] [4,4,4]
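
As a hedged cross-check of that formula, the sketch below compares (ub - lb + x) choose x against a direct enumeration of non-increasing sequences. All names are mine, and the plain `long long` binomial is only suitable for small arguments (no modulus, no overflow protection).

```cpp
#include <algorithm>
#include <vector>

// Binomial coefficient via one in-place row of Pascal's triangle (small inputs only).
long long binomial(int n, int k) {
    if (k < 0 || k > n) return 0;
    std::vector<long long> row(k + 1, 0);
    row[0] = 1;
    for (int i = 1; i <= n; ++i) {
        for (int j = std::min(i, k); j >= 1; --j) {
            row[j] += row[j - 1];  // C(i, j) = C(i-1, j) + C(i-1, j-1)
        }
    }
    return row[k];
}

// Closed form: non-increasing sequences of `cells` values within [lb, ub].
long long countNonIncreasing(int lb, int ub, int cells) {
    return binomial(ub - lb + cells, cells);
}

// Direct enumeration: choose the first (largest) cell, then recurse on the rest.
long long countNonIncreasingBruteForce(int lb, int ub, int cells) {
    if (cells == 0) return 1;
    long long total = 0;
    for (int first = lb; first <= ub; ++first) {
        total += countNonIncreasingBruteForce(lb, first, cells - 1);
    }
    return total;
}
```

For the bounds (3,4), `countNonIncreasing` gives 3 for two cells and 4 for three cells, matching the sequences listed above.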

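(Explanation attributed to Brian M. Scott.)

A rough JavaScript sketch (the code is unreliable; it is only meant to illustrate the encoding. The encoder lists [lower_bound, upper_bound], or a non-increasing sequence as [non_inc, length, lower_bound, upper_bound]):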

function f(A, M){
  console.log(JSON.stringify(A), M);
  let i = A.length - 1;
  let last = A[i];
  let s = [[last,last]];
  if (A[i-1] == last){
    let d = 1;
    s.splice(1,0,['non_inc',d++,last,M]);
    while (i > 0 && A[i-1] == last){
      s.splice(1,0,['non_inc',d++,last,M]);
      i--
    }
  } else {
    s.push([last+1,M]);
    i--;
  }
  if (i == 0)
    s.splice(0,1);

  for (; i>0; i--){
    let x = A[i];

    if (x < s[0][0])
      s = [[x,x]].concat(s);

    if (x > s[0][0]){
      let [l, _l] = s[0];
      let [lb, ub] = s[1];
      s[0] = [x+1, M];
      s[1] = [lb, x];
      s = [[l,_l], [x,x]].concat(s);
    }

    if (x == s[0][0]){
      let [l,_l] = s[0];
      let [lb, ub] = s[1];
      let d = 1;
      s.splice(0,1);
      while (i > 0 && A[i-1] == x){
        s = [['non_inc', d++, lb, M]].concat(s);
        i--;
      }
      if (i > 0)
        s = [[l,_l]].concat(s);
    }
  }

  // dirty fix
  if (s[0][0] == 0)
    s.splice(0,1);

  return s; 
}

var a = [2, 5, 3, 7, 9, 6]
var b = [0, 2, 2, 3, 7, 3]
console.log(JSON.stringify(a));
console.log(JSON.stringify(f(b,10)));
b = [0,0,0,4]
console.log(JSON.stringify(f(b,10)));
b = [0,2,0,0,0,4]
console.log(JSON.stringify(f(b,10)));
b = [0,0,0,2,3,4]
console.log(JSON.stringify(f(b,10)));
b = [0,2,2]
console.log(JSON.stringify(f(b,4)));
b = [0,3,5,6]
console.log(JSON.stringify(f(b,10)));
b = [0,0,3,0]
console.log(JSON.stringify(f(b,10)));


Reconstruct input to encoder from output
