对于启发式算法,我需要一个接一个地评估某个集合的组合,直到达到停止标准。
由于它们很多,目前我使用以下内存高效迭代器块生成它们(受python的itertools.combinations
启发):
public static IEnumerable<T[]> GetCombinations<T>(this IList<T> pool, int r)
{
int n = pool.Count;
if (r > n)
throw new ArgumentException("r cannot be greater than pool size");
int[] indices = Enumerable.Range(0, r).ToArray();
yield return indices.Select(idx => pool[idx]).ToArray();
while (true)
{
int i;
for (i = r - 1; i >= 0; i--)
if (indices[i] != i + n - r)
break;
if (i < 0)
break;
indices[i] += 1;
for (int j = i + 1; j < r; j++)
indices[j] = indices[j - 1] + 1;
yield return indices.Select(idx => pool[idx]).ToArray();
}
}
问题是,为了大大提高启发式的效率,我需要生成这些组合,它们按索引的总和排序(换句话说,我需要首先生成,包含集合的第一个元素的组合)。
e.g。
考虑集合S = {0,1,2,3,4,5}
(为了简单起见,我选择这个集合,因为元素和它们的索引重合)
从给定算法生成的r=4
个数字的所有可能组合是:
(0, 1, 2, 3) SUM: 6
(0, 1, 2, 4) SUM: 7
(0, 1, 2, 5) SUM: 8
(0, 1, 3, 4) SUM: 8
(0, 1, 3, 5) SUM: 9
(0, 1, 4, 5) SUM: 10
(0, 2, 3, 4) SUM: 9
(0, 2, 3, 5) SUM: 10
(0, 2, 4, 5) SUM: 11
(0, 3, 4, 5) SUM: 12
(1, 2, 3, 4) SUM: 10
(1, 2, 3, 5) SUM: 11
(1, 2, 4, 5) SUM: 12
(1, 3, 4, 5) SUM: 13
(2, 3, 4, 5) SUM: 14
其中,正如您所看到的,组合不是严格按升序排序。
预期结果如下:
(具有相同总和的组合的顺序并不重要)
(0, 1, 2, 3) SUM: 6
(0, 1, 2, 4) SUM: 7
(0, 1, 2, 5) SUM: 8
(0, 1, 3, 4) SUM: 8
(0, 1, 3, 5) SUM: 9
(0, 2, 3, 4) SUM: 9
(0, 1, 4, 5) SUM: 10
(0, 2, 3, 5) SUM: 10
(1, 2, 3, 4) SUM: 10
(0, 2, 4, 5) SUM: 11
(1, 2, 3, 5) SUM: 11
(0, 3, 4, 5) SUM: 12
(1, 2, 4, 5) SUM: 12
(1, 3, 4, 5) SUM: 13
(2, 3, 4, 5) SUM: 14
一个简单的解决方案是生成所有组合,然后根据它们的总和对它们进行排序;但这并不是真正有效/可行,因为随着n
的增长,组合的数量变得很大。
我也快速了解组合格雷码,但我找不到任何适合此问题的人。
你对如何实现这样的事情有所了解吗?
编辑:
这个问题有另一种(不幸的是不容易)的表述
给定一组S
和一个数字r
,所有可能的总和都很容易找到,因为它们只是r
的第一个S
元素总和中的所有数字} r
的最后S
个元素的总和。
话虽如此,如果,对于每个总和T
,我们可以有效地找到所有具有和T
的组合,我们解决了原始问题,因为我们只是按升序生成它们。
¹有效意味着我不想生成所有组合并丢弃具有不同总和的组合。
编辑2:
在@EricLippert建议之后,我创建了以下代码:
public static IEnumerable<T[]>
GetCombinationsSortedByIndexSum<T>(this IList<T> pool, int r)
{
int n = pool.Count;
if (r > n)
throw new ArgumentException("r cannot be greater than pool size");
int minSum = ((r - 1) * r) / 2;
int maxSum = (n * (n + 1)) / 2 - ((n - r - 1) * (n - r)) / 2;
for (int sum = minSum; sum <= maxSum; sum++)
{
foreach (var indexes in AllMonotIncrSubseqOfLenMWhichSumToN(0, n - 1, r, sum))
yield return indexes.Select(x => pool[x]).ToArray();
}
}
static IEnumerable<IEnumerable<int>>
AllMonotIncrSubseqOfLenMWhichSumToN(int seqFirstElement, int seqLastElement, int m, int n)
{
for (int i = seqFirstElement; i <= seqLastElement - m + 1; i++)
{
if (m == 1)
{
if (i == n)
yield return new int[] { i };
}
else
{
foreach (var el in AllMonotIncrSubseqOfLenMWhichSumToN(i + 1, seqLastElement, m - 1, n - i))
yield return new int[] { i }.Concat(el);
}
}
}
这很好(希望是Eric的意思:P)但是我仍然担心递归方法的复杂性。事实上,似乎我们正在重新生成每个总和的所有组合,而不是总结到所需值的那些。
为了降低内部函数的复杂性,我找到了一种通过使用有效的上限和下限来限制迭代的方法(现在很难说它的复杂性是多少)。
检查my answer以查看最终代码。
答案 0 :(得分:5)
我想到的解决方案是:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
// Preconditions:
// * items is a sequence of non-negative monotone increasing integers
// * n is the number of items to be in the subsequence
// * sum is the desired sum of that subsequence.
// Result:
// A sequence of subsequences of the original sequence where each
// subsequence has n items and the given sum.
static IEnumerable<IEnumerable<int>> M(IEnumerable<int> items, int sum, int n)
{
// Let's start by taking some easy outs. If the sum is negative
// then there is no solution. If the number of items in the
// subsequence is negative then there is no solution.
if (sum < 0 || n < 0)
yield break;
// If the number of items in the subsequence is zero then
// the only possible solution is if the sum is zero.
if (n == 0)
{
if (sum == 0)
yield return Enumerable.Empty<int>();
yield break;
}
// If the number of items is less than the required number of
// items, there is no solution.
if (items.Count() < n)
yield break;
// We have at least n items in the sequence, and
// and n is greater than zero, so First() is valid:
int first = items.First();
// We need n items from a monotone increasing subsequence
// that have a particular sum. We might already be too
// large to meet that requirement:
if (n * first > sum)
yield break;
// There might be some solutions that involve the first element.
// Find them all.
foreach(var subsequence in M(items.Skip(1), sum - first, n - 1))
yield return new[]{first}.Concat(subsequence);
// And there might be some solutions that do not involve the first element.
// Find them all.
foreach(var subsequence in M(items.Skip(1), sum, n))
yield return subsequence;
}
static void Main()
{
int[] x = {0, 1, 2, 3, 4, 5};
for (int i = 0; i <= 15; ++i)
foreach(var seq in M(x, i, 4))
Console.WriteLine("({0}) SUM {1}", string.Join(",", seq), i);
}
}
输出是您想要的输出。
我没有尝试优化它。对它进行分析并查看大部分时间花在哪里会很有趣。
更新:为了好玩,我写了一个使用不可变堆栈而不是任意可枚举的版本。享受!
using System;
using System.Collections.Generic;
using System.Linq;
abstract class ImmutableList<T> : IEnumerable<T>
{
public static readonly ImmutableList<T> Empty = new EmptyList();
private ImmutableList() {}
public abstract bool IsEmpty { get; }
public abstract T Head { get; }
public abstract ImmutableList<T> Tail { get; }
public ImmutableList<T> Push(T newHead)
{
return new List(newHead, this);
}
private sealed class EmptyList : ImmutableList<T>
{
public override bool IsEmpty { get { return true; } }
public override T Head { get { throw new InvalidOperationException(); } }
public override ImmutableList<T> Tail { get { throw new InvalidOperationException(); } }
}
private sealed class List : ImmutableList<T>
{
private readonly T head;
private readonly ImmutableList<T> tail;
public override bool IsEmpty { get { return false; } }
public override T Head { get { return head; } }
public override ImmutableList<T> Tail { get { return tail; } }
public List(T head, ImmutableList<T> tail)
{
this.head = head;
this.tail = tail;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
for (ImmutableList<T> current = this; !current.IsEmpty; current = current.Tail)
yield return current.Head;
}
}
class Program
{
// Preconditions:
// * items is a sequence of non-negative monotone increasing integers
// * n is the number of items to be in the subsequence
// * sum is the desired sum of that subsequence.
// Result:
// A sequence of subsequences of the original sequence where each
// subsequence has n items and the given sum.
static IEnumerable<ImmutableList<int>> M(ImmutableList<int> items, int sum, int n)
{
// Let's start by taking some easy outs. If the sum is negative
// then there is no solution. If the number of items in the
// subsequence is negative then there is no solution.
if (sum < 0 || n < 0)
yield break;
// If the number of items in the subsequence is zero then
// the only possible solution is if the sum is zero.
if (n == 0)
{
if (sum == 0)
yield return ImmutableList<int>.Empty;
yield break;
}
// If the number of items is less than the required number of
// items, there is no solution.
if (items.Count() < n)
yield break;
// We have at least n items in the sequence, and
// and n is greater than zero.
int first = items.Head;
// We need n items from a monotone increasing subsequence
// that have a particular sum. We might already be too
// large to meet that requirement:
if (n * first > sum)
yield break;
// There might be some solutions that involve the first element.
// Find them all.
foreach(var subsequence in M(items.Tail, sum - first, n - 1))
yield return subsequence.Push(first);
// And there might be some solutions that do not involve the first element.
// Find them all.
foreach(var subsequence in M(items.Tail, sum, n))
yield return subsequence;
}
static void Main()
{
ImmutableList<int> x = ImmutableList<int>.Empty.Push(5).
Push(4).Push(3).Push(2).Push(1).Push(0);
for (int i = 0; i <= 15; ++i)
foreach(var seq in M(x, i, 4))
Console.WriteLine("({0}) SUM {1}", string.Join(",", seq), i);
}
}
答案 1 :(得分:1)
为了完整和清晰起见,我将发布我的最终代码:
// Given a pool of elements returns all the
// combinations of the groups of lenght r in pool,
// such that the combinations are ordered (ascending) by the sum of
// the indexes of the elements.
// e.g. pool = {A,B,C,D,E} r = 3
// returns
// (A, B, C) indexes: (0, 1, 2) sum: 3
// (A, B, D) indexes: (0, 1, 3) sum: 4
// (A, B, E) indexes: (0, 1, 4) sum: 5
// (A, C, D) indexes: (0, 2, 3) sum: 5
// (A, C, E) indexes: (0, 2, 4) sum: 6
// (B, C, D) indexes: (1, 2, 3) sum: 6
// (A, D, E) indexes: (0, 3, 4) sum: 7
// (B, C, E) indexes: (1, 2, 4) sum: 7
// (B, D, E) indexes: (1, 3, 4) sum: 8
// (C, D, E) indexes: (2, 3, 4) sum: 9
public static IEnumerable<T[]>
GetCombinationsSortedByIndexSum<T>(this IList<T> pool, int r)
{
int n = pool.Count;
if (r > n)
throw new ArgumentException("r cannot be greater than pool size");
int minSum = F(r - 1);
int maxSum = F(n) - F(n - r - 1);
for (int sum = minSum; sum <= maxSum; sum++)
{
foreach (var indexes in AllSubSequencesWithGivenSum(0, n - 1, r, sum))
yield return indexes.Select(x => pool[x]).ToArray();
}
}
// Given a start element and a last element of a sequence of consecutive integers
// returns all the monotonically increasing subsequences of length "m" having sum "sum"
// e.g. seqFirstElement = 1, seqLastElement = 5, m = 3, sum = 8
// returns {1,2,5} and {1,3,4}
static IEnumerable<IEnumerable<int>>
AllSubSequencesWithGivenSum(int seqFirstElement, int seqLastElement, int m, int sum)
{
int lb = sum - F(seqLastElement) + F(seqLastElement - m + 1);
int ub = sum - F(seqFirstElement + m - 1) + F(seqFirstElement);
lb = Math.Max(seqFirstElement, lb);
ub = Math.Min(seqLastElement - m + 1, ub);
for (int i = lb; i <= ub; i++)
{
if (m == 1)
{
if (i == sum) // this check shouldn't be necessary anymore since LB/UB should automatically exclude wrong solutions
yield return new int[] { i };
}
else
{
foreach (var el in AllSubSequencesWithGivenSum(i + 1, seqLastElement, m - 1, sum - i))
yield return new int[] { i }.Concat(el);
}
}
}
// Formula to compute the sum of the numbers from 0 to n
// e.g. F(4) = 0 + 1 + 2 + 3 + 4 = 10
static int F(int n)
{
return (n * (n + 1)) / 2;
}