看看锯齿状阵列中的每个组合

时间:2016-02-18 22:31:36

标签: c# arrays

在具有未知行数和每行中未知列数的锯齿状数组中循环遍历每个可能元素组合的最快方法是什么?

这个阵列......

char[][] myArray = new char[][]{
    new char[] {'A', 'B'},
    new char[] {'C', 'D'},
    new char[] {'E', 'F'}
};

...将返回组合ACE,ACF,ADE,ADF,BCE,BCF,BDE和BDF。

使用C#实现此目的的最快方法是什么?

2 个答案:

答案 0 :(得分:2)

这是IMO一个具有最小分配的好算法(避免字符串连接):

public static class Algorithms
{
    public static IEnumerable<string> GetCombinations(this char[][] input)
    {
        var result = new char[input.Length]; 
        var indices = new int[input.Length];
        for (int pos = 0, index = 0; ;)
        {
            for (; pos < input.Length; pos++, index = 0)
            {
                indices[pos] = index;
                result[pos] = input[pos][index];
            }
            yield return new string(result);
            do
            {
                if (pos == 0) yield break;
                index = indices[--pos] + 1;
            }
            while (index >= input[pos].Length);
        }
    }
}

用法:

char[][] myArray = new char[][]{
    new char[] {'A', 'B'},
    new char[] {'C', 'D'},
    new char[] {'E', 'F'}
};
var combinations = myArray.GetCombinations();

基本上它是这样的

的展开实现
from c1 in input[0]
from c2 in input[1]
...
from cN in input[N]
select new string(new [] { c1, c2, ..., cN })

P.S 如果您确实需要char[]类型结果,只需将签名更改为

public static IEnumerable<char[]> GetCombinations(this char[][] input)

并从new string中移除yield

但请注意,在这种情况下,如果需要存储组合数组,则可枚举的使用者应负责制作组合数组的副本。从公共API设计的角度来看,产生共享的内部变异数组是坏的(邪恶的),但对于内部性能场景是完美的。

更新:由于问题与性能有关,我做了一个测试,将上述算法(A)的字符串版本与Enigmativity answer的LINQ解决方案进行比较(B)。我已经在VS之外的Release build exe中对不同数量的26个字母组合字母进行了运行,结果如下:

A: N=2 Count=         676 Time=00:00:00.0010139 Memory=     16K
B: N=2 Count=         676 Time=00:00:00.0042598 Memory=    233K

A: N=3 Count=      17,576 Time=00:00:00.0004310 Memory=    348K
B: N=3 Count=      17,576 Time=00:00:00.0126294 Memory=  2,185K

A: N=4 Count=     456,976 Time=00:00:00.0111155 Memory=  1,496K
B: N=4 Count=     456,976 Time=00:00:00.4019500 Memory=  2,104K

A: N=5 Count=  11,881,376 Time=00:00:00.2813208 Memory=  1,995K
B: N=5 Count=  11,881,376 Time=00:00:13.4492150 Memory=  2,014K

A: N=6 Count= 308,915,776 Time=00:00:07.5473890 Memory=  2,059K
B: N=6 Count= 308,915,776 Time=00:07:37.2985051 Memory=    455K

以下是有兴趣的完整测试代码:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Linq;
using System.Threading;

namespace Samples
{
    public static class Algorithms
    {
        public static IEnumerable<string> GetCombinationsA(this char[][] input)
        {
            var result = new char[input.Length];
            var indices = new int[input.Length];
            for (int pos = 0, index = 0; ;)
            {
                for (; pos < input.Length; pos++, index = 0)
                {
                    indices[pos] = index;
                    result[pos] = input[pos][index];
                }
                yield return new string(result);
                do
                {
                    if (pos == 0) yield break;
                    index = indices[--pos] + 1;
                }
                while (index >= input[pos].Length);
            }
        }

        public static IEnumerable<string> GetCombinationsB(this char[][] input)
        {
            Func<IEnumerable<IEnumerable<char>>, IEnumerable<IEnumerable<char>>> combine = null;
            combine = css =>
                        from c in css.First()
                        from cs in css.Skip(1).Any()
                            ? combine(css.Skip(1))
                            : new[] { Enumerable.Empty<char>() }
                        select new[] { c }.Concat(cs);
            return combine(input).Select(c => String.Join("", c));
        }
    }

    class Program
    {
        class Algorithm
        {
            public string Name;
            public Func<char[][], IEnumerable<string>> Func;
        }

        static void Main(string[] args)
        {
            Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
            Algorithm[] algorithms =
            {
                new Algorithm { Name = "A", Func = Algorithms.GetCombinationsA },
                new Algorithm { Name = "B", Func = Algorithms.GetCombinationsB },
            };
            char[][] myArray = { new char[] {'A', 'B'}, new char[] {'C', 'D'}, new char[] {'E', 'F'} };
            foreach (var algo in algorithms)
                algo.Func(myArray);
            var chars = Enumerable.Range('A', 'Z' - 'A' + 1).Select(c => (char)c).ToArray();
            for (int n = 2; n < 7; n++)
            {
                var input = Enumerable.Range(0, n).Select(_ => chars).ToArray();
                foreach (var algo in algorithms)
                    Test(algo, input);
                Console.WriteLine();
            }
            Console.WriteLine("Done.");
            Console.ReadLine();
        }

        static void Test(Algorithm algo, char[][] input)
        {
            GC.Collect(); GC.WaitForPendingFinalizers();
            GC.Collect(); GC.WaitForPendingFinalizers();
            var totalMem = GC.GetTotalMemory(false);
            var timer = Stopwatch.StartNew();
            long count = 0;
            foreach (var comb in algo.Func(input)) count++;
            timer.Stop();
            totalMem = GC.GetTotalMemory(false) - totalMem;
            Console.WriteLine($"{algo.Name}: N={input.Length} Count={count,12:n0} Time={timer.Elapsed} Memory={totalMem / 1024,7:n0}K");
        }
    }
}

答案 1 :(得分:1)

这非常有用:

Func<IEnumerable<IEnumerable<char>>, IEnumerable<IEnumerable<char>>> combine = null;
combine = css =>
            from c in css.First()
            from cs in css.Skip(1).Any()
                ? combine(css.Skip(1)) 
                : new [] { Enumerable.Empty<char>() }
            select new [] { c }.Concat(cs);

如果我将运行数据的结果转换为字符串:

var result =
    String.Join(
        ", ",
        combine(myArray)
            .Select(c => String.Join("", c)));

...然后我得到了这个结果:ACE, ACF, ADE, ADF, BCE, BCF, BDE, BDF

这计算得非常快,但知道现实世界的输入是否足够快就会很有趣。