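I'd like to refactor a C# method into a C function in an attempt to gain some speed, and then call the C DLL from C# so that my program can use the functionality.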









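My thinking in using C (or C++, as people have pointed out) is that it leaves more room for performance tuning. A direct port probably wouldn't provide any gain, but it opens the way to more involved methods for squeezing extra speed out of the code. Even a small gain per iteration would add up to a measurable improvement.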





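// _currentGroupList, _currentSet and outputList are members of the
// enclosing class (their declarations are not shown in the question).
public List<List<int>> powerset(List<int> currentGroupList)
{
    _currentGroupList = currentGroupList;
    int max;
    int count;

    //Count the objects in the group
    count = _currentGroupList.Count;
    max = (int)Math.Pow(2, count);

    //outer loop: one iteration per subset
    for (int i = 0; i < max; i++)
    {
        _currentSet = new List<int>();

        //inner loop: element j is included when bit j of i is clear
        for (int j = 0; j < count; j++)
        {
            if ((i & (1 << j)) == 0)
            {
                _currentSet.Add(_currentGroupList[j]);
            }
        }
        outputList.Add(_currentSet);
    }
    return outputList;
}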





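hughdbrown's implementation works, but I think I need to store the results (or at least part of them) at some point. It sounds like memory limits will kick in long before run time becomes a real problem. Partly because of this, I think I can get away with using bytes instead of integers, which gives me more potential storage.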
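To put rough numbers on the bytes-versus-integers idea, here is a small sketch of the arithmetic. It assumes only the element payload counts (no per-list or per-array overhead, which in practice adds more), and the function names are made up for illustration. Each element appears in exactly half of the 2^n subsets, so the total number of stored elements is n * 2^(n-1).

```cpp
#include <cstdint>

// Total element slots across all subsets of an n-element set.
// Each element appears in half of the 2^n subsets: n * 2^(n-1).
// Assumes n is small enough (below about 58) that the result fits in 64 bits.
std::uint64_t powerset_element_slots(unsigned n) {
    if (n == 0) return 0;
    return static_cast<std::uint64_t>(n) << (n - 1);
}

// Payload bytes only. Container overhead is deliberately ignored,
// so real memory use will be higher than this estimate.
std::uint64_t powerset_payload_bytes(unsigned n, std::uint64_t bytes_per_element) {
    return powerset_element_slots(n) * bytes_per_element;
}
```

For 24 elements this comes to roughly 805 MB of payload with 4-byte ints and roughly 201 MB with single bytes, so switching to bytes buys about two more elements before hitting the same ceiling.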


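Also, make sure that moving to C/C++ is really what you need to do for speed to begin with. Instrument the original C# method (standalone, executed via unit tests), instrument the new C/C++ method (again, standalone via unit tests), and see what the real-world difference is.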

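The reason I bring this up is that I fear it may be a Pyrrhic victory: using Smokey Bacon's advice, you get your list class and you're in "faster" C++, but there's still a cost to calling that DLL. Bouncing out of the runtime with P/Invoke or COM interop carries a fairly substantial performance cost.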



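If you're calling this loop repeatedly, you need to make absolutely sure that the entire loop logic is encapsulated in a single interop call; otherwise the marshalling overhead (as others have mentioned) will definitely kill you.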
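As a rough sketch of what "the whole loop in one interop call" could look like on the native side: the function below takes a flat array once, runs the entire subset enumeration natively, and hands back a single number. The function name and the task (counting subsets that sum to a target) are assumptions made up for illustration; the question doesn't say what is done with each subset.

```cpp
#include <cstdint>

// Hypothetical export: name and task are illustrative assumptions.
// The entire enumeration runs inside this one call, so the managed/native
// boundary is crossed once, and only blittable types (a pointer to ints
// and plain integers) are passed across it.
extern "C" std::int64_t count_subsets_with_sum(const std::int32_t* items,
                                               std::int32_t count,
                                               std::int64_t target) {
    if (items == nullptr || count < 0 || count > 30) return -1;  // reject bad input

    std::int64_t matches = 0;
    const std::uint32_t max = 1u << count;
    for (std::uint32_t mask = 0; mask < max; ++mask) {
        std::int64_t sum = 0;
        for (std::int32_t j = 0; j < count; ++j) {
            if (mask & (1u << j)) sum += items[j];
        }
        if (sum == target) ++matches;
    }
    return matches;
}
```

On Windows the function would also need to be exported from the DLL (for example with `__declspec(dllexport)`), and the C# side would declare it once with `DllImport`. The design point is simply that nothing inside the loop crosses back into managed code.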

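Given the description of the problem, I do think the issue isn't that C#/.NET is "slower" than C, but more likely that the code needs to be optimized. As another poster here mentioned, you can use pointers in C# to seriously boost performance in this kind of loop, without the need for marshalling. I would look into that first, before jumping into the complex world of interop.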

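If you're looking to use C for a performance gain, most likely you're planning to get it through the use of pointers. C# does allow pointers, via the unsafe keyword. Have you considered that?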



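Take a look at "Native code without sacrificing .NET performance" for some interop options. There are ways to interop without too much of a performance loss, but they only work with the simplest of data types.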



Also, if you're already going to mix native and managed code, may I suggest creating your library with C++/CLI. Below is a simple example. Note that I'm not a C++/CLI guy, and this code doesn't do anything useful... it's just meant to show how easily you can mix native and managed code.

#include "stdafx.h"

using namespace System;

System::Collections::Generic::List<int> ^MyAlgorithm(System::Collections::Generic::List<int> ^sourceList);

int main(array<System::String ^> ^args)
{
    System::Collections::Generic::List<int> ^intList = gcnew System::Collections::Generic::List<int>();

    // Sample values, chosen only for illustration.
    intList->Add(1);
    intList->Add(2);
    intList->Add(3);

    Console::WriteLine("Before Call");
    for each(int i in intList)
    {
        Console::WriteLine(i);
    }

    System::Collections::Generic::List<int> ^modifiedList = MyAlgorithm(intList);

    Console::WriteLine("After Call");
    for each(int i in modifiedList)
    {
        Console::WriteLine(i);
    }

    return 0;
}

System::Collections::Generic::List<int> ^MyAlgorithm(System::Collections::Generic::List<int> ^sourceList)
{
    int* nativeInts = new int[sourceList->Count];

    int nativeIntArraySize = sourceList->Count;

    //Managed to Native
    for(int i=0; i<sourceList->Count; i++)
    {
        nativeInts[i] = sourceList[i];
    }

    //Do Something to native ints (placeholder operation)
    for(int i=0; i<nativeIntArraySize; i++)
    {
        nativeInts[i]++;
    }

    //Native to Managed
    System::Collections::Generic::List<int> ^returnList = gcnew System::Collections::Generic::List<int>();
    for(int i=0; i<nativeIntArraySize; i++)
    {
        returnList->Add(nativeInts[i]);
    }

    delete[] nativeInts; // release the native buffer
    return returnList;
}

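What makes you think you'll gain speed by calling into C code? C isn't magically faster than C#. It can be, of course, but it can just as easily be slow (or even slower). Especially once you factor in the P/Invoke calls into native code, it's far from certain that this approach will speed anything up.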

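In any case, C doesn't have anything like List. It has raw arrays and pointers (and you could argue that int** is more or less equivalent), but you're probably better off using C++, which does have equivalent data structures, std::vector in particular. There is no simple way to expose that data to C#, however, since it will be scattered pretty much at random (each list is a pointer to some dynamically allocated memory somewhere).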



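I can see a few things in the algorithm that seem suboptimal. Constructing a list of lists isn't free. Perhaps you could create a single list and use different offsets to represent each sublist. Or perhaps using 'yield return' and IEnumerable instead of explicitly constructing the lists would be faster.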
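The "single list plus offsets" idea could look something like the sketch below. It is written in C++ since that is where the port is headed; the struct and function names are made up for illustration, and it assumes the input is small enough for the whole result to fit in memory. All subsets live back to back in one buffer, and a second array records where each one starts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout: subset k occupies items[offsets[k]] up to
// (but not including) items[offsets[k + 1]].
struct FlatPowerSet {
    std::vector<int> items;            // every subset, stored back to back
    std::vector<std::size_t> offsets;  // start of each subset, plus one end marker
};

FlatPowerSet build_flat_powerset(const std::vector<int>& set) {
    const std::size_t n = set.size();  // assumed small (memory is the real limit)
    const std::size_t subsets = std::size_t{1} << n;

    FlatPowerSet out;
    out.offsets.reserve(subsets + 1);
    out.items.reserve(n == 0 ? 0 : n << (n - 1));  // n * 2^(n-1) elements in total

    for (std::size_t mask = 0; mask < subsets; ++mask) {
        out.offsets.push_back(out.items.size());
        for (std::size_t j = 0; j < n; ++j) {
            if (mask & (std::size_t{1} << j)) out.items.push_back(set[j]);
        }
    }
    out.offsets.push_back(out.items.size());  // end marker for the last subset
    return out;
}
```

This makes exactly two allocations regardless of the number of subsets, and two contiguous buffers are also far easier to hand across an interop boundary than a scattered list of lists.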


Even though it's "unsafe", it's no less safe than C/C++, and it's dramatically easier to get right.

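This returns one set of the powerset at a time. It is based on the Python code here. It works for powersets of more than 32 elements. If you need fewer than 32, you can change long to int. It is pretty fast: quicker than my previous algorithm, and quicker than P Daddy's code (which I modified to use yield return).

using System.Collections.Generic;
using System.Linq;

static class PowerSet4<T>
{
    static public IEnumerable<IList<T>> powerset(T[] currentGroupList)
    {
        int count = currentGroupList.Length;
        // Map each single-bit mask to the element it stands for.
        Dictionary<long, T> powerToIndex = new Dictionary<long, T>();
        long mask = 1L;
        for (int i = 0; i < count; i++)
        {
            powerToIndex[mask] = currentGroupList[i];
            mask <<= 1;
        }

        Dictionary<long, T> result = new Dictionary<long, T>();
        yield return result.Values.ToArray();

        long max = 1L << count;
        for (long i = 1L; i < max; i++)
        {
            // Lowest set bit of i: toggle that element in or out of the set.
            long key = i & -i;
            if (result.ContainsKey(key))
                result.Remove(key);
            else
                result[key] = powerToIndex[key];
            yield return result.Values.ToArray();
        }
    }
}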


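I really think that using yield return is the change that makes calculating large powersets possible. Allocating large amounts of memory up front increases the run time dramatically and causes the algorithm to fail for lack of memory very early on. The original poster should figure out how many sets of a powerset he needs at once. Holding all of them is not really an option with more than 24 elements.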

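As a bonus, the sets are returned in a more "natural" order. It returns the subsets of the set {1 2 3} in the same order you listed them in your question. That's not the main point, just a side effect of the algorithm used.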




/*
 * Made it static, because it shouldn't really use or modify state data.
 * Making it static also saves a tiny bit of call time, because it doesn't
 * have to receive an extra "this" pointer.  Also, accessing a local
 * parameter is a tiny bit faster than accessing a class member, because
 * dereferencing the "this" pointer is not free.
 *
 * Made it generic so that the same code can handle sets of any type.
 */
static IList<IList<T>> PowerSet<T>(IList<T> set){
    if(set == null)
        throw new ArgumentNullException("set");

    /*
     * Caveat:
     * If set.Count > 30, this function pukes all over itself without so
     * much as wiping up afterwards.  Even for 30 elements, though, the
     * result set is about 68 GB (if "set" is comprised of ints).  24 or
     * 25 elements is a practical limit for current hardware.
     */
    int   setSize     = set.Count;
    int   subsetCount = 1 << setSize; // MUCH faster than (int)Math.Pow(2, setSize)
    T[][] rtn         = new T[subsetCount][];
    /*
     * We don't really need dynamic list allocation.  We can calculate
     * in advance the number of subsets ("subsetCount" above), and
     * the size of each subset (0 through setSize).  The performance
     * of List<> is pretty horrible when the initial size is not
     * guessed well.
     */

    int subsetIndex = 0;
    for(int subsetSize = 0; subsetSize <= setSize; subsetSize++){
        /*
         * The "indices" array below is part of how we implement the
         * "natural" ordering of the subsets.  For a subset of size 3,
         * for example, we initialize the indices array with {0, 1, 2};
         * Later, we'll increment each index until we reach setSize,
         * then carry over to the next index.  So, assuming a set size
         * of 5, the second iteration will have indices {0, 1, 3}, the
         * third will have {0, 1, 4}, and the fourth will involve a carry,
         * so we'll have {0, 2, 3}.
         */
        int[] indices = new int[subsetSize];
        for(int i = 1; i < subsetSize; i++)
            indices[i] = i;

        /*
         * Now we'll iterate over all the subsets we need to make for the
         * current subset size.  The number of subsets of a given size
         * is easily determined with combination (nCr).  In other words,
         * if I have 5 items in my set and I want all subsets of size 3,
         * I need 5-pick-3, or 5C3 = 5! / 3!(5 - 3)! = 10.
         */
        for(int i = Combination(setSize, subsetSize); i > 0; i--){
            /*
             * Copy the items from the input set according to the
             * indices we've already set up.  Alternatively, if you
             * just wanted the indices in your output, you could
             * just dup the index array here (but make sure you dup!
             * Otherwise the setup step at the bottom of this for
             * loop will mess up your output list!)  You'll also want
             * to change the function's return type to
             * IList<IList<int>> in that case.
             */
            T[] subset = new T[subsetSize];
            for(int j = 0; j < subsetSize; j++)
                subset[j] = set[indices[j]];

            /* Add the subset to the return */
            rtn[subsetIndex++] = subset;

            /*
             * Set up indices for next subset.  This looks a lot
             * messier than it is.  It simply increments the
             * right-most index until it overflows, then carries
             * over left as far as it needs to.  I've made the
             * logic as fast as I could, which is why it's
             * hairy-looking.  Note that the inner for loop won't
             * actually run as long as a carry isn't required,
             * and will run at most once in any case.  The outer
             * loop will go through as few iterations as required.
             *
             * You may notice that this logic doesn't check the
             * end case (when the left-most digit overflows).  It
             * doesn't need to, since the loop up above won't
             * execute again in that case, anyway.  There's no
             * reason to waste time checking that here.
             */
            for(int j = subsetSize - 1; j >= 0; j--)
                if(++indices[j] <= setSize - subsetSize + j){
                    for(int k = j + 1; k < subsetSize; k++)
                        indices[k] = indices[k - 1] + 1;
                    break;
                }
        }
    }
    return rtn;
}

static int Combination(int n, int r){
    if(r == 0 || r == n)
        return 1;

    /*
     * The formula for combination is:
     *
     *       n!
     *   ----------
     *   r!(n - r)!
     *
     * We'll actually use a slightly modified version here.  The above
     * formula forces us to calculate (n - r)! twice.  Instead, we only
     * multiply for the numerator the factors of n! that aren't canceled
     * out by (n - r)! in the denominator.
     */

    /*
     * nCr == nC(n - r)
     * We can use this fact to reduce the number of multiplications we
     * perform, as well as the incidence of overflow, where r > n / 2
     */
    if(r > n / 2) /* We DO want integer truncation here (7 / 2 = 3) */
        r = n - r;

    /*
     * I originally used all integer math below, with some complicated
     * logic and another function to handle cases where the intermediate
     * results overflowed a 32-bit int.  It was pretty ugly.  In later
     * testing, I found that the more generalized double-precision
     * floating-point approach was actually *faster*, so there was no
     * need for the ugly code.  But if you want to see a giant WTF, look
     * at the edit history for this post!
     */

    double denominator = Factorial(r);
    double numerator   = n;
    while(--r > 0)
        numerator *= --n;

    return (int)(numerator / denominator + 0.1/* Deal with rounding errors. */);
}

/*
 * The archetypical factorial implementation is recursive, and is perhaps
 * the most often used demonstration of recursion in text books and other
 * materials.  It's unfortunate, however, that few texts point out that
 * it's nearly as simple to write an iterative factorial function that
 * will perform better (although tail-end recursion, if implemented by
 * the compiler, will help to close the gap).
 */
static double Factorial(int x){
    /*
     * An all-purpose factorial function would handle negative numbers
     * correctly - the result should be Sign(x) * Factorial(Abs(x)) -
     * but since we don't need that functionality, we're better off
     * saving the few extra clock cycles it would take.
     */

    /*
     * I originally used all integer math below, but found that the
     * double-precision floating-point version is not only more
     * general, but also *faster*!
     */

    if(x < 2)
        return 1;

    double rtn = x;
    while(--x > 1)
        rtn *= x;

    return rtn;
}

using System;
using System.Collections.Generic;

static class PowerSet<T>
{
    static long[] mask = { 1L << 0, 1L << 1, 1L << 2, 1L << 3, 
                           1L << 4, 1L << 5, 1L << 6, 1L << 7, 
                           1L << 8, 1L << 9, 1L << 10, 1L << 11, 
                           1L << 12, 1L << 13, 1L << 14, 1L << 15, 
                           1L << 16, 1L << 17, 1L << 18, 1L << 19, 
                           1L << 20, 1L << 21, 1L << 22, 1L << 23, 
                           1L << 24, 1L << 25, 1L << 26, 1L << 27, 
                           1L << 28, 1L << 29, 1L << 30, 1L << 31};
    static public IEnumerable<IList<T>> powerset(T[] currentGroupList)
    {
        int count = currentGroupList.Length;
        long max = 1L << count;
        for (long iter = 0; iter < max; iter++)
        {
            // Note: the array is always `count` long, so slots past the
            // subset's real members keep default(T).
            T[] list = new T[count];
            int k = 0, m = -1;
            // i &= (i - 1) clears the lowest set bit on each pass.
            for (long i = iter; i != 0; i &= (i - 1))
            {
                // Advance m to the position of the next set bit.
                while ((mask[++m] & i) == 0)
                    ;
                list[k++] = currentGroupList[m];
            }
            yield return list;
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        int[] intList = { 1, 2, 3, 4 };
        foreach (IList<int> set in PowerSet<int>.powerset(intList))
        {
            foreach (int i in set)
                Console.Write("{0} ", i);
            Console.WriteLine();
        }
    }
}


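On my machine, this code is slightly slower than the OP's code until the set has 16 or more elements. However, all of the times up to 16 elements are under 0.15 seconds. At 23 elements, it runs in 64% of the time. The original algorithm does not run on my machine for 24 or more elements: it runs out of memory.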

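This code takes 12 seconds to generate the power set for the numbers 1 to 24, omitting screen I/O time. That's about 16 million sets in 12 seconds, or roughly 1400K per second. For a billion (which is the figure you quoted earlier), that would be about 760 seconds. How long do you think this should take?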

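Does it have to be C, or is C++ an option too? If C++ is fine, you can just use the list type from the STL. Otherwise, you'll have to implement your own list; look up linked lists or dynamically sized arrays for pointers on how to do that.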
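For what it's worth, a direct C++ port of the method in the question might look roughly like the sketch below. It uses std::vector as a stand-in for List<int>, which is my assumption about the container choice, and it keeps the question's `== 0` bit test so each subset holds the elements whose bit is clear. As others have said, a straight port like this is not guaranteed to be any faster than the C#.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Direct port of the question's loop, with std::vector in place of List<int>.
// Assumes group.size() is small; memory runs out long before the shift overflows.
std::vector<std::vector<int>> powerset(const std::vector<int>& group) {
    const std::size_t count = group.size();
    const std::size_t max = std::size_t{1} << count;  // 2^count subsets

    std::vector<std::vector<int>> output;
    output.reserve(max);  // known size up front, so no regrowth of the outer vector

    for (std::size_t i = 0; i < max; ++i) {
        std::vector<int> current;
        for (std::size_t j = 0; j < count; ++j) {
            // Same test as the C# original: include element j when bit j is clear.
            if ((i & (std::size_t{1} << j)) == 0) current.push_back(group[j]);
        }
        output.push_back(std::move(current));
    }
    return output;
}
```

Note that the result is still a vector of separately allocated vectors, so it has the same scattered-memory problem as the list of lists when it comes to handing the data back to C#.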

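    static long Combination(long n, long r)
    {
        // nCr == nC(n - r), so use the smaller of the two.
        r = (r > n - r) ? (n - r) : r;
        if (r == 0)
            return 1;
        long result = 1;
        long k = 1;
        while (r-- > 0)
        {
            // Multiply before dividing so each intermediate result stays integral.
            result *= n--;
            result /= k++;
        }

        return result;
    }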
