Question

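I'm having trouble generating all possible strings over a given character set. Let S be a set of symbols. I need to process every possible sequence of length n over S. For example, if S = {'a','b','+','-'} and n = 4, the algorithm should process the following sequences: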

aaaa
aaab
abab
+aa-
// And all other sequences in the universe

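Currently my algorithm is the inefficient recursive algorithm shown below. I have two questions:

Is there a faster algorithm?
Is there a parallel algorithm that solves this problem?

Current implementation (simplified):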

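#include <vector>
using namespace std;

const vector<char> S = {'a', 'b', '+', '-'};  // the symbol set from the example

// Fills positions [i, input.end()) with every possible choice of symbols from S.
void workhorse(vector<char> &input, vector<char>::iterator i)
{
    if(i==input.end()) {
        // process the input
        return;
    }
    else {
        for( const auto& symbol : S) {
            *i=symbol;
            workhorse(input, i+1);
        }
    }
}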

Answer 1

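Do you need all permutations or all combinations? From your example it looks like combinations (order doesn't matter, symbols can repeat), but in your code it looks like you might be trying to generate permutations (guessing from the function name). Generating combinations is a simple matter of counting in base |S| with n digits, which gives 4^4 = 256 sequences in this case. Permutations are far fewer, 4! = 24, but the recursive algorithm is slightly more complicated if you want to preserve lexicographic order. Either way, these algorithms are fundamental pillars of computer science and are already well covered. Try these other questions: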

Generating all Possible Combinations

Generate list of all possible permutations of a string
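
To make the two counts mentioned above concrete, here is a minimal sketch that computes both. The function names count_sequences and count_permutations are hypothetical, and the symbols are assumed to be distinct; std::next_permutation is the standard library routine for stepping through orderings in lexicographic order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Number of length-`length` sequences over `symbol_count` symbols, repetition allowed.
// Assumes the result fits in 64 bits.
std::uint64_t count_sequences(std::uint64_t symbol_count, unsigned length) {
    std::uint64_t total = 1;
    for (unsigned i = 0; i < length; ++i) total *= symbol_count;
    return total;
}

// Number of orderings of the given symbols, counted by actually enumerating them.
std::uint64_t count_permutations(std::string symbols) {
    std::sort(symbols.begin(), symbols.end());  // start from the smallest ordering
    // Assumption: all symbols are distinct.
    assert(std::adjacent_find(symbols.begin(), symbols.end()) == symbols.end());
    std::uint64_t count = 0;
    do {
        ++count;  // `symbols` holds one permutation here
    } while (std::next_permutation(symbols.begin(), symbols.end()));
    return count;
}
```

For the question's example, count_sequences(4, 4) gives 256 and count_permutations("ab+-") gives 24.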

Answer 2

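Your algorithm already looks quite efficient; you are not wasting any work. The only thing that could be improved slightly is the function-call overhead caused by the recursion. But recursion is good here, because it allows easy parallelization: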

#include <thread>
#include <array>
#include <string>
#include <vector>
using namespace std;

array<char,3> S = {{ 'a', 'b', 'c' }};
const int split_depth = 2;
void workhorse(string& result, int i) {
    if (i == result.size()) {
        // process the input
        return;
    }
    if (i == split_depth) {
        vector<thread> threads;
        for (char symbol : S) {
            result[i] = symbol;
            threads.emplace_back([=] {
                string cpy(result);
                workhorse(cpy, i + 1);
            });
        }
        for (thread& t: threads) t.join();
    } else {
        for (char symbol : S) {
            result[i] = symbol;
            workhorse(result, i + 1);
        }
    }
}

int main() {
    string res(6, 0);
    workhorse(res, 0);
}

Make sure to compile it with C++11 features and threading enabled, for example

$ g++ -O3 -std=c++11 -pthread [file].cpp

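This version of the function enumerates all prefixes of length split_depth sequentially, then spawns one thread per symbol to process each extended prefix further. In total it starts |S|^(split_depth+1) threads (27 in the example above), but because each batch of threads is joined before the next prefix is tried, at most |S| of them run at the same time. You can adjust split_depth and the spawning strategy to match your hardware concurrency.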
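
As a rough illustration of matching the amount of parallel work to the hardware, the sketch below picks a prefix length so that there are at least as many prefixes as hardware threads. The helper names choose_prefix_length and default_target_tasks are hypothetical and not part of the code above; std::thread::hardware_concurrency() is the standard call and may return 0 when the value is unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Smallest prefix length whose number of distinct prefixes (symbol_count^length)
// reaches target_tasks, capped at the full sequence length n.
std::size_t choose_prefix_length(std::size_t symbol_count, std::size_t n,
                                 std::size_t target_tasks) {
    assert(symbol_count > 0);
    std::size_t length = 0;
    std::size_t prefixes = 1;  // one empty prefix at length 0
    while (length < n && prefixes < target_tasks) {
        prefixes *= symbol_count;
        ++length;
    }
    return length;
}

// Number of independent work items to aim for: one per hardware thread.
std::size_t default_target_tasks() {
    unsigned hw = std::thread::hardware_concurrency();
    return hw == 0 ? 1 : hw;  // fall back to 1 if the value is unknown
}
```

With three symbols, sequences of length 6, and a target of 8 tasks, this yields a prefix length of 2 (9 prefixes). How that length maps onto a particular recursion depth depends on where the threads are spawned.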

Answer 3

You can do it iteratively. But it won't be much faster.

Imagine that the characters in the set are digits.

'a' = 0, 'b' = 1, '+' = 2, '-' = 3

Start at 0000 and increment until you reach 3333.

0000, 0001, 0002, 0003, 0010, 0011, and so on...
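
The counting scheme above can be sketched as an odometer-style loop. This is only a minimal sketch: for_each_sequence and its process callback are hypothetical names, and the symbol order in the vector decides which symbol plays the role of digit 0, 1, and so on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Calls process(sequence) once for every length-n sequence over `symbols`,
// in counting order (the rightmost position changes fastest).
template <typename Process>
void for_each_sequence(const std::vector<char>& symbols, std::size_t n, Process process) {
    assert(!symbols.empty());
    std::vector<std::size_t> digits(n, 0);  // the counter, most significant digit first
    std::string current(n, symbols[0]);     // the sequence spelled by the counter
    while (true) {
        process(current);
        if (n == 0) return;  // the empty sequence is the only one
        // Add one to the counter, carrying from the rightmost digit.
        std::size_t pos = n;
        while (pos > 0) {
            --pos;
            if (++digits[pos] < symbols.size()) {
                current[pos] = symbols[digits[pos]];
                break;  // no carry needed, counter is updated
            }
            digits[pos] = 0;  // wrap this digit and carry into the next one
            current[pos] = symbols[0];
            if (pos == 0) return;  // carried out of the top digit: all done
        }
    }
}
```

Called with the symbols {'a', 'b', '+', '-'} and n = 4, the callback sees aaaa, aaab, aaa+, aaa-, aaba, and so on up to ----.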

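This is easy to parallelize. For two threads, let the first one do the work from 0000 to 1333 and the other one from 2000 to 3333. Obviously, this can easily be extended to any number of threads.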
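
A minimal sketch of that range split, assuming the total number of sequences fits in 64 bits. The names decode and process_in_parallel are hypothetical; each thread walks its own contiguous range of counter values and turns each value back into a sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

// Spells `index` as an n-digit number in base symbols.size(), using the symbols as digits.
std::string decode(std::uint64_t index, const std::vector<char>& symbols, std::size_t n) {
    std::string s(n, symbols[0]);
    for (std::size_t pos = n; pos-- > 0; ) {
        s[pos] = symbols[index % symbols.size()];
        index /= symbols.size();
    }
    return s;
}

// Splits the counter range [0, |S|^n) into num_threads contiguous chunks and
// returns how many sequences each thread handled.
std::vector<std::uint64_t> process_in_parallel(const std::vector<char>& symbols,
                                               std::size_t n, unsigned num_threads) {
    assert(!symbols.empty() && num_threads > 0);
    std::uint64_t total = 1;
    for (std::size_t i = 0; i < n; ++i) total *= symbols.size();  // assumes no overflow
    const std::uint64_t chunk = total / num_threads;
    const std::uint64_t extra = total % num_threads;  // first `extra` threads get one more
    std::vector<std::uint64_t> processed(num_threads, 0);
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::uint64_t begin = t * chunk + std::min<std::uint64_t>(t, extra);
        const std::uint64_t end = begin + chunk + (t < extra ? 1 : 0);
        threads.emplace_back([&symbols, &processed, n, t, begin, end] {
            std::uint64_t done = 0;
            for (std::uint64_t i = begin; i < end; ++i) {
                std::string seq = decode(i, symbols, n);
                (void)seq;  // process the sequence here
                ++done;
            }
            processed[t] = done;  // each thread writes only its own slot
        });
    }
    for (std::thread& th : threads) th.join();
    return processed;
}
```

With four symbols, n = 4, and two threads, the first thread covers counter values 0 to 127 (0000 to 1333) and the second covers 128 to 255 (2000 to 3333). Decoding every index from scratch costs O(n) per sequence; decoding only the first index of each range and then incrementing in place would avoid that.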

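There isn't much more to it. If your program is slow, that is because there are so many combinations to go through. The time this code needs to find all combinations depends linearly on the number of combinations that exist, which is |S|^n.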

Parallel algorithm to produce all possible sequences of a set
