Question

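This problem has been plaguing me for a while. I need a non-recursive algorithm in C that generates every string of a given length over a character set, with repeated characters allowed. For example, if the character set has 26 characters and the string length is 2, there are 26^2 such strings.

Please note that order matters here: aab is not the same as baa or aba. I searched S.O., and most of the solutions I found do not treat those as different strings. Also, I do not need permutations.

The algorithm can't rely on a library. I intend to translate this C code to cuda, where the standard C library doesn't work (at least not efficiently).

Before I show you what I started with, let me explain one aspect of the program. It is multithreaded on a GPU, so I initialize the starting string with a few characters, aa in this case. To create a combination, I append one or more characters depending on the desired length.

Here's one method I have tried: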

int main(void){

   //Declarations
   char final[12] = {0};
   char b[3] = "aa";
   char charSet[27] = "abcdefghijklmnopqrstuvwxyz"; 
   int max = 4; //Set for demonstration purposes
   int ul = 1;
   int k,i;

   //This program is multithreaded on a GPU. Each thread is initialized 
   //to a starting value for the string. In this case, it is aa

   //Set final with a starting prefix
   int pref = strlen(b);
   memcpy(final, b, pref+1);

   //Determine the number of non-distinct combinations
   for(int j = 0; j < max; j++) ul *= strlen(charSet);

   //Start concatenating characters to the current character string
   for(k = 0; k < ul; k++)
   {
        //charSet[k] is only a valid index while k < strlen(charSet),
        //so this handles just one appended character
        final[pref] = charSet[k];
        //Do some work with the string

   }
   ...

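Obviously, this program does nothing useful, except in the case where I append only one character from charSet.

My professor suggested that I try using a mapping (this isn't homework; I asked him about possible ways to generate distinct combinations without recursion).

His suggestion is similar to what I started above. Using the computed number of combinations, he suggested decomposing the combination index with mod 10. However, I realized it wouldn't work.

For example, say I need to append two characters. With the character set above, that gives me 676 combinations. If I am on the 523rd combination, the decomposition he demonstrated would yield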

523 % 10 = 3
52 % 10 = 2
5 % 10 = 5

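Obviously this doesn't work. For one, it yields three characters where I need two; for another, if my character set is larger than 10 characters, the mapping ignores every character above index 9.

Nevertheless, I believe a mapping is key to the solution.

The other method I explored uses for loops: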

//Pseudocode
c = charset;

for(i = 0; i < length(charset); i++){
    concat c[i] to string

    for(j = 0; j < length(charset); j++){
          concat c[j] to string

          for...

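However, this hardcodes the length of the string I want to compute. I could hack around that with an if statement that jumps out of the loops early, but I would like to avoid that approach.

Any constructive input is appreciated.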

Answer 1

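Given a string, find the next possible string in the sequence:

1. Find the last character in the string that is not the last character of the alphabet.
2. Replace it with the next character in the alphabet.
3. Change every character to the right of that character to the first character of the alphabet.

Start with a string that is a repetition of the first character of the alphabet. When step 1 fails (because every character in the string is the last character of the alphabet), you're done.

Example: the alphabet is "ajxz".

Start with aaaa.

First iteration: the rightmost character that isn't z is the last one. Change it to the next character: aaaj

Second iteration: ditto. aaax

Third iteration: again. aaaz

Fourth iteration: now the rightmost non-z character is the second to last one. Advance it and change all characters to its right to a: aaja

And so on.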
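The three steps above could be sketched in plain C roughly as follows. This is only an illustration, not code from the answer: the name next_string and its parameters are made up here, and it assumes every character of the string occurs in the alphabet (a character that is not found is treated like the last character of the alphabet). It uses no library calls, in keeping with the question's constraint.

```c
/* Hypothetical helper: advance s (len characters) to the next string over
   `alphabet` (n characters). Returns 1 on success, or 0 when every
   character is already the last one of the alphabet (s is left unchanged). */
static int next_string(char *s, int len, const char *alphabet, int n)
{
    /* Step 1: scan from the right for a character that can still advance. */
    for (int pos = len - 1; pos >= 0; pos--) {
        int idx = 0;
        while (idx < n && alphabet[idx] != s[pos])
            idx++;                          /* position of s[pos] in alphabet */

        if (idx < n - 1) {
            s[pos] = alphabet[idx + 1];     /* Step 2: next character */
            for (int r = pos + 1; r < len; r++)
                s[r] = alphabet[0];         /* Step 3: reset everything to the right */
            return 1;
        }
    }
    return 0;                               /* Step 1 failed: sequence is finished */
}
```

Starting from "aaaa" with the alphabet "ajxz", repeated calls give aaaj, aaax, aaaz, aaja and so on; the call made on zzzz returns 0, after 4^4 = 256 strings in total.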

Answer 2

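First, thank you all for your input; it was helpful. Since I am translating this algorithm to cuda, I need it to be as efficient as possible on a GPU. The methods proposed certainly work, but they are not necessarily optimal for the GPU architecture. I came up with a different solution using modular arithmetic that takes advantage of the base of my character set. Here's an example program, primarily in C with some C++ mixed in for output, and it's fairly fast.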

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <iostream>

using namespace std;
typedef unsigned long long ull;

int main(void){

   //Declarations
   int init = 2;
   char final[12] = {'a', 'a'};

   char charSet[27] = "abcdefghijklmnopqrstuvwxyz"; 
   ull max = 2; //Modify as need be


   int base = strlen(charSet);
   int placeHolder; //Maps to character in charset (result of %)
   ull quotient;  //Quotient after division by base

   ull nComb = 1; //Number of combinations, computed below

   int c = 0;
   ull i,j;


   //Compute the number of distinct combinations ((size of charset)^length)
   for(j = 0; j < max; j++) nComb *= base;


   //Begin computing combinations
   for(i = 0; i < nComb; i++){
       quotient = i;

      for(j = 0; j < max; j++){ //No need to check whether the quotient is zero
             placeHolder = quotient % base;
             final[init+j] = charSet[placeHolder]; //Copy the indicated character
             quotient /= base; //Divide the number by its base to calculate the next character
      }

      c++; //Count the combination just built
      //Print combinations
      cout << final << "\n";
  }
  cout << "\n\n" << c << " combinations calculated";
  getchar();
}
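For the GPU port, the inner loop of the program above can be pulled out into a small library-free function. The sketch below is an illustration only: index_to_suffix and its parameters are hypothetical names, and the idea that each thread would pass in its own index is an assumption about how the cuda version might be organized, not something stated in this answer. It writes the digits least significant first, the same order as the loop above.

```c
/* Hypothetical helper: decode `index` into `len` characters taken from
   `charset` (which has `base` characters), least significant digit first,
   and terminate the result. `out` must have room for len + 1 chars. */
static void index_to_suffix(unsigned long long index,
                            const char *charset, unsigned base,
                            char *out, int len)
{
    for (int j = 0; j < len; j++) {
        out[j] = charset[index % base];  /* remainder selects the character */
        index /= base;                   /* move on to the next digit */
    }
    out[len] = '\0';
}
```

With the 26-letter character set and len = 2, index 523 decodes as 523 % 26 = 3 ('d'), then 523 / 26 = 20 and 20 % 26 = 20 ('u'), giving "du". Index 0 gives "aa" and index 675 gives "zz", so all 676 two-character suffixes are covered.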

Non-recursive combination algorithm to generate distinct character strings
