Question

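I am writing a program that reads a text file and counts how many times each pair of adjacent letters occurs. For example, for a text file containing "aabbaa"

the counts would be aa = 2, ab = 1, bb = 1, ba = 1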

I thought I could use a 2D array like this:

char charPair[25][25] =   {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'};

But this only ever returns a single letter.

Any help would be greatly appreciated!

Answer 1

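Important: if you declare a char array, an entry will overflow as soon as a pair occurs more than 255 times (or more than 127 times where char is signed), so I changed the element type to long.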
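To make the overflow concrete, here is a minimal sketch. The helper names `bump_char_counter` and `bump_long_counter` are made up for this illustration, and the wrap-around point assumes the usual 8-bit char.

```c
#include <limits.h>

/* Hypothetical helper: increment an unsigned char counter `times` times.
   Unsigned arithmetic wraps, so the value goes from UCHAR_MAX back to 0. */
unsigned char bump_char_counter(int times) {
    unsigned char count = 0;
    for (int i = 0; i < times; ++i)
        ++count;
    return count;
}

/* Same loop with a long counter, which has plenty of headroom. */
long bump_long_counter(int times) {
    long count = 0;
    for (int i = 0; i < times; ++i)
        ++count;
    return count;
}
```

With an 8-bit char, `bump_char_counter(256)` comes back as 0 while `bump_long_counter(256)` returns 256, which is why a long (or at least an int) is the safer element type for the table.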

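Also note that your 2D array should have one index for every letter of the alphabet you are using. I will assume that is 26 letters (e.g. lowercase ASCII only):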

long charPair[26][26];
memset(charPair, 0, sizeof charPair); // needs <string.h>
const char* reader = yourInput;       // assumes \0-terminated, lowercase a-z only
if (*reader != '\0') {
    int current = *reader - 'a';      // index of the first letter
    ++reader;
    while (*reader != '\0') {         // test the raw character, not the index
        int next = *reader - 'a';
        charPair[current][next] += 1;
        current = next;
        ++reader;
    }
}

The - 'a' is there so that the letter a maps to row/column 0 and z maps to 25.
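The same idea can be wrapped in a function that also tolerates input outside a-z. This is only a sketch: the name `count_pairs`, the `ALPHABET` macro, and the choice to skip any pair that involves a non-lowercase character are assumptions, not part of the answer above. It also relies on 'a' to 'z' being contiguous, as they are in ASCII.

```c
#include <string.h>

#define ALPHABET 26

/* Count adjacent lowercase-letter pairs in a \0-terminated string.
   counts[i][j] ends up as the number of times 'a'+i is immediately
   followed by 'a'+j. Pairs involving any other character are skipped. */
void count_pairs(const char *text, long counts[ALPHABET][ALPHABET]) {
    memset(counts, 0, sizeof(long) * ALPHABET * ALPHABET);
    int prev = -1; /* index of the previous letter, or -1 if it was not a-z */
    for (const char *p = text; *p != '\0'; ++p) {
        int cur = (*p >= 'a' && *p <= 'z') ? *p - 'a' : -1;
        if (prev >= 0 && cur >= 0)
            ++counts[prev][cur];
        prev = cur;
    }
}
```

Called on "aabbaa", this fills in counts[0][0] = 2 (aa), counts[0][1] = 1 (ab), counts[1][1] = 1 (bb) and counts[1][0] = 1 (ba), and leaves every other entry at 0.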

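Edit: regarding the comment about how best to read the input: the code above assumes the entire input has been read into a single \0-terminated string.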

FILE* f = fopen(filename, "rb"); // (todo: add your error handling if 0 returned)
fseek(f, 0, SEEK_END);
long len = ftell(f); // ftell returns long (todo: add your error handling if -1 returned)
fseek(f, 0, SEEK_SET);
char* yourInput = malloc(len+1); // (todo: add your error handling if 0 returned)
fread(yourInput, 1, len, f); // (todo: add your error handling if <len returned)
yourInput[len] = '\0';
fclose(f);
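If the file is large, or you would rather not hold it in memory, the pairs can also be counted while reading one character at a time. The sketch below is an alternative to the approach above, not a replacement for it; `count_pairs_stream` and `count_pairs_file` are hypothetical names, and skipping non-lowercase characters is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Count adjacent lowercase-letter pairs read from an already open stream. */
void count_pairs_stream(FILE *f, long counts[26][26]) {
    memset(counts, 0, sizeof(long) * 26 * 26);
    int prev = -1; /* index of the previous letter, or -1 if it was not a-z */
    int ch;        /* int, not char, so that EOF can be told apart from data */
    while ((ch = fgetc(f)) != EOF) {
        int cur = (ch >= 'a' && ch <= 'z') ? ch - 'a' : -1;
        if (prev >= 0 && cur >= 0)
            ++counts[prev][cur];
        prev = cur;
    }
}

/* Open `filename`, count its pairs, and close it again.
   Returns 0 on success, -1 if the file could not be opened. */
int count_pairs_file(const char *filename, long counts[26][26]) {
    FILE *f = fopen(filename, "rb");
    if (f == NULL)
        return -1;
    count_pairs_stream(f, counts);
    fclose(f);
    return 0;
}
```

Because only the previous character is remembered, memory use stays constant regardless of file size, and a newline or space simply breaks the chain of pairs.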

Answer 2

This is in C++-ish C; adjust the casts, variable declarations, comments, etc. as needed...

...

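long tCharPairCount[26][26]; // lower-case strings only; long so the counts do not overflow
memset(tCharPairCount, 0, sizeof tCharPairCount);

// tempString is assumed to be a std::string holding the input
char tPrevChar = tempString[0];
for (size_t i = 1; i < tempString.length(); ++i)
{
   char tCurrentChar = tempString[i];
   ++tCharPairCount[tPrevChar-'a'][tCurrentChar-'a'];
   tPrevChar = tCurrentChar;
}

...

// iterate over the results

for (int i = 0; i < 26; ++i)
   for (int j = 0; j < 26; ++j)
      printf("%c%c = %ld\n", 'a' + i, 'a' + j, tCharPairCount[i][j]);  // [0][0] => aa ; [1][0] => ba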
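Printing all 676 entries produces mostly zeros, so it may be more useful to report only the pairs that actually occurred. A minimal sketch in plain C, assuming a table of long counts indexed as above; `print_nonzero_pairs` is a hypothetical helper name.

```c
#include <stdio.h>

/* Write one line per pair that occurred at least once, e.g. "aa = 2".
   Returns the number of lines written. */
int print_nonzero_pairs(FILE *out, long counts[26][26]) {
    int printed = 0;
    for (int i = 0; i < 26; ++i) {
        for (int j = 0; j < 26; ++j) {
            if (counts[i][j] > 0) {
                /* row i is the first letter, column j the second */
                fprintf(out, "%c%c = %ld\n", 'a' + i, 'a' + j, counts[i][j]);
                ++printed;
            }
        }
    }
    return printed;
}
```

For the counts produced by "aabbaa", `print_nonzero_pairs(stdout, counts)` writes four lines in row-major order: aa = 2, ab = 1, ba = 1, bb = 1.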

Pairing characters in C
