Question

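I am writing a function that finds the most common alphabetic character in a file. The function should ignore every character that is not a letter.

This is what I have so far: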

int most_common(const char *filename)
{
char frequency[26];
int ch = 0;

FILE *fileHandle;
if((fileHandle = fopen(filename, "r")) == NULL){
    return -1;
}

for (ch = 0; ch < 26; ch++)
    frequency[ch] = 0;

while(1){
    ch = fgetc(fileHandle);
    if (ch == EOF) break;

    if ('a' <= ch && ch  <= 'z')
        frequency[ch - 'a']++;
    else if ('A' <= ch && ch <= 'Z')
        frequency[ch - 'A']++;
}

int max = 0;
for (int i = 1; i < 26; ++i)
  if (frequency[i] > frequency[max])
      max = i;

return max;
}

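Right now the function returns the number of times the most frequent letter occurs, not the character itself. I'm a bit lost, because I'm not sure what this function should look like. Does it make sense, and how could I fix it?

I would really appreciate your help.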

Answer 1

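The frequency array is indexed by the letter's position in the alphabet (its character code minus 'a' or 'A'). So frequency[0] is 5 if the file contains five 'a's.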
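To make the indexing concrete, here is a minimal sketch of that mapping. The helper name letter_index is hypothetical (it is not in the question's code), and it assumes an ASCII-compatible character set where the letters are contiguous:

```c
/* Hypothetical helper: maps a letter to its slot in a 26-element table.
   Returns 0 for 'a' or 'A', 25 for 'z' or 'Z', and -1 for any non-letter.
   Assumes an ASCII-compatible character set. */
static int letter_index(int ch)
{
    if ('a' <= ch && ch <= 'z')
    {
        return ch - 'a';
    }
    if ('A' <= ch && ch <= 'Z')
    {
        return ch - 'A';
    }
    return -1;
}
```

Going the other way, adding 'A' (or 'a') to an index turns it back into a character.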

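In your code you assign the count to max rather than the character code, so you return the count instead of the actual character.

You need to store both the maximum frequency count and the character code it belongs to.

I would fix it like this: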

int maxCount = 0;
int maxChar = 0;
// i = A to Z
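for (int i = 0; i < 26; ++i)
{
  // if freq of this char is greater than the previous max freq
  if (frequency[i] > maxCount)
  {
      // store the value of the max freq
      maxCount = frequency[i];

      // store the char that had the max freq
      maxChar = i;
  }
}

// character codes are zero-based alphabet.
// Add ASCII value of 'A' to turn back into a char code.
return maxChar + 'A';

Note that I changed int i = 1 to int i = 0. Starting from 1 means starting from B, which is a subtle bug you might not notice. The loop must also stop at i < 26, not i <= 26: frequency has 26 elements with indices 0 to 25, so index 25 is already Z, and reading frequency[26] would run past the end of the array.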

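Mind the braces. Your brace style (no braces around single-statement blocks) is very strongly discouraged.

Also, i++ is more common than ++i in a case like this. It makes no difference here, so I'd suggest i++.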
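Putting the pieces together, the whole function might look roughly like the sketch below. A few things in it are my own assumptions rather than part of the question: the counters are long instead of char (a char counter can overflow after as few as 127 occurrences), the file is closed before returning, and the function returns 0 when the file contains no letters at all.

```c
#include <stdio.h>

/* Sketch: returns the most common letter in the file as an uppercase
   character, -1 if the file cannot be opened, or 0 if the file contains
   no letters (the 0 case is an assumption, not part of the question). */
int most_common(const char *filename)
{
    /* long instead of char so large counts do not overflow (assumption) */
    long frequency[26] = {0};
    int ch;

    FILE *fileHandle = fopen(filename, "r");
    if (fileHandle == NULL)
    {
        return -1;
    }

    /* count letters only, treating upper and lower case as the same */
    while ((ch = fgetc(fileHandle)) != EOF)
    {
        if ('a' <= ch && ch <= 'z')
        {
            frequency[ch - 'a']++;
        }
        else if ('A' <= ch && ch <= 'Z')
        {
            frequency[ch - 'A']++;
        }
    }
    fclose(fileHandle);

    /* track both the highest count and the index it belongs to */
    long maxCount = 0;
    int maxChar = -1;
    for (int i = 0; i < 26; i++)
    {
        if (frequency[i] > maxCount)
        {
            maxCount = frequency[i];
            maxChar = i;
        }
    }

    /* turn the zero-based index back into a character code */
    return (maxChar < 0) ? 0 : maxChar + 'A';
}
```

On a tie, the strict > comparison keeps the letter that comes first in the alphabet.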
