C word count - handling paragraphs

Time: 2016-02-03 19:46:19

Tags: c

This is my current code:

unsigned long charcount = 0;
unsigned long wordcount = 0;
unsigned long linecount = 0;
int n;

for (; (n = getchar()) != EOF; ++charcount) {
    if (n == '\n')
        ++linecount;
    if (n == ' ' || n == '\n' || n == '\t')
        ++wordcount;
    printf("%lu %lu %lu\n", charcount, wordcount, linecount);
}

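I think there is a problem with this code: if the text file I am reading consists of paragraphs, the newlines that separate them get counted as words. I am not sure how to fix it so that they are not counted as words.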
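To make the problem concrete, here is a minimal sketch that applies the same separator-counting rule to an in-memory string instead of stdin. The function name `count_separators` and the sample text are assumptions made for illustration only; they are not part of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts "words" the same way as the loop above,
   that is, one per space, newline, or tab character. */
unsigned long count_separators(const char *text)
{
    unsigned long wordcount = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text) {
        if (*text == ' ' || *text == '\n' || *text == '\t')
            ++wordcount;
    }
    return wordcount;
}
```

For the two-paragraph text "one two\n\nthree\n", which contains three words, this returns 4: one for the space and one for each of the three newlines, including the blank line between the paragraphs.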

3 answers:

Answer 0 (score: 2)

Use a flag to indicate whether you are currently reading a word or white-space:

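int isWord = 0;             // nonzero while inside a word; isspace() needs <ctype.h>
while ((n = getchar()) != EOF) {
    if (isspace(n)) {
        if (n == '\n') ++linecount;
        if (isWord) {       // a word has just ended
            ++wordcount;
            isWord = 0;
        }
    }
    else {
        isWord = 1;
    }
}
if (isWord)                 // input ended in the middle of a word
    ++wordcount;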
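As a rough, self-contained sketch of the same flag idea, the version below scans a string rather than stdin so that it can be checked on small inputs. The `struct counts` type and the name `count_with_flag` are assumptions introduced here, and the character total is added only for completeness.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical container for the three totals. */
struct counts {
    unsigned long chars;
    unsigned long words;
    unsigned long lines;
};

/* Counts characters, words, and newline characters in a string,
   using an "inside a word" flag to detect where each word ends. */
struct counts count_with_flag(const char *text)
{
    struct counts c = {0, 0, 0};
    int in_word = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text) {
        /* isspace() expects a value representable as unsigned char */
        unsigned char ch = (unsigned char)*text;

        ++c.chars;
        if (isspace(ch)) {
            if (ch == '\n')
                ++c.lines;
            if (in_word) {      /* a word has just ended */
                ++c.words;
                in_word = 0;
            }
        } else {
            in_word = 1;
        }
    }
    if (in_word)                /* text ended in the middle of a word */
        ++c.words;
    return c;
}
```

With "one two\n\nthree\n" this yields 15 characters, 3 words, and 3 newlines, so the blank line between paragraphs no longer adds a word. To read from stdin, the same loop body can be driven by `getchar()` as in the answer.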

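Answer 1 (score: 1)

Consider using these definitions:

Beginning of line: the current character is the first one, or the previous character is '\n'.

Beginning of word: the current character is not white-space, and it is either the first character or the previous character is a separator (white-space).

This approach detects the beginning of lines and words: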

unsigned long charcount = 0;
unsigned long wordcount = 0;
unsigned long linecount = 0;
int previous = '\n';
int n;

while ((n = getchar()) != EOF) {
  ++charcount;
  if (isspace(previous)) {
    if (!isspace(n))  ++wordcount;     // Beginning of word detected
    if (previous == '\n') ++linecount; // Beginning of line detected
  }
  previous = n;
}
printf("%lu %lu %lu\n", charcount, wordcount, linecount);

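This approach works well, including under the following conditions:

  1. Multiple spaces are treated as a single space (separator).

  2. Files that begin with white-space do not throw off the word count.

  3. Files that end with or without white-space do not throw off the word count.

  4. The last line need not end with '\n'.

  5. Zero-length files are not a problem.

  6. No adjustment of the line/word counts is needed after EOF.

  7. No length limit on the word/line/char counts other than ULONG_MAX.

  8. Details on OP's code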

    for (; (n = getchar()) != EOF; ++charcount) {
    
        // This fails to count the last line of a file should it lack a \n
        if (n == '\n')
            ++linecount;
    
        // This counts separator (white-space) occurrence.
        // Multiple spaces count as 2 words: not good
        // Files like "Hello" will count as 0 words: not good
        // Files like " Hello " will count as 2 words: not good
        if (n == ' ' || n == '\n' || n == '\t')
            ++wordcount;
    
        // Using `unsigned long` is good, maybe even `unsigned long long`.
        printf("%lu %lu %lu\n", charcount, wordcount, linecount);
    }
    

    OP reports not getting enough "words". Let us assume that non-letters are valid word separators.

    unsigned long charcount = 0;
    unsigned long wordcount = 0;
    unsigned long linecount = 0;
    int previous = '\n';
    int n;
    
    while ((n = getchar()) != EOF) {
      ++charcount;
      if (!isalpha(previous)) {
        if (previous == '\n') ++linecount; // Beginning of line detected
        if (isalpha(n))  ++wordcount;    // Beginning of word detected
      }
      previous = n;
    }
    printf("%lu %lu %lu\n", charcount, wordcount, linecount);
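The conditions listed above, and the difference between the two separator rules, can be checked with a small sketch like the one below. It runs the previous-character method over a string instead of stdin; the names `struct totals`, `count_by_previous`, and the `letters_only` switch are assumptions introduced for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical container for the three totals. */
struct totals {
    unsigned long chars;
    unsigned long words;
    unsigned long lines;
};

/* letters_only == 0: a word is a run of non-white-space characters.
   letters_only != 0: a word is a run of letters (isalpha). */
struct totals count_by_previous(const char *text, int letters_only)
{
    struct totals t = {0, 0, 0};
    int previous = '\n';        /* pretend a newline precedes the text */

    assert(text != NULL);
    for (; *text != '\0'; ++text) {
        int n = (unsigned char)*text;
        int prev_in_word = letters_only ? isalpha(previous) : !isspace(previous);
        int n_in_word = letters_only ? isalpha(n) : !isspace(n);

        ++t.chars;
        if (previous == '\n')
            ++t.lines;          /* beginning of line detected */
        if (!prev_in_word && n_in_word)
            ++t.words;          /* beginning of word detected */
        previous = n;
    }
    return t;
}
```

For example, "don't stop" counts as 2 words under the white-space rule and as 3 words under the letters-only rule, because the apostrophe then acts as a separator. An empty string gives zero for all three totals, and "Hello" without a trailing newline still counts as 1 word on 1 line.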
    

Answer 2 (score: 0)

Try storing the previous char in a variable and comparing against it.

unsigned long int charcount = 0;
unsigned long int wordcount = 0;
unsigned long int linecount = 0;
int n;
int prev='\n';

while ((n = getchar()) != EOF){
    charcount++;
    if (n == '\n' && prev != '\n'){
        linecount++;
    }
    // Note: this still counts one word per separator character
    if (n == ' ' || n == '\n' || n == '\t') {
        wordcount++;
    }
    prev = n;
}
printf( "%lu %lu %lu\n", charcount, wordcount, linecount );