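Splitting long sentences read from a text file

Date: 2018-08-08 15:33:30

Tags: c text

I want to split the long sentences in a text into shorter ones at an arbitrary cutoff point. My approach counts words by counting whitespace. Given an input file input.txt with the following content: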

ciao
ciao ciao
ciao ciao ciao ciao ciao ciao
ciao ciao ciao ciao
ciao ciao ciao

I would like to get:

ciao
ciao ciao
ciao ciao ciao 
ciao ciao ciao
ciao ciao ciao 
ciao
ciao ciao ciao

with a cutoff of 3 words.

I tried to solve this with the following code:

#include<stdlib.h>
#include<stdio.h>
#include<ctype.h>                                      

/* MAIN */

int main(int argc, char *argv[]){

        FILE *inp = fopen(argv[1], "r");
        char c;
        int word_counter = 0;

        while((c = fgetc(inp)) != EOF){

                printf("%c", c);

                if(isspace(c))
                        ++word_counter;
                /* Cutter */
                if(word_counter == 3){
                        printf("\n");
                        word_counter = 0;  /* counter to zero */
                } 
        }

        return 0;
}

This produces the following output:

ciao

ciao  ciao

ciao  ciao  ciao

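I cannot understand the reason for this behavior. Shouldn't the code print an extra newline only when the condition is met? Why does it skip entire sentences?

1 answer:

Answer 0 (score: 1)

You need to reset word_counter to zero after reading a newline. Because isspace(c) is also true for '\n', every line break otherwise counts toward the cutoff and the count carries over into the next line.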
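The carry-over can be reproduced without a file. The following is only a sketch: replay_loop is a hypothetical helper (not from the original post) that replays the question's loop over a string and writes into a caller-supplied buffer, and the reset_on_newline flag is an assumption added to switch the suggested reset on and off. It assumes cutoff is at least 1.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replays the question's loop on a string.
 * Every character is copied first, then whitespace is counted, and an
 * extra '\n' is appended when the counter reaches `cutoff`.
 * If reset_on_newline is non-zero, a '\n' in the input resets the
 * counter instead of incrementing it.
 * Returns the number of characters written (excluding the terminator). */
size_t replay_loop(const char *src, char *dst, size_t dst_size,
                   int cutoff, int reset_on_newline)
{
    size_t n = 0;
    size_t len = strlen(src);
    int word_counter = 0;

    if (dst_size == 0)
        return 0;

    /* Keep room for one copied character, one extra '\n' and the '\0'. */
    for (size_t i = 0; i < len && n + 2 < dst_size; ++i) {
        unsigned char c = (unsigned char)src[i];

        dst[n++] = (char)c;

        if (reset_on_newline && c == '\n')
            word_counter = 0;          /* a new line starts a new count */
        else if (isspace(c))
            ++word_counter;            /* '\n' lands here without the reset */

        if (word_counter == cutoff) {
            dst[n++] = '\n';           /* the "cutter" from the question */
            word_counter = 0;
        }
    }
    dst[n] = '\0';
    return n;
}
```

With cutoff 3 and reset_on_newline set to 0, the input "ciao\nciao ciao\n" comes back as "ciao\nciao ciao\n\n": the two line breaks and the single space add up to three, so a spurious blank line appears. With reset_on_newline set to 1 the same input comes back unchanged.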

Also, each c is printed twice whenever word_counter != 3:

printf("%c", c);  // ** here

if(isspace(c))
        ++word_counter;
/* Cutter */
if(word_counter == 3){
        printf("\n");
        word_counter = 0;
}
else
        printf("%c", c);  // ** and here

Maybe try this instead (untested):

while((c = fgetc(inp)) != EOF){

    if (isspace(c) && ++word_counter == 3 ) {
            printf("\n");
            word_counter = 0;  /* counter to zero */
            continue;
    } 
    if (c == '\n') {
        word_counter = 0;
    }
    printf("%c", c);
}

Or, shorter:

while((c = fgetc(inp)) != EOF){

    if ( (isspace(c) && ++word_counter == 3) || (c == '\n') ) {
            printf("\n");
            word_counter = 0;  /* counter to zero */
            continue;
    } 
    printf("%c", c);
}

Also keep in mind that isspace(c) returns true if c == '\n', so a more robust version that also handles \r\n would be:

while((c = fgetc(inp)) != EOF){

    if ( (c == ' ' || c == '\t') && (++word_counter == 3) ) {
        word_counter = 0;
        printf("\n");
        continue;
    }
    if ( c == '\r' || c == '\n' ) {
        word_counter = 0;
    }
    printf("%c", c);
}
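
For reference, the last loop can be wrapped into a function that is easy to try on any stream. This is only a sketch under a few assumptions: the name split_stream, the cutoff parameter and the return convention are illustrative and not part of the answer. It also declares c as int rather than char, since fgetc returns an int and EOF may not be representable in a char.

```c
#include <stdio.h>

/* Hypothetical wrapper around the loop above (names are illustrative).
 * Copies `in` to `out`, replacing every `cutoff`-th space or tab on a
 * line with a newline. The counter restarts at each '\r' or '\n'.
 * Returns 0 on success, -1 on a read or write error. */
int split_stream(FILE *in, FILE *out, int cutoff)
{
    int c;                 /* int, so that EOF stays distinguishable */
    int word_counter = 0;

    while ((c = fgetc(in)) != EOF) {
        if ((c == ' ' || c == '\t') && ++word_counter == cutoff) {
            word_counter = 0;
            c = '\n';      /* emit the cut instead of the blank */
        } else if (c == '\r' || c == '\n') {
            word_counter = 0;
        }
        if (fputc(c, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

From main, this could be called as split_stream(inp, stdout, 3) after checking that fopen did not return NULL. On the sample input.txt from the question it should yield the seven lines the question asks for, without the trailing blanks.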