Question

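I am working on a simple shell program, a command line interpreter, and I want to read its input from a file line by line, so I used the getline() function. The first pass through the file works correctly. However, when the program reaches the end of the file, instead of terminating it starts reading the file from the beginning again and runs forever. Here is the part of main that relates to getline():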

int main(int argc,char *argv[]){
    int const IN_SIZE = 255;
    char *input = NULL;
    size_t len = IN_SIZE;
    // get file address
    fileAdr = argv[2];

    // open file
    srcFile = fopen(fileAdr, "r");

    if (srcFile == NULL) {
        printf("No such file!\n");
        exit(-1);
    }

    while (getline( &input, &len, srcFile) != -1) {
        strtok(input, "\n");
        printf("%s\n", input);
        // some code that parses input, firstArgs == input
        execSimpleCmd(firstArgs);            
    }
    fclose(srcFile);
}

I use fork() in my program, and that is most probably what causes the problem.

void execSimpleCmd(char **cmdAndArgs) {

    pid_t pid = fork();
    if (pid < 0) {
        // error
        fprintf(stderr, "Fork Failed");
        exit(-1);
    } else if (pid == 0) {
        // child process
        if (execvp(cmdAndArgs[0], cmdAndArgs) < 0) {
            printf("There is no such command!\n");
        }
        exit(0);
    } else {
        // parent process
        wait(NULL);
        return;
    }
}

Also, the program sometimes reads and prints a combination of several lines. For example, if the input file is as follows:

ping
ww    
ls
ls -l
pwd

it prints things such as pwdg, pwdww, and so on. How can I fix this?

Answer 1

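It seems that closing a FILE in some cases seeks the underlying file descriptor back to the position the application had actually read up to, effectively undoing the effect of the read buffering. This matters because the OS-level file descriptors of the parent and the child point to the same open file description, and in particular to the same file offset.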
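The shared file offset can be observed without stdio at all. The following is a minimal sketch, not code from the question: offsetSeenByParent() is a hypothetical helper name, and it assumes a POSIX system and a readable regular file at the given path. The child moves the offset with lseek(), and the parent then sees the new value through its own copy of the descriptor.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: opens `path`, lets a child process move the file
 * offset to `target`, and returns the offset the parent observes afterwards.
 * Returns -1 on error. */
off_t offsetSeenByParent(const char *path, off_t target)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: fd refers to the same open file description as in the parent. */
        lseek(fd, target, SEEK_SET);
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    off_t seen = lseek(fd, 0, SEEK_CUR);  /* query the offset without moving it */
    close(fd);
    return seen;
}
```

Calling offsetSeenByParent("testfile", 4) should return 4, even though only the child called lseek() with that target.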

The POSIX description of fclose() contains this passage:

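[CX] [Option Start] If the file is not already at EOF, and the file is one capable of seeking, the file offset of the underlying open file description shall be set to the file position of the stream if the stream is the active handle to the underlying file description.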

(CX means an extension to the ISO C standard, and exit() of course runs fclose() on all streams.)

I can reproduce the odd behavior with this program (on Debian 9.8):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[]){
    FILE *f;
    if ((f = fopen("testfile", "r")) == NULL) {
        perror("fopen");
        exit(1);
    }

    int right = 0;
    if (argc > 1)
        right = 1;

    char *line = NULL;
    size_t len = 0;
    // first line 
    getline(&line, &len, f);
    printf("%s", line);

    pid_t p = fork();
    if (p == -1) {
        perror("fork");
    } else if (p == 0) {
        if (right)
            _exit(0);  // exit the child 
        else
            exit(0);   // wrong way to exit
    } else {
        wait(NULL);  // parent
    }

    // rest of the lines
    while (getline(&line, &len, f) > 0) {
        printf("%s", line);
    }

    fclose(f);
}

Then:

$ printf 'a\nb\nc\n' > testfile
$ gcc -Wall -o getline getline.c
$ ./getline
a
b
c
b
c

Running it with strace -f ./getline clearly shows the child seeking the file descriptor back:

clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f63794e0710) = 25117
strace: Process 25117 attached
[pid 25116] wait4(-1,  <unfinished ...>
[pid 25117] lseek(3, -4, SEEK_CUR)      = 2
[pid 25117] exit_group(1)               = ?

(I did not see the seek back with code that did not involve forking, but I don't know why.)

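So, what happens is that the C library in the main program reads a block of data from the file, and the application prints the first line. After the fork, the child exits and seeks the fd back to where the application-level file position is. The parent then continues, processes the rest of its read buffer, and when that is used up it goes on reading from the file. Because the file descriptor was seeked back, the lines from the second one onward are available again.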
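The gap between what the application has consumed and what the C library has already read can be made visible directly. This is a rough sketch under the assumption of a POSIX system with a buffered stdio implementation; positionsAfterFirstLine() is a hypothetical helper name, not something from the question.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads one line from `path` with getline() and reports
 * the stream position (what the application consumed) and the descriptor
 * offset (how far the C library has actually read). Returns 0 on success. */
int positionsAfterFirstLine(const char *path, long *streamPos, long *fdPos)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *line = NULL;
    size_t cap = 0;
    if (getline(&line, &cap, f) == -1) {
        free(line);
        fclose(f);
        return -1;
    }

    *fdPos = (long)lseek(fileno(f), 0, SEEK_CUR);  /* offset of the descriptor */
    *streamPos = ftell(f);                         /* position of the stream */

    free(line);
    fclose(f);
    return 0;
}
```

With the three-line testfile from above, the stream position is 2 after the first line. On a buffered implementation the descriptor offset would be expected to be 6, since the whole file fits into one read. That difference of 4 corresponds to the lseek(3, -4, SEEK_CUR) = 2 in the strace output.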

In your case, the repeated fork() on every iteration seems to result in an infinite loop.

Using _exit() instead of exit() in the child fixes the problem in this case, because _exit() only exits the process and does not do any housekeeping with the stdio buffers.
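Applied to the question's execSimpleCmd(), the change could look roughly like the sketch below. The int return value and the exit code 127 for a command that cannot be executed are assumptions made for this example; the original function returns void and exits with 0.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch of the question's execSimpleCmd() with the child leaving via _exit().
 * Returning the exit status and using 127 for a failed exec are assumptions
 * made for this example, not part of the original code. */
int execSimpleCmd(char **cmdAndArgs)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execvp() only returns on failure. */
        execvp(cmdAndArgs[0], cmdAndArgs);
        perror(cmdAndArgs[0]);  /* stderr is normally unbuffered */
        _exit(127);             /* no stdio cleanup, so the shared offset is left alone */
    }

    /* Parent: wait for this particular child and report how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child never runs exit(), it never runs fclose() on its copy of srcFile, and the parent's getline() loop should reach the end of the file normally.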

With _exit(), output buffers are not flushed either, so you will need to call fflush() manually on stdout and on any other files you are writing to.
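The effect of the missing flush can be sketched as follows. childWrite() is a hypothetical helper, not part of the question's code. The sketch assumes that out is a fully buffered stream such as a regular file, and that msg is small enough to fit in the stdio buffer.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes `msg` to `out` through stdio and then
 * leaves via _exit(). With doFlush == 0 the text stays in the child's
 * buffer and is lost. Returns the child's exit status, or -1 on error. */
int childWrite(FILE *out, const char *msg, int doFlush)
{
    fflush(out);  /* the parent should have nothing pending that the child would inherit */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fputs(msg, out);
        if (doFlush)
            fflush(out);  /* hand the data to the kernel before _exit() */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The fflush() in the parent before fork() is a separate precaution: anything still sitting in the parent's buffer at that point would be copied into the child and could end up being written twice.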

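However, if you did this the other way around, with the child reading and buffering more than it processes, then it would be useful for the child to seek the fd back, so that the parent could continue from where the child actually left off.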

Another solution would be not to mix stdio with fork().
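One way to avoid stdio for the input file is to read it through the raw descriptor. The sketch below is only an illustration under stated assumptions: readLineFd() is a hypothetical helper, it reads one byte per read() call (simple but slow), truncates overlong lines, and does not retry on EINTR.

```c
#include <fcntl.h>      /* open(), for the caller */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads one line from `fd` into `buf` (capacity `cap`),
 * one byte per read() call, so no data is buffered in user space.
 * The newline is dropped and the result is NUL-terminated.
 * Returns the line length, or -1 at end of file or on error. */
ssize_t readLineFd(int fd, char *buf, size_t cap)
{
    size_t n = 0;
    char c;
    ssize_t r;

    if (cap == 0)
        return -1;

    while ((r = read(fd, &c, 1)) == 1) {
        if (c == '\n') {
            buf[n] = '\0';
            return (ssize_t)n;
        }
        if (n + 1 < cap)
            buf[n++] = c;  /* characters beyond the capacity are dropped */
    }

    if (r < 0 || n == 0)
        return -1;
    buf[n] = '\0';
    return (ssize_t)n;     /* last line without a trailing newline */
}
```

Since the file offset always sits exactly after the last line that was returned, there is no read-ahead for a forked child to undo, whichever way it exits.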

getline() repeatedly reads the file when using fork()