Question

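I can't think of any way to implement pipelining in C that would actually work, which is why I decided to ask here. I should say that I understand how pipe/fork/mkfifo work, and I've seen plenty of examples that implement 2-3 pipelines. That part is easy. My problem starts when I have to implement a shell and the number of pipelines is not known in advance.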

What I've got so far, for example:

ls -al | tr a-z A-Z | tr A-Z a-z | tr a-z A-Z

I transform a line like that into something like this:

array[0] = {"ls", "-al", NULL}
array[1] = {"tr", "a-z", "A-Z", NULL}
array[2] = {"tr", "A-Z", "a-z", NULL}
array[3] = {"tr", "a-z", "A-Z", NULL}

So I can use

execvp(array[0],array)

later on.
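For readers who want to try the splitting step, here is a minimal sketch of one way to turn such a line into per-command argument arrays. The names parse_pipeline, MAX_STAGES and MAX_ARGS are made up for illustration and are not from the question; the sketch also assumes no quoting or escaping, only `|` and whitespace as separators.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STAGES 16   /* assumed limit, for illustration only */
#define MAX_ARGS   16   /* assumed limit, includes the NULL terminator */

/* Split `line` in place so that stages[i] is the NULL-terminated argv of
   the i-th command. Returns the number of commands, or -1 if a limit is
   exceeded or a segment between two '|' contains no words. */
int parse_pipeline(char *line, char *stages[MAX_STAGES][MAX_ARGS])
{
    int n = 0;
    char *save_cmd = NULL;

    assert(line != NULL);
    for (char *cmd = strtok_r(line, "|", &save_cmd); cmd != NULL;
         cmd = strtok_r(NULL, "|", &save_cmd)) {
        if (n == MAX_STAGES)
            return -1;

        int argc = 0;
        char *save_arg = NULL;
        /* Inner tokenizer uses its own save pointer, so it does not
           disturb the outer split on '|'. */
        for (char *arg = strtok_r(cmd, " \t\n", &save_arg); arg != NULL;
             arg = strtok_r(NULL, " \t\n", &save_arg)) {
            if (argc == MAX_ARGS - 1)
                return -1;
            stages[n][argc++] = arg;
        }
        if (argc == 0)
            return -1;
        stages[n][argc] = NULL;   /* execvp needs a NULL-terminated argv */
        n++;
    }
    return n;
}
```

With a layout like this, the i-th command would be started with execvp(stages[i][0], stages[i]), since execvp expects the program name and that command's own argv rather than the whole table.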

Up to this point, I believe everything is OK. The problem starts when I try to redirect the input/output of these commands to each other.

Here's how I'm doing that:

    mkfifo("queue", 0777);

    for (i = 0; i<= pipelines_count; i++) // eg. if there's 3 pipelines, there's 4 functions to execvp
    {
    int b = fork();             
    if (b == 0) // child
        {           
        int c = fork();

        if (c == 0) 
        // baby (younger than child) 
        // I use c process, to unblock desc_read and desc_writ for b process only
        // nothing executes in here
            {       
            if (i == 0) // 1st pipeline
                {
                int desc_read = open("queue", O_RDONLY);
                // dup2 here, so after closing there's still something that can read from 
                // from desc_read
                dup2(desc_read, 0); 
                close(desc_read);           
                }

            if (i == pipelines_count) // last pipeline
                {
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 0);
                close(desc_write);                              
                }

            if (i > 0 && i < pipelines_count) // pipeline somewhere inside
                {
                int desc_read = open("queue", O_RDONLY);
                int desc_write = open("queue", O_WRONLY);
                dup2(desc_write, 1);
                dup2(desc_read, 0);
                close(desc_write);
                close(desc_read);
                }               
            exit(0); // closing every connection between process c and pipeline             
            }
        else
        // b process here
        // in b process, i execvp commands
        {                       
        if (i == 0) // 1st pipeline (changing stdout only)
            {   
            int desc_write = open("queue", O_WRONLY);               
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            close(desc_write);                  
            }

        if (i == pipelines_count) // last pipeline (changing stdin only)
            {   
            int desc_read = open("queue", O_RDONLY);                                    
            dup2(desc_read, 0); // changing stdin -> pdesc[0]   
            close(desc_read);           
            }

        if (i > 0 && i < pipelines_count) // pipeline somewhere inside
            {               
            int desc_write = open("queue", O_WRONLY);       
            dup2(desc_write, 1); // changing stdout -> pdesc[1]
            int desc_read = open("queue", O_RDONLY);                            
            dup2(desc_read, 0); // changing stdin -> pdesc[0]
            close(desc_write);
            close(desc_read);                               
            }

        wait(NULL); // it wait's until, process c is death                      
        execvp(array[0],array);         
        }
        }
    else // parent (waits for 1 sub command to be finished)
        {       
        wait(NULL);
        }       
    }

Thanks.

Answer 1

Patryk, why are you using a fifo, and the same fifo for every stage of the pipeline?

It seems to me that you need a pipe between each pair of stages. So the flow would be something like this:

Shell             ls               tr                tr
-----             ----             ----              ----
pipe(fds);
fork();  
close(fds[0]);    close(fds[1]);
                  dup2(fds[0],0); 
                  pipe(fds);
                  fork();         
                  close(fds[0]);   close(fds[1]);  
                  dup2(fds[1],1);  dup2(fds[0],0);
                  exec(...);       pipe(fds);
                                   fork();     
                                   close(fds[0]);     etc
                                   dup2(fds[1],1);
                                   exec(...);

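The sequence of calls that runs in each forked shell (close, dup2, pipe, etc.) looks like it could be a function (taking the name and parameters of the desired process). Note that, up until the exec call in each one, what is running is a forked copy of the shell.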
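As a rough illustration of such a function, here is a minimal sketch. The name spawn_stage and its parameters are hypothetical, not part of any library: in_fd and out_fd are the descriptors the new process should use as stdin and stdout, and unused_fd is a pipe end the child has inherited but does not need (pass -1 if there is none).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that reads from in_fd, writes to
   out_fd, closes unused_fd (if >= 0) and then execs argv.
   Returns the child's pid in the parent, or -1 if fork failed. */
pid_t spawn_stage(int in_fd, int out_fd, int unused_fd, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid != 0)
        return pid;               /* parent, or fork error (-1) */

    /* Child: this is still a copy of the shell until execvp succeeds. */
    if (unused_fd >= 0)
        close(unused_fd);
    if (in_fd != STDIN_FILENO) {
        dup2(in_fd, STDIN_FILENO);    /* stdin now refers to in_fd */
        close(in_fd);
    }
    if (out_fd != STDOUT_FILENO) {
        dup2(out_fd, STDOUT_FILENO);  /* stdout now refers to out_fd */
        close(out_fd);
    }
    execvp(argv[0], argv);
    perror(argv[0]);              /* reached only if execvp failed */
    _exit(127);
}
```

After each call the parent should close its own copies of the pipe ends it handed to the child. As long as any process keeps a write end open, the reader on the other side never sees end-of-file.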

Edit:

Patryk：

Also, is my thinking correct? Shall it work like that? (pseudocode): 
start_fork(ls) -> end_fork(ls) -> start_fork(tr) -> end_fork(tr) -> 
start_fork(tr) -> end_fork(tr)

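I'm not sure what you mean by start_fork and end_fork. Are you implying that ls runs to completion before tr starts? That isn't really what the diagram above means. Your shell will not wait for ls to complete before starting tr. It starts all of the processes in the pipeline in sequence, setting up stdin and stdout for each one so that the processes are linked together: stdout of ls to stdin of tr; stdout of tr to stdin of the next tr. That is what the dup2 calls are doing.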

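The order in which the processes run is determined by the operating system (the scheduler), but clearly if tr runs and reads from an empty stdin it has to wait (block) until the preceding process writes something to the pipe. It is quite possible that ls will run to completion before tr even reads from its stdin, but it is equally possible that it won't. For example, if the first command in the chain were something that ran continually and produced output along the way, then the second command in the pipeline would get scheduled from time to time to process whatever the first one sends along the pipe.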
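The blocking behaviour can be seen in a small experiment. This is only a sketch with a made-up function name (blocking_read_demo): the child stands in for tr and the parent for a slow ls that produces its output after a delay.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of bytes the reader child received, or -1 on error. */
int blocking_read_demo(unsigned delay_seconds)
{
    assert(delay_seconds <= 10);  /* keep the demo short */

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {               /* reader, standing in for tr */
        close(fds[1]);
        char buf[64];
        /* Blocks here: the pipe is empty, but a writer still has it open. */
        ssize_t n = read(fds[0], buf, sizeof buf);
        _exit(n < 0 ? 255 : (int)n);
    }

    /* Writer, standing in for ls: output arrives only after a delay. */
    close(fds[0]);
    sleep(delay_seconds);
    const char msg[] = "hello\n";
    if (write(fds[1], msg, sizeof msg - 1) == -1)
        return -1;
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   /* byte count reported by the reader */
}
```

The reader is started first and simply sleeps inside read() until the data shows up, so neither side needs to know when the other one gets scheduled.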

Hope that clarifies things a little :-)

Answer 2

It might be worth using libpipeline. It takes care of all the work for you, and you can even include functions in your pipeline.

Answer 3

The problem is that you're trying to do everything at once. Break it into smaller steps instead.

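1) Parse your input to get ls -al | out of it.

1a) From this you know you need to create a pipe, move it to stdout, and start ls -al. Then move the pipe to stdin. There's more to it, of course, but you don't need to worry about that in code yet.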

2) Parse the next segment to get tr a-z A-Z |. Go back to step 1a for as long as the output of your next-to-spawn command is being piped somewhere.
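The steps above can be sketched as a loop that handles any number of commands. This is one possible reading rather than a definitive implementation: run_pipeline and in_fd are names invented here, the commands are assumed to be already parsed into NULL-terminated argv arrays, and the error paths are kept deliberately thin.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run n commands connected by pipes; stages[i] is a NULL-terminated argv.
   Returns the exit status of the last command, or -1 on error. */
int run_pipeline(char **stages[], int n)
{
    assert(n > 0);
    int in_fd = STDIN_FILENO;     /* stdin for the next command to spawn */
    pid_t last = -1;

    for (int i = 0; i < n; i++) {
        int fds[2] = {-1, -1};
        int piped = (i < n - 1);  /* is this command's output piped on? */
        if (piped && pipe(fds) == -1)
            return -1;

        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {           /* child: wire up stdin/stdout, then exec */
            if (in_fd != STDIN_FILENO) {
                dup2(in_fd, STDIN_FILENO);
                close(in_fd);
            }
            if (piped) {
                close(fds[0]);    /* the read end belongs to the next command */
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
            }
            execvp(stages[i][0], stages[i]);
            perror(stages[i][0]);
            _exit(127);
        }

        /* Parent: drop the ends it no longer needs and "move the pipe to
           stdin" for the next iteration. */
        if (in_fd != STDIN_FILENO)
            close(in_fd);
        if (piped) {
            close(fds[1]);
            in_fd = fds[0];
        }
        last = pid;
    }

    /* Wait only after every command has been started. */
    int status = 0, last_status = -1;
    pid_t pid;
    while ((pid = wait(&status)) > 0)
        if (pid == last)
            last_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return last_status;
}
```

Only one pipe is alive in the parent at any time, which is why the number of commands does not need to be known up front. Waiting is deferred until the loop has finished, so all commands run concurrently.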

Answer 4

Implementing pipelining in C. What would be the best way to do that?

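This question is a bit old, but here is an answer that was never provided: use libpipeline. libpipeline is a pipeline manipulation library. Its use case comes from one of the man page maintainers, who frequently had to run a command like the following (and work around the associated OS bugs):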

zsoelim < input-file | tbl | nroff -mandoc -Tutf8

Here's the libpipeline way:

pipeline *p;
int status;

p = pipeline_new ();
pipeline_want_infile (p, "input-file");
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
status = pipeline_run (p);

There are more examples on the libpipeline homepage. The library is also included in many distributions, including Arch, Debian, Fedora, Linux from Scratch and Ubuntu.
