Question

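I have a case where a Tcl script runs a process that calls fork(), leaves the forked process running, and then the main process exits. You can try it by running any program that forks into the background, for example gvim, provided it is configured to go into the background after starting: set res [exec gvim].

In theory the main process exits immediately and the child process keeps running in the background, but somehow the main process hangs, does not exit, and stays in a zombie state (reported as <defunct> in the ps output).

In my case the process I start prints something. I want that output, I want the process to exit, and I want to be able to state that it is done. The problem is that if I spawn the process with open "|gvim" r, I cannot recognize the moment the process finishes either. The fd returned by [open] never reports [eof], even when the program has turned into a zombie. When I try to [read], just to get everything the process might print, it hangs completely.

More interesting still, the main process and the forked process both occasionally print something, and when I read it with [gets] I get the output of both. If I close the descriptor too early, [close] throws an exception because of a broken pipe. That is probably why [read] never finishes.

I need some way to recognize the moment the main process exits. That process may have spawned another child, but the child may be completely detached and I am not interested in what it does. I want whatever the main process prints before exiting, and then the script should carry on while the background process keeps running; I do not care what happens to it.

I have control over the sources of the processes I am starting. Yes, I did call signal(SIGCLD, SIG_IGN) before fork(); it did not help.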

Answer 1

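Your daemon can also call setsid() and setpgrp() to start a new session and detach from the process group, but these will not help with your problem either.

You will have to do some process management yourself, as in the waitpid procedure of the Tcl test script below.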

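Edit: another unreasonable answer.

If the forked child process closes the channels it inherited (stdout and stderr), Tcl will not wait for it.

Test Tcl script: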

#!/usr/bin/tclsh

proc waitpid {pid} {
  set rc [catch {exec -- kill -0 $pid}]
  while { $rc == 0 } {
    set ::waitflag 0
    after 100 [list set ::waitflag 1]
    vwait ::waitflag
    set rc [catch {exec -- kill -0 $pid}]
  }
}

set pid [exec ./t1 &]
waitpid $pid
puts "exit tcl"
exit

Test C program (compiled as ./t1):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

int
main (int argc, char *argv [])
{
  int   pid;
  FILE  *o;

  signal (SIGCHLD, SIG_IGN);
  pid = fork ();
  if (pid == 0) {
    /* should also call setsid() and setpgrp() to daemonize */
    printf ("child\n");
    fclose (stdout);
    fclose (stderr);
    sleep (10);
    o = fopen ("/dev/tty", "w");
    fprintf (o, "child exit\n");
    fclose (o);
  } else {
    printf ("parent\n");
    sleep (2);
  }
  printf ("t1 exit %d\n", pid);
  return 0;
}
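
As a variation on the test program above, the child can point its standard descriptors at /dev/null instead of closing them, so that a later printf() does not write to a closed stream. This is only a sketch: detach_stdio is a hypothetical helper name that does not appear in the original answer, and it assumes a POSIX system where /dev/null exists.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of the original answer): redirect stdin,
 * stdout and stderr to /dev/null so that the calling process no longer
 * holds the pipe it inherited from Tcl.
 * Returns 0 on success, -1 on failure. */
int detach_stdio(void)
{
    int fd;

    /* Push out anything still buffered before the descriptors change. */
    fflush(NULL);

    fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;

    /* dup2() closes the old target descriptor (the inherited pipe end)
     * before making it refer to /dev/null. */
    if (dup2(fd, STDIN_FILENO) < 0 ||
        dup2(fd, STDOUT_FILENO) < 0 ||
        dup2(fd, STDERR_FILENO) < 0) {
        if (fd > STDERR_FILENO)
            close(fd);
        return -1;
    }

    /* Drop the extra descriptor unless it is one of the standard three. */
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

In the test program it would be called in the child branch after printf ("child\n"), in place of the two fclose() calls.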

Answer 2

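Tcl cleans up zombies left by background processes the next time exec is called. Since a zombie uses hardly any resources (just an entry in the process table; nothing else really remains), there is no particular hurry to clean them up.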
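
"Cleaning up" a zombie means collecting its exit status with one of the wait calls. The sketch below shows that mechanism in C with a non-blocking waitpid() loop. It illustrates the general POSIX technique only; it is not Tcl's actual implementation, and reap_zombies is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: collect every child that has already exited,
 * without blocking. Returns the number of children reaped. */
int reap_zombies(void)
{
    int reaped = 0;
    int status;

    /* With WNOHANG, waitpid() returns 0 when children exist but none has
     * exited yet, and -1 when there are no children left at all. */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;

    return reaped;
}
```

Until something like this runs in the parent, an exited child stays in the process table as <defunct>.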

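Your problem with the pipeline is that you have not put it in non-blocking mode. To detect that the pipeline has exited, you are best off using a fileevent, which fires either when there is a byte (or more) to read from the pipe or when the other end of the pipe has been closed. To tell these cases apart you have to actually try to read, and that can block if you over-read while not in non-blocking mode. Fortunately, Tcl makes non-blocking mode easy to use.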

set pipeline [open |… "r"]
fileevent $pipeline readable [list handlePipeReadable $pipeline]
fconfigure $pipeline -blocking false

proc handlePipeReadable {pipe} {
    if {[gets $pipe line] >= 0} {
        # Managed to actually read a line; stored in $line now
    } elseif {[eof $pipe]} {
        # Pipeline was closed; get exit code, etc.
        # Switch back to blocking mode so that [close] waits for the
        # pipeline's processes and reports their exit status.
        fconfigure $pipe -blocking true
        if {[catch {close $pipe} msg opt]} {
            set exitinfo [dict get $opt -errorcode]
        } else {
            # Successful termination
            set exitinfo ""
        }
        # Stop the waiting in [vwait], below
        set ::donepipe $pipe
    } else {
        # Partial read; things will be properly buffered up for now...
    }
}

vwait ::donepipe

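Note that using gvim in a pipeline is rather more complicated than usual, because it is an application that users interact with.

If your version of Tcl is thread-enabled and has the Thread package installed, you might find it easier to run a simple exec in a separate thread. (That ought to be the case if you are using 8.6, but I do not know whether it is true for you.)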

package require Thread

set runner [thread::create {
    proc run {caller targetVariable args} {
        set res [catch {
            exec {*}$args
        } msg opt]
        set callback [list set $targetVariable [list $res $msg $opt]]
        thread::send -async $caller $callback
    }
    thread::wait
}]

proc runInBackground {completionVariable args} {
    global runner
    thread::send -async $runner [list run [thread::id] $completionVariable {*}$args]
}

runInBackground resultsVar gvim …
# You can do other things at this point

# Wait until the variable is set (by callback); alternatively, use a variable trace
vwait resultsVar

# Process the results to extract the sense
lassign $resultsVar res msg opt
puts "code: $res"
puts "output: $msg"
puts "status dictionary: $opt"

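Nonetheless, for an editor like gvim I would actually expect it to be run in the foreground (which needs nothing like this much complexity), since only one such program can really interact with a particular terminal at a time.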

Answer 3

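First you say:

I need some way to recognize the moment the main process exits. That process may have spawned another child, but the child may be completely detached and I am not interested in what it does.

Later you say:

If the forked child process closes the channels it inherited, Tcl will not wait for it.

These are two contradictory statements. On the one hand you are only interested in the parent process; on the other you care whether the child has finished, even though you also state that you are not interested in child processes that have detached. Last I heard, forking and closing the child's copies of the parent's stdin, stdout and stderr is detaching (that is, daemonizing the child program). I wrote this quick program to run the simple C program included above and, as expected, Tcl knows nothing about the child process. I called the compiled version of the program /tmp/compile/chuck. I did not have gvim, so I used emacs, but because emacs does not generate text I wrapped the exec in its own Tcl script and executed that. In both cases the parent process is waited for and eof is detected. When the parent exits, Runner::getData is called and the cleanup is evaluated.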

#!/bin/sh
exec /opt/usr8.6.3/bin/tclsh8.6  "$0" ${1+"$@"}

namespace eval  Runner {
    variable close
    variable watch
    variable lastpid ""
    array set close {}
    array set watch {}


    proc run { program { message "" }  } {
        variable watch
        variable close
        variable lastpid
        if { $message ne "" } {
            set fname "/tmp/[lindex $program 0 ]-[pid].tcl" 
            set out [ open $fname "w" ]
            puts $out "#![info nameofexecutable]"
            puts $out " catch { exec $program } err "
            puts $out "puts \"\$err\n$message\""
            close $out
            file attributes $fname -permissions 00777
            set fd [ open "|$fname " "r" ]
            set close([pid $fd]) "file delete -force $fname "
        } else {
            set fd [ open "|$program" "r" ]
            set close([pid $fd]) "puts \"cleanup\""
        } 
        fconfigure $fd -blocking 0 -buffering none
        fileevent $fd  readable [ list Runner::getData [ pid $fd ] $fd ]
    }

    proc getData { pid chan } {
        variable watch
        variable close
        variable lastpid
        set data [read $chan]
        append watch($pid)  "$data"
        if {[eof $chan]} {
            catch { close $chan }
            eval $close($pid) ; # cleanup
            set lastpid $pid
        }
    }
}
Runner::run /tmp/compile/chuck ""
Runner::run emacs   " Emacs complete"

while { 1 } {
    vwait Runner::lastpid
    set p $Runner::lastpid
    catch { exec ps -ef | grep chuck } output
    puts "program with pid $p just  ended" 
    puts "$Runner::watch($p)"
    puts " processes that match chuck "
    puts "$output" 
}

Output (note that I exited emacs after the child reported that it was exiting):

 [user1@linuxrocks workspace]$ ./test.tcl
 cleanup
 program with pid 27667 just  ended
 child
 parent
 t1 exit 27670
  processes that match chuck 
 avahi      936     1  0  2016 ?        00:04:35 avahi-daemon: running [linuxrocks.local]
 admin    27992     1  0 19:37 pts/0    00:00:00 /tmp/compile/chuck
 admin    28006 27988  0 19:37 pts/0    00:00:00 grep chuck

 child exit
 program with pid 27669 just  ended

  Emacs complete

Answer 4

OK, after a long discussion I found the solution:

https://groups.google.com/forum/#!topic/comp.lang.tcl/rtaTOC95NJ0

The following script demonstrates how to solve the problem:

#!/usr/bin/tclsh 

lassign [chan pipe] input output 
chan configure $input -blocking no -buffering line ;# just for a case :) 

puts "Running $argv..." 
set ret [exec {*}$argv 2>@stderr >@$output] 
puts "Waiting for finished process..." 
set line [gets $input] 
puts "FIRST LINE: $line" 
puts "DONE. PROCESSES:" 
puts [exec ps -ef | grep [lindex $argv 0]] 
puts "EXITING."

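The only remaining problem is that there is still no way to know that the process has exited. However, the next [exec] (in this particular case probably the [exec ps...] command does it) cleans up the zombie. There is no universal method for this; the best you can do on POSIX systems is [exec /bin/true]. In my case it was enough to get the one line that the parent process had to print, after which I can simply "let it go".

Still, it would be nice if [exec] could somehow return the PID of the first process, and if there were a standard [wait] command that could either block until the process exits or check its running state (such a command is currently available in TclX).

Note that [chan pipe] is only available in Tcl 8.6; alternatively, you can use [pipe] from TclX.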

Tcl [exec] process leaves a zombie if the process forks and exits
