Question

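I'm trying to figure out how to tell whether a process's threads are deadlocked on a Unix/Linux machine. Also, is there a command that shows which stage (or state) a process is in? If you know of any tools, please suggest them. Thanks.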

Answer 1

Thanks to /proc/<pid>/syscall, this is how I ended up implementing a quick and dirty futex(op=FUTEX_WAIT) process scanner:

#!/bin/bash
#
# Find all processes that are executing a futex(2) call with op=FUTEX_WAIT.
# In some cases this can be helpful in finding deadlocked processes.
#

if [ "$UID" -eq 0 ]; then
        # Root can read every /proc/<pid>/syscall, so scan all processes.
        pids=$(ps -e -o pid --no-headers)
else
        echo -e "WARNING: Not running as root, only processes for this user are being scanned\n" >&2
        pids=$(ps -u "$UID" -o pid --no-headers)
fi

for pid in $pids; do
        # Field 1 is the syscall number; 202 is the futex call on x86_64.
        # See: /usr/include/asm/unistd.h
        #
        # Field 2 is the 1st param, field 3 is the 2nd param, etc.
        # The second param is compared to 0x0, which is FUTEX_WAIT.
        # See: /usr/include/linux/futex.h
        #
        # A process may have exited since ps ran, hence 2>/dev/null.
        awk -v pid="$pid" '$1 == 202 && $3 == "0x0" { print pid }' \
                "/proc/$pid/syscall" 2>/dev/null
done
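/proc/<pid>/syscall describes only the main thread of a process. Since the question is about threads, a per-thread variant could walk /proc/<pid>/task/<tid>/syscall instead. The following is a minimal sketch, not a definitive tool: it assumes x86_64 (futex = 202), the function names is_futex_wait and scan_pid_threads are made up for illustration, and it masks off FUTEX_PRIVATE_FLAG so that FUTEX_WAIT_PRIVATE (0x80) is counted as well as plain FUTEX_WAIT (0x0).

```shell
#!/bin/bash
# Hypothetical helpers; assumes x86_64, where futex is syscall 202.

# is_futex_wait LINE: succeed if LINE (the content of a .../syscall file)
# looks like futex(addr, FUTEX_WAIT or FUTEX_WAIT_PRIVATE, ...).
is_futex_wait() {
    # Split the line into fields: syscall number, then the arguments.
    set -- $1
    [ "$1" = "202" ] || return 1
    [ -n "$3" ] || return 1
    # The low 7 bits hold the futex command; 0 is FUTEX_WAIT.
    [ $(( $3 & 0x7f )) -eq 0 ]
}

# scan_pid_threads PID: print "PID TID" for every thread in a futex wait.
scan_pid_threads() {
    local pid=$1 task line
    for task in /proc/"$pid"/task/*; do
        # A thread may have exited, or be unreadable without privileges.
        line=$(cat "$task/syscall" 2>/dev/null) || continue
        if is_futex_wait "$line"; then
            echo "$pid ${task##*/}"
        fi
    done
}
```

For example, scan_pid_threads 1234 would list the threads of process 1234 that are currently parked in a futex wait. A thread waiting on a futex is not necessarily deadlocked, so the output is only a list of candidates.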

Answer 2

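Try a tool that traces system calls, such as strace on Linux or tusc on HP-UX. When a deadlock occurs, you should see the process hang in a blocking call. That is not positive proof, though: it may be an ordinary block. You then need to determine whether the block can ever be resolved, which requires knowing what resource the process is waiting for.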
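One rough way to tell a passing block from a persistent one is to sample the blocking call several times and check whether it ever changes. The sketch below does this with /proc/<pid>/syscall rather than strace; the function name still_blocked and its default sample count and delay are assumptions made for illustration. An unchanged line is only a hint, since a legitimately long wait looks exactly the same.

```shell
#!/bin/bash
# Hypothetical helper, a heuristic only.
# still_blocked PID [SAMPLES] [DELAY]
#   returns 0 and prints the syscall line if it was identical in all samples,
#   returns 1 if the line changed (the main thread made progress),
#   returns 2 if /proc/PID/syscall could not be read.
still_blocked() {
    local pid=$1 samples=${2:-5} delay=${3:-1} first cur i
    first=$(cat "/proc/$pid/syscall" 2>/dev/null) || return 2
    i=1
    while [ "$i" -lt "$samples" ]; do
        sleep "$delay"
        cur=$(cat "/proc/$pid/syscall" 2>/dev/null) || return 2
        [ "$cur" = "$first" ] || return 1
        i=$((i + 1))
    done
    echo "$first"
}
```

For example, still_blocked 1234 10 1 samples process 1234 once per second, ten times in total. If it succeeds, the printed line shows the syscall number and arguments the process is stuck on, which is the starting point for working out which resource it is waiting for.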

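Example

On RHEL4 there is a ... peculiarity ... that can cause ctime to deadlock. Below is a sample program that exhibits this behavior (it calls ctime both from the main loop and from a SIGALRM handler):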

#include <sys/time.h>
#include <time.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

volatile char *r;

void handler(int sig)
{
    time_t t;

    time(&t);
    r = ctime(&t);
}

int main()
{
    struct itimerval it;
    struct sigaction sa;
    time_t t;
    int counter = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigaction(SIGALRM, &sa, NULL);

    it.it_value.tv_sec = 0;
    it.it_value.tv_usec = 1000;
    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 1000;
    setitimer(ITIMER_REAL, &it, NULL);

    while(1) {
        counter++;
        time(&t);
        r = ctime(&t);
        printf("Loop %d\n",counter);
    }

    return 0;
}

This usually deadlocks after a few thousand iterations. Now attach strace like this:

strace -s4096 -p<PID>

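where PID is the process ID of the program. You will see the program hang in a call that has FUTEX_WAIT among its arguments. (I can't quote the whole line because I don't currently have access to a RHEL4 machine, sorry.)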

Answer 3

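UNIX guarantees that OS processes never end up in a deadlock. There is no such guarantee for user-defined processes, however. As far as I know, there is no direct way to determine whether a process has deadlocked.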

That said, you can determine the state of a process with ps -o pid,uname,command,state,stime,time. man ps also gives a more detailed description of the process state codes.
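Because the question asks about threads, it may help to look at the state of each thread rather than of the process as a whole. The sketch below reads the State: line from /proc/<pid>/task/<tid>/status and attaches a short description based on the state codes documented in man ps. The function names describe_state and thread_states are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting per-thread states on Linux.

# describe_state CODE: short description of a ps/proc state letter.
describe_state() {
    case "$1" in
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        D*) echo "uninterruptible sleep" ;;
        T*) echo "stopped by job control signal" ;;
        t*) echo "stopped by debugger during tracing" ;;
        Z*) echo "zombie" ;;
        I*) echo "idle kernel thread" ;;
        *)  echo "unknown" ;;
    esac
}

# thread_states PID: print "TID CODE description" for each thread of PID.
thread_states() {
    local pid=$1 task state
    for task in /proc/"$pid"/task/*; do
        # The line looks like "State:  S (sleeping)"; field 2 is the letter.
        state=$(awk '/^State:/ {print $2}' "$task/status" 2>/dev/null) || continue
        [ -n "$state" ] || continue
        echo "${task##*/} $state $(describe_state "$state")"
    done
}
```

For example, thread_states 1234 prints one line per thread of process 1234. A thread blocked on a lock typically shows up as S (interruptible sleep), which is also the state of any idle thread, so the state alone does not prove a deadlock.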
