Question

I have an awk script that prints the ps entries for the pids that appear in myfilename, where myfilename contains a list of pids, each one on a new line...

ps -eaf | awk -f script.awk myfilename -

Here are the contents of script.awk...

# process the first file on the command line (aka myfilename)
# this is the list of pids
ARGIND == 1 {
    pids[$0] = 1
}

# second and subsequent files ("-"/stdin in the example)
ARGIND > 1 {
    # is column 2 of the ps -eaf output [i.e.] the pid in the list of desired
    # pids? -- if so, print the entire line
    if ($2 in pids)
        printf("%s\n",$0)
}

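Currently, the command prints the pids in the order of the ps -eaf output, but I would like them printed in the order in which they appear in myfilename.

I tried modifying the script to loop over pids and repeat the same logic, but I couldn't quite get it right.

I would appreciate it if someone could help me with this.

Thanks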

Answer 1

Forgive my rusty AWK. Maybe this is usable?

ARGIND == 1 {
    pids[$0] = NR # capture the order
}

ARGIND > 1 {
    if ($2 in pids) {
        idx = pids[$2];
        matches[idx] = $0; # capture the line, keyed by the position of its pid in the pid list
        if (idx > max)
            max = idx;
    }
}

END {
    for(i = 1; i <= max; i++)
        if (i in matches)
            print matches[i];
}

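I don't know what the output of ps -eaf looks like, or what assumptions can usefully be made about it. When I first read the question, I thought the OP had more than two inputs to the script. If there are really only two, it probably makes more sense to reverse the inputs; if not, this may be the more general approach.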
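One thing worth knowing is that ARGIND is a gawk extension, so the script above will not behave as intended in other awks. Below is a minimal sketch of the same approach that counts input files with FNR == 1 instead. The sample data and the names filenum, ps1.txt and ps2.txt are made up for illustration, and the counter assumes that no input file is empty.

```shell
# Hypothetical sample data: a pid list plus two files standing in for ps -eaf output.
tmpdir=$(mktemp -d)
printf '%s\n' 30 10 40 > "$tmpdir/mypidlist"
printf '%s\n' \
  'root 10 1 0 02:36 ? 00:00:00 alpha' \
  'root 20 1 0 02:36 ? 00:00:00 beta' \
  'root 30 1 0 02:36 ? 00:00:00 gamma' > "$tmpdir/ps1.txt"
printf '%s\n' 'root 40 1 0 02:36 ? 00:00:00 delta' > "$tmpdir/ps2.txt"

ordered=$(awk '
  FNR == 1 { filenum++ }                  # a new input file starts (assumes none is empty)
  filenum == 1 { pids[$0] = FNR; next }   # first file: remember each pid with its position
  $2 in pids {                            # later files: keep lines whose pid is wanted
    idx = pids[$2]
    matches[idx] = $0
    if (idx > max) max = idx
  }
  END {
    # walk the positions in pid-list order and print whatever was collected
    for (i = 1; i <= max; i++)
      if (i in matches) print matches[i]
  }
' "$tmpdir/mypidlist" "$tmpdir/ps1.txt" "$tmpdir/ps2.txt")
printf '%s\n' "$ordered"
rm -rf "$tmpdir"
```

With this sample data the lines come out in list order (30, 10, 40) even though they are spread over two input files.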

Answer 2

I would rather do this with the time-honoured NR==FNR construct. It goes a little something like this (as a one-liner).

ps -eaf | awk 'NR==FNR{p[$1]++;next} $2 in p' mypidlist -

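The idea of NR==FNR is that we look at the current record number (NR) and compare it with the record number within the current file (FNR). If they are the same, we are still in the first file, so we store a record and move on to the next line of input.

If NR==FNR is not true, we simply check whether $2 is in the array.

So the first expression populates the array p[] with the contents of mypidlist, and the second construct is just a condition, which defaults to {print} as its statement.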
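To see the idiom at work without depending on a live process table, here is a small sketch that feeds the same one-liner from two files. The sample pids and the file ps.txt are invented stand-ins for mypidlist and the ps -eaf output. Note that the NR==FNR test is also true for the second file if the first one is empty, so this assumes a non-empty pid list.

```shell
# Hypothetical sample data standing in for mypidlist and for ps -eaf output.
tmpdir=$(mktemp -d)
printf '%s\n' 30 10 > "$tmpdir/mypidlist"
printf '%s\n' \
  'root 10 1 0 02:36 ? 00:00:00 alpha' \
  'root 20 1 0 02:36 ? 00:00:00 beta' \
  'root 30 1 0 02:36 ? 00:00:00 gamma' > "$tmpdir/ps.txt"

# NR==FNR holds only while the first file is being read, so p[] is filled
# from the pid list; each later line is printed if its second field is a key in p[].
filtered=$(awk 'NR==FNR { p[$1]++; next } $2 in p' "$tmpdir/mypidlist" "$tmpdir/ps.txt")
printf '%s\n' "$filtered"
rm -rf "$tmpdir"
```

The matching lines appear in the order of the second input (10 before 30), not in the order of the pid list.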

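Of course, the one-liner above doesn't satisfy your requirement to print the results in the order of your pid input file. For that, you need to keep an index and record the data in an array for some kind of sorting. It doesn't have to be a real sort, of course; just keeping the index itself is enough. Here is a somewhat longer one-liner: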

ps -eaf | awk 'NR==FNR{p[$1]++;o[++n]=$1;next} $2 in p {c[$2]=$0} END {for(n=1;n<=length(o);n++){print n,o[n],c[o[n]]}}' mypidlist -

Broken out for easier reading, the awk script looks like this:

# Record the pid list...
NR==FNR {
  p[$1]++         # Each pid is an element in this array.
  o[++n]=$1       # This array records the order of the pids.
  next
}

# If the second+ input source has a matching pid...
$2 in p {
  c[$2]=$0        # record the line in a third array, pid as key.
}

END {
  # At the end of our input, step through the ordered pid list...
  for (n=1;n<=length(o);n++) {
    print c[o[n]] # and print the collected line, using our pid index as key.
  }
}

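Note that if a pid from the list is missing from the ps output, the result is a blank line, because awk doesn't complain about references to nonexistent array indices.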
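If the blank lines are unwanted, one possible variation is to test for membership in c[] before printing. The sketch below assumes made-up sample data in which pid 99 is in the list but not in the ps output. It also loops up to the counter n rather than calling length(o), so it does not rely on length() accepting an array.

```shell
# Hypothetical sample data; pid 99 is listed but has no matching ps line.
tmpdir=$(mktemp -d)
printf '%s\n' 30 99 10 > "$tmpdir/mypidlist"
printf '%s\n' \
  'root 10 1 0 02:36 ? 00:00:00 alpha' \
  'root 20 1 0 02:36 ? 00:00:00 beta' \
  'root 30 1 0 02:36 ? 00:00:00 gamma' > "$tmpdir/ps.txt"

present=$(awk '
  NR==FNR { p[$1]++; o[++n] = $1; next }   # pid list: remember membership and order
  $2 in p { c[$2] = $0 }                   # ps output: collect matching lines by pid
  END {
    for (i = 1; i <= n; i++)
      if (o[i] in c) print c[o[i]]         # skip pids that never showed up
  }
' "$tmpdir/mypidlist" "$tmpdir/ps.txt")
printf '%s\n' "$present"
rm -rf "$tmpdir"
```

Here the lines for 30 and 10 are printed in list order and nothing is printed for 99.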

Also note that the length(arrayname) notation works in GAWK and OneTrueAwk, but may not be universal. If it doesn't work for you, you can add something like this to your awk script:

function alength(arrayname,   i, n) {
  for(i in arrayname) n++
  return n
}
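As a quick illustration of how such a helper might be called, here is a sketch that counts the entries of an array built from three input lines. The input values are arbitrary, and the return uses n + 0 (a small deviation from the function above) so that an empty array yields 0 rather than an empty string.

```shell
# Count array elements with a helper instead of length(arrayname).
count=$(printf '%s\n' 30 10 20 | awk '
  # i and n are extra parameters used as local variables
  function alength(arrayname,   i, n) {
    for (i in arrayname) n++
    return n + 0
  }
  { o[NR] = $1 }            # one array element per input line
  END { print alength(o) }
')
printf '%s\n' "$count"
```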

Answer 3

If there is only one file, you can flip the order of the inputs and use the idiomatic awk:

$ awk 'NR==1; NR==FNR{a[$2]=$0;next} $0 in a{print a[$0]}' <(ps -eaf) <(seq 10)

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 02:36 ?        00:00:03 /sbin/init
root         2     0  0 02:36 ?        00:00:00 [kthreadd]
root         3     2  0 02:36 ?        00:00:00 [ksoftirqd/0]
root         4     2  0 02:36 ?        00:00:00 [kworker/0:0]
root         5     2  0 02:36 ?        00:00:00 [kworker/0:0H]
root         6     2  0 02:36 ?        00:00:00 [kworker/u30:0]
root         7     2  0 02:36 ?        00:00:00 [rcu_sched]
root         8     2  0 02:36 ?        00:00:00 [rcuos/0]
root         9     2  0 02:36 ?        00:00:00 [rcuos/1]
root        10     2  0 02:36 ?        00:00:00 [rcuos/2]

Here seq supplies the list of ids; replace it with your file.
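The <(...) process substitution used above is a feature of shells such as bash, ksh and zsh. A rough equivalent for a plain POSIX shell is to pipe the ps output in and name standard input as "-" ahead of the pid file. The sketch below uses invented sample data (ps.txt and a two-line mypidlist) in place of real ps -eaf output so that the result is reproducible.

```shell
# Hypothetical sample data, including the ps header line.
tmpdir=$(mktemp -d)
printf '%s\n' 30 10 > "$tmpdir/mypidlist"
printf '%s\n' \
  'UID PID PPID C STIME TTY TIME CMD' \
  'root 10 1 0 02:36 ? 00:00:00 alpha' \
  'root 20 1 0 02:36 ? 00:00:00 beta' \
  'root 30 1 0 02:36 ? 00:00:00 gamma' > "$tmpdir/ps.txt"

# "-" makes awk read the piped ps data first; the pid list is read second,
# so matches are printed in pid-list order. NR==1 prints the header line.
flipped=$(cat "$tmpdir/ps.txt" |
  awk 'NR==1; NR==FNR { a[$2] = $0; next } $0 in a { print a[$0] }' - "$tmpdir/mypidlist")
printf '%s\n' "$flipped"
rm -rf "$tmpdir"
```

With a real process table, the cat command would be replaced by ps -eaf.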

Modify awk script to add looping logic
