Recursive function in awk

Time: 2015-08-15 00:23:49

Tags: awk

Given a CSV describing parent-child relationships by first and last name:

$ cat /var/tmp/hier
F2 L2,F1 L1
F3 L3,F1 L1
F4 L4,F2 L2
F5 L5,F2 L2
F6 L6,F3 L3

I want to print:

F1 L1
    F2 L2
        F4 L4
        F5 L5
    F3 L3
        F6 L6

I wrote the following script:

#!/bin/bash
print_node() {
        echo "awk -F, '\$2=="\"$@\"" {print \$1}' /var/tmp/hier"
        for node in `eval "awk -F, '\$2=="\"$@\"" {print \$1}'     /var/tmp/hier"`
        do
                echo -e "\t"$node
                print_node "$node"
        done
}
print_node "$1"

When I run the script, the awk command seems to be malformed. But if I run the command shown in the debug output by hand, it works.

Can anyone point out what causes this error?

=======================
@moderators: I added the perl and python tags because I welcome solutions in Perl or Python. Please don't take offense.

2 answers:

Answer 0 (score: 0)

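I would personally reach for Perl here; you could also use Python (or any other language at a similar level that happens to be available, such as Ruby or Tcl, but Perl and Python are preinstalled almost everywhere). I'd use one of them because they have built-in nested data structures, which make it easy to cache the tree in a navigable form instead of re-parsing the parent links every time you want a node's children. (GNU awk has arrays of arrays, but BSD awk does not.)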

Anyway, here is a Perl solution:

#!/usr/bin/env perl
use strict;
use warnings;

my %parent;

while (<>) {
  chomp;
  my ($child, $parent) = split ',';
  $parent{$child} = $parent;
}

my (%children, %roots);

while (my ($child, $parent) = each %parent) {
  push @{$children{$parent} ||= []}, $child;
  $roots{$parent} = 1 unless $parent{$parent};
}

foreach my $root (sort keys %roots) {
  show($root);
}

sub show {
  my ($node, $indent) = (@_,'');
  print "$indent$node\n";
  foreach my $child (sort(@{$children{$node}||[]})) {
    show($child, "    $indent");
  }
}

I saved the above as print_tree.pl and ran it on your data like this:

$ perl print_tree.pl *csv

You can also make it executable with chmod +x print_tree.pl and run it without invoking perl explicitly:

$ ./print_tree.pl *csv

Either way, on your sample data it produces the following output:

F1 L1
    F2 L2
        F4 L4
        F5 L5
    F3 L3
        F6 L6
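
Since Python was mentioned as an option, here is a rough Python sketch of the same idea for comparison. It is not part of the original answer: the names `build_children` and `render` are made up for illustration, and it assumes every input line has the form `child,parent` as in the sample data.

```python
from collections import defaultdict


def build_children(lines):
    """Return (children map, sorted roots) from 'child,parent' lines.

    Illustrative helper; assumes exactly one comma-separated pair per line.
    """
    parent_of = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        child, parent = line.split(",", 1)
        parent_of[child] = parent

    children = defaultdict(list)
    roots = set()
    for child, parent in parent_of.items():
        children[parent].append(child)
        # A parent that never shows up as a child is a root of the tree.
        if parent not in parent_of:
            roots.add(parent)
    return children, sorted(roots)


def render(node, children, indent=""):
    """Return the subtree under `node` as a list of indented lines."""
    out = [indent + node]
    for child in sorted(children.get(node, [])):
        out.extend(render(child, children, indent + "    "))
    return out


if __name__ == "__main__":
    sample = [
        "F2 L2,F1 L1",
        "F3 L3,F1 L1",
        "F4 L4,F2 L2",
        "F5 L5,F2 L2",
        "F6 L6,F3 L3",
    ]
    children, roots = build_children(sample)
    for root in roots:
        print("\n".join(render(root, children)))
```

As in the Perl version, the parent links are read once into a map of children, so the recursion only does dictionary lookups instead of rescanning the file for every node.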

Answer 1 (score: 0)

An alternative solution without multidimensional awk arrays, which works for this hierarchy:

join -t, -1 1 -2 2 inputfile{,} | awk -F, -f tree.awk

with the awk script as follows:

$ cat tree.awk 
    {
            s=$1;$1=$2;$2=s;
            t=""
            for (i=1;i<=NF;i++) {
                    if (! ($i in n)) {
                            print t $i
                            n[$i]
                    }
                    t=t "\t"
            }
    }
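
To see why this works, it can help to look at what the self-join produces. The sketch below imitates the pipeline in Python; it is an illustration under the assumption that rows are `child,parent` pairs, and `self_join` and `tree_lines` are hypothetical names, not part of the answer. Each joined row ends up as grandparent, parent, child, so the approach covers three levels, which is all the sample data needs. A node that sits directly under the root and has no children of its own would produce no joined row and so would not be printed.

```python
def self_join(rows):
    """Pair every (node, parent) row with each row whose parent is that node.

    Roughly what `join -t, -1 1 -2 2 file file` plus the field swap in
    tree.awk yields: (grandparent, parent, child) tuples.
    """
    joined = []
    for node, parent in rows:
        for child, childs_parent in rows:
            if childs_parent == node:
                joined.append((parent, node, child))
    return joined


def tree_lines(joined):
    """Emit each name once, indented by one tab per column position."""
    seen = set()
    out = []
    for row in joined:
        for depth, name in enumerate(row):
            if name not in seen:
                seen.add(name)
                out.append("\t" * depth + name)
    return out


if __name__ == "__main__":
    rows = [
        ("F2 L2", "F1 L1"),
        ("F3 L3", "F1 L1"),
        ("F4 L4", "F2 L2"),
        ("F5 L5", "F2 L2"),
        ("F6 L6", "F3 L3"),
    ]
    print("\n".join(tree_lines(self_join(rows))))
```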