给定一个包含内容的文件:
insert_job: J1
insert_job: J2
box_name: J1
insert_job: J3
box_name: J2
insert_job: J4
box_name: J1
insert_job: J5
box_name: J4
insert_job: J6
box_name: J4
我想将其显示如下(使用标签来识别父母与子女的关系):
J1
J2
J3
J4
J5
J6
test_data2 for Borodin: ------------------------------ insert_job: JS11-LR_BaselIII insert_job: JS11-Check_Batch_Run_Numbers box_name: JS11-LR_BaselIII insert_job: 11000000-start box_name: JS11-Check_Batch_Run_Numbers insert_job: 11000000-runbox box_name: JS11-Check_Batch_Run_Numbers insert_job: JS11-Load_Session_Date box_name: JS11-LR_BaselIII insert_job: JS110000-start box_name: JS11-Load_Session_Date insert_job: JS110000-runbox box_name: JS11-Load_Session_Date insert_job: JS11-Start_RiskWatch box_name: JS11-LR_BaselIII insert_job: JS110004-start box_name: JS11-Start_RiskWatch insert_job: JS110004-runbox box_name: JS11-Start_RiskWatch insert_job: JS11-Start_UDS box_name: JS11-LR_BaselIII insert_job: JS110001-start box_name: JS11-Start_UDS insert_job: JS110001-runbox box_name: JS11-Start_UDS insert_job: JS11-Pool_Processing box_name: JS11-LR_BaselIII insert_job: JS110002-start box_name: JS11-Pool_ProcessingEd的解决方案中的
sdpvvrsp810{alelai}: gawk -f tst.awk testjobs3
gawk: tst.awk:2: /^box_name/ { box = $2; jobs[box][job] }
gawk: tst.awk:2: ^ syntax error
gawk: tst.awk:9: for (job in jobs[box])
gawk: tst.awk:9: ^ syntax error
答案 0 :(得分:1)
这是一个更短的perl版本,可以处理您的示例数据。
sub parse {
local $/ = undef;
my $text = <>;
my ($root) = $text =~ /insert_job:\s*(\S+)/;
my @links = $text =~ /insert_job:\s*(\S+)\s*box_name:\s*(\S+)/g;
my $children = {};
while (@links) {
my $child = shift @links;
my $parent = shift @links;
push @{$children->{$parent}}, $child;
}
my $print = sub {
my ($print, $parent, $indent) = @_;
print "\t" x $indent, $parent, "\n";
$print->($print, $_, $indent + 1) foreach (@{$children->{$parent} || []});
};
$print->($print, $root, 0);
}
parse;
答案 1 :(得分:1)
这个程序可以满足您的要求。它期望输入文件的路径作为命令行上的参数。
首先构建一个哈希,将每个作业的名称与该框中的所有作业相关联。在下一行中未跟随框名称的作业将被推送到根作业列表中。最后,调用递归子例程print_tree
以从每个根开始转储依赖树。
use strict;
use warnings;
my ($curr_job, %jobs, @roots);
while (<>) {
next unless my ($op, $id) = /(\w+): ([\w-]+)/;
if ($op eq 'insert_job') {
push @roots, $curr_job if $curr_job;
$curr_job = $id;
$jobs{$id} = [] unless $jobs{$id};
}
elsif ($op eq 'box_name') {
push @{ $jobs{$id} }, $curr_job;
$curr_job = undef;
}
}
push @roots, $curr_job if $curr_job;
print_tree($_) for @roots;
sub print_tree {
my ($root, $indent) = (@_, 0);
printf "%s%s\n", ' ' x 4 x $indent, $root;
print_tree($_, $indent + 1) for @{ $jobs{$root} };
}
<强>输出强>
J1
J2
J3
J4
J5
J6
输出2
JS11-LR_BaselIII
JS11-Check_Batch_Run_Numbers
11000000-runbox
11000000-start
JS11-Load_Session_Date
JS110000-runbox
JS110000-start
JS11-Pool_Processing
JS110002-start
JS11-Start_RiskWatch
JS110004-runbox
JS110004-start
JS11-Start_UDS
JS110001-runbox
JS110001-start
答案 2 :(得分:0)
将GNU awk用于真正的多维数组:
$ cat tst.awk
/^insert_job/ { job = $2; if (root == "") root = job }
/^box_name/ { box = $2; jobs[box][job] }
END { prtBox(root) }
function prtBox(box, job) {
printf "%*s%s\n", indent, "", box
indent += 2
if (box in jobs)
for (job in jobs[box])
prtBox(job)
indent -= 2
}
$ awk -f tst.awk file
J1
J2
J3
J4
J5
J6