如何使用awk重复一行?

时间:2014-10-05 18:08:05

标签: regex bash awk grep copybook

我试图在OCCURS字中重复该行块,以及该行中被提及的次数。要重复的行块在行的开头有一个较小的数字。

我的意思是,通过这个输入:

01  PATIENT-TREATMENTS.
   05  PATIENT-NAME                PIC X(30).
   05  PATIENT-SS-NUMBER           PIC 9(9).
   05  NUMBER-OF-TREATMENTS        PIC 99 COMP-3.
   05  TREATMENT-HISTORY OCCURS 2.
       10  TREATMENT-DATE OCCURS 3.
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
       10  TREATING-PHYSICIAN       PIC X(30).
       10  TREATMENT-CODE           PIC 99.
   05 HELLO PIC X(9).
   05 STACK OCCURS 2.
       10 OVERFLOW PIC X(99).

这将是输出:

01  PATIENT-TREATMENTS.
   05  PATIENT-NAME                PIC X(30).
   05  PATIENT-SS-NUMBER           PIC 9(9).
   05  NUMBER-OF-TREATMENTS        PIC 99 COMP-3.
   05  TREATMENT-HISTORY OCCURS 2.
       10  TREATMENT-DATE OCCURS 3.
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
       10  TREATING-PHYSICIAN       PIC X(30).
       10  TREATMENT-CODE           PIC 99.
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
           15  TREATMENT-DAY        PIC 99.
           15  TREATMENT-MONTH      PIC 99.
           15  TREATMENT-YEAR       PIC 9(4).
       10  TREATING-PHYSICIAN       PIC X(30).
       10  TREATMENT-CODE           PIC 99.
   05 HELLO PIC X(9).
   05 STACK OCCURS 2.
       10 OVERFLOW PIC X(99).
       10 OVERFLOW PIC X(99).

我通过这种方式尝试过:

tac input.txt |
awk '
BEGIN {
 lbuff="";
 n=0;
}{

  if($0 ~ /^\s*$/) {next;}

  if ($3 == "OCCURS") {
    lev_oc=$1
    len_oc=$4
    lstart=0

    for (x=1; x<n; x++) {
      split(saved[x],saved_level," ")
      if (saved_level[1] <= lev_oc) {
        print saved[x]
        lstart=x+1
      }
    }

    for (i=1; i<=len_oc; i++) {
      for (x=lstart; x<n; x++) {
        print saved[x]
      }
    }

    print $0

  }else if ($0) {
    saved[n]=$0
    n++
  }

}' | tac

但我没有得到我想要获得的结果。 awk是最好的方法吗?你有其他选择吗?

2 个答案:

答案 0 :(得分:3)

我之所以使用perl是因为它很容易制造出任意复杂的数据结构:

#!/usr/bin/perl
use strict;
use warnings;

# read the file into an array of lines.
open my $f, '<', shift;
my @lines = <$f>;
close $f;

my @occurring;
my @occurs;

# iterate over the lines of the file
for (my $i = 0; $i < @lines; $i++) {
    # extract the "level", the first word of the line
    my $level = (split ' ', $lines[$i])[0];

    # if this line contains the OCCURS string,
    # push some info onto a stack.
    # This marks the start of something to be repeated
    if ($lines[$i] =~ /OCCURS (\d+)/) {
        push @occurring, [$1-1, $level, $i+1];
        next;
    }

    # if this line is at the same level as the level of the start of the
    # last seen item on the stack, mark the last line of the repeated text
    if (@occurring and $level eq $occurring[-1][1]) {
        push @occurs, [@{pop @occurring}, $i-1];
    }
}

# If there's anything open on the stack, it ends at the last line
while (@occurring) {
    push @occurs, [@{pop @occurring}, $#lines];
}

# handle all the lines to be repeated by appending them to the last
# line of the repetition
for (@occurs) {
    my $repeated = "";
    my ($count, undef, $start, $stop) = @$_;
    $repeated .= join "", @lines[$start..$stop] for (1..$count);
    $lines[$stop] .= $repeated;
}

print @lines;

为了您的阅读乐趣,这是一个awk翻译。

BEGIN {
    s = 0
    f = 0
}

function stack2frame(lineno) {
    f++
    frame[f,"reps"] = stack[s,"reps"]
    frame[f,"start"] = stack[s,"start"]
    frame[f,"stop"] = lineno
    s--
}

{ 
    lines[NR] = $0
    level = $1
}

# if this line contains the OCCURS string, push some info onto a stack.
# This marks the start of something to be repeated
$(NF-1) == "OCCURS" {
    s++
    stack[s,"reps"] = $NF-1
    stack[s,"level"] = level
    stack[s,"start"] = NR+1
    next
}

# if this line is at the same level as the level of the start of the
# last seen item on the stack, mark the last line of the repeated text
level == stack[s,"level"] {
    stack2frame(NR-1)
}

END {
    # If there's anything open on the stack, it ends at the last line
    while (s) {
        stack2frame(NR)
    }

    # handle all the lines to be repeated by appending them to the last
    # line of the repetition
    for (i=1; i<=f; i++) {
        repeated = ""
        for (j=1; j <= frame[i,"reps"]; j++) {
            for (k = frame[i,"start"]; k <= frame[i,"stop"]; k++) {
                repeated = repeated ORS lines[k]
            }
        }
        lines[frame[i,"stop"]] = lines[frame[i,"stop"]] repeated
    }

    for (i=1; i <= NR; i++) 
        print lines[i]
}

答案 1 :(得分:0)

这是一个红宝石解决方案:

#!/usr/bin/env ruby
# -*- coding: utf-8 -*-
stack = []
def unwind_frame(stack)
    frame = stack.pop
    _,occurs,data = *frame
    with_each = stack==[] ? ->(l){ puts l} : ->(l){stack.last[2].push l}
    occurs.times { data.each &with_each }
end
while gets
  $_.chomp! "\n"
  if m=$_.match(/OCCURS ([0-9]*)\.\s*$/)
    puts $_
    occurs=m[1].to_i
    level = $_.to_i
    stack.push([level,occurs,[]])
    next
  end
  if stack==[]; puts $_; next; end
  level = $_.to_i
  if level > stack.last[0]
    stack.last[2].push $_
    next
  end
  while(stack!=[] && level <= stack.last[0])
    unwind_frame(stack)
    stack!=[] ? stack.last[2].push($_) : puts($_)
  end
end
while(stack!=[])
  unwind_frame(stack)
end

结果符合您的预期。