Question

I have a file containing several blocks that look like the following (at this point in the program, they are also held in a variable).

Vlan2 is up, line protocol is up
  ....
     reliability 255/255, txload 1/255, rxload 1/255^M
  ....
  Last clearing of "show interface" counters 49w5d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  ....
  L3 out Switched: ucast: 17925 pkt, 23810209 bytes mcast: 0 pkt, 0 bytes
     33374 packets input, 13154058 bytes, 0 no buffer
     Received 926 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     3094286 packets output, 311981311 bytes, 0 underruns
     0 output errors, 0 interface resets
     0 output buffer failures, 0 output buffers swapped out

Here is a second block, to show how the blocks can vary slightly:

port-channel86 is down (No operational members)
  ...
  reliability 255/255, txload 1/255, rxload 1/255
  ...
  Last clearing of "show interface" counters 31w2d
  ...
  RX
    147636 unicast packets  0 multicast packets  0 broadcast packets
    84356 input packets  119954232 bytes
    0 jumbo packets  0 storm suppression packets
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    147636 unicast packets  0 multicast packets  0 broadcast packets
    84356 output packets  119954232 bytes
    0 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  0 output discard
    0 Tx pause
  0 interface resets

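I want to pick certain data elements out of each block; any given element may or may not be present. For example, from the first block I posted, I might want to know that there are 0 runts, 0 input errors, and 0 overruns. From the second block, I might want the number of jumbo packets, collisions, and so on. If a requested field is not in the block, returning na is acceptable, since the output is meant to be processed uniformly.

Every block is structured like the two I posted: some entries are separated by newlines and spaces, others by commas.

I have a few ideas about how this could work. I don't know whether Perl has any kind of "look-behind" feature, but I could search for the field name (runts, "input errors", etc.) and then grab the preceding integer. That seems like the most elegant solution, but I'm not sure it's possible.

I am currently doing this in Perl. Each "chunk" I process actually contains several of these blocks (separated by double newlines). It doesn't have to be done in a single regex; I believe it can be done by applying several regexes to each block. Performance isn't really a factor, since the script runs once an hour.

My goal is to convert all of these files into .csv files (or some other easy-to-graph data format) automatically.

Any ideas?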

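Edit: Here is sample output in the CSV format I mentioned. It would be written line by line (for multiple entries like these) to the final result file. If a particular entry is not found in a block, it is marked as na in the corresponding row: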

interface_name,txload,rxload,last_clearing,input_queue,output_drops,runts,....
vlan2,1,1,49w5d,0-75-0-0,0,0,....
port-channel86,1,1,31w2d,na,na,0,...

Answer 1

A simple hash of attributes and numbers.

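sub extract {
    my ($block) = @_;
    my %r;
    # Match each "<number> <label>" pair. The label is a run of letters,
    # spaces and tabs, so it stays on one line instead of running on
    # into the next.
    while ($block =~ /(?<num>\d+) [ \t] (?<name>[A-Za-z \t]+)/gmsx) {
        my $name = $+{name};
        my $num = $+{num};
        # Trim surrounding whitespace from the label.
        $name =~ s/\A \s+//msx;
        $name =~ s/\s+ \z//msx;
        $r{$name} = $num;    # a repeated label overwrites the earlier value
    }
    return %r;
}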

my $block = <<'';
Vlan2 is up, line protocol is up
⋮

my $block2 = <<'';
port-channel86 is down (No operational members)
⋮

use Data::Dumper qw(Dumper);
print Dumper(+{ extract($block) });
print Dumper(+{ extract($block2) });

Answer 2

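I don't think a single regex can do this, and even if it could, I wouldn't want to maintain it.

With multiple regexes, you can easily use something like the following: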

(\d+) runts
(\d+) input errors
...etc...

A simple array of attribute names and a loop would solve this quickly and with very little code.
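The array-and-loop idea could be sketched roughly as follows. This is only an illustration, not code from the answer: the `pick_fields` name, the label list, and the sample counts are made up, and the optional trailing `s` is an assumption added so that both "input errors" and "input error" match.

```perl
use strict;
use warnings;

# Hypothetical helper: look up each label and return label => count,
# or label => 'na' when the label does not occur in the block.
sub pick_fields {
    my ($block, @labels) = @_;
    my %found;
    for my $label (@labels) {
        # A number, whitespace, then the literal label with an optional
        # plural "s". \Q...\E quotes any regex metacharacters in the label.
        $found{$label} = $block =~ /\b(\d+)\s+\Q$label\E s?\b/x ? $1 : 'na';
    }
    return %found;
}

# Made-up sample values, shaped like the lines in the question.
my $sample = "     5 runts, 0 giants, 0 throttles\n"
           . "     2 input errors, 0 CRC, 0 frame, 7 overrun, 0 ignored\n";

my @wanted = ('runt', 'input error', 'overrun', 'jumbo packet');
my %got    = pick_fields($sample, @wanted);
print join(',', map { $got{$_} } @wanted), "\n";
```

Because each pattern captures the integer directly in front of the label, no look-behind is needed for this approach.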

If you can cut the input down into smaller chunks with some preprocessing, you will be less likely to get false positives.
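One possible form of that preprocessing, assuming (as the question states) that blocks are separated by blank lines, is to split the chunk first and then build one CSV row per block. The `split_blocks` and `csv_row` names and the label list are hypothetical, and the sample text is an abbreviated excerpt of the two blocks from the question.

```perl
use strict;
use warnings;

# Split a chunk into blocks on one or more blank lines.
sub split_blocks {
    my ($chunk) = @_;
    $chunk =~ s/\r//g;    # drop carriage returns (the ^M seen in the sample)
    return grep { /\S/ } split /\n\s*\n/, $chunk;
}

# Build one CSV row: lowercased interface name, then each count or 'na'.
sub csv_row {
    my ($block, @labels) = @_;
    my ($name) = $block =~ /^\s*(\S+)/;    # first word of the block
    my @cells = (lc $name);
    for my $label (@labels) {
        push @cells, $block =~ /\b(\d+)\s+\Q$label\E s?\b/x ? $1 : 'na';
    }
    return join ',', @cells;
}

my $chunk = <<'END';
Vlan2 is up, line protocol is up
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

port-channel86 is down (No operational members)
    0 jumbo packets  0 storm suppression packets
    0 runts  0 giants  0 CRC  0 no buffer
END

my @labels = ('runt', 'input error', 'jumbo packet');
print join(',', 'interface_name', @labels), "\n";
print csv_row($_, @labels), "\n" for split_blocks($chunk);
```

Running each pattern against a single block rather than the whole chunk means a count can never be picked up from a neighbouring interface.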

Answer 3

Here is one way to do it in awk, though it would need a lot of tweaking to be perfect. But then again, use SNMP.

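# A multi-character RS works in gawk and mawk; POSIX leaves it unspecified.
awk '{
    printf "%s", $1                     # interface name (first field of the record)
    for (i=1;i<=NF;i++) {
        # value that follows the "Input queue:" label
        if ($i" "$(i+1)~/Input queue:/) printf ",%s",$(i+2)
        # count that precedes the "runts" label
        if ($i~/runts/) printf ",%s",$(i-1)
        if ($i~/multicast,/) printf ",%s",$(i-1)
    }
    print ""
}' RS="swapped out" file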

Parsing (partially) non-uniform text blocks in Perl
