perl - How to capture empty cells with a regular expression

Time: 2015-04-02 17:20:07

Tags: regex perl

Output:

id | status   | name             | cluster | ip          | mac               | roles | pending_roles   | online
---|----------|------------------|---------|-------------|-------------------|-------|-----------------|-------
11 | discover | Untitled (9a:3a) | 12      | 10.20.0.144 | c8:1f:66:ce:9a:3a |       | cinder          | True
12 | discover | Untitled (9f:8d) | 12      | 10.20.0.186 | c8:1f:66:ce:9f:8d |       | cinder, compute | True
10 | discover | Untitled (c7:f3) | None    | 10.20.0.214 | c8:1f:66:ce:c7:f3 |       |                 | True
13 | discover | Untitled (9f:3d) | None    | 10.20.0.233 | c8:1f:66:ce:9f:3d |       |                 | True
8  | discover | Untitled (74:8e) | 12      | 10.20.0.184 | c8:1f:66:ce:74:8e |       | controller      | True
14 | discover | Untitled (75:4b) | None    | 10.20.0.185 | c8:1f:66:ce:75:4b |       |                 | True
9  | discover | Untitled (76:23) | None    | 10.20.0.213 | c8:1f:66:ce:76:23 |       |                 | True

My regular expression:

(\d+)\s+\|\s+(\w+)\s+\|\s+\w+\s+\((\S+)\)\s+\|\s+(\d+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|(.*?)\|(.*?)\|\s+(\w+)

But it fails to capture the rows with empty cells! I have tried many approaches.

Example row:

13 | discover | Untitled (9f:3d) | None    | 10.20.0.233 | c8:1f:66:ce:9f:3d |       |                 | True 

3 answers:

Answer 0 (score: 4)

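chomp( my $header = <> );   # first line: column names
chomp( my $sep    = <> );   # second line: dashed separator

# Build an unpack template from the separator line: each run of dashes
# becomes one fixed-width "A" field, and "x3" skips the " | " between fields.
my $pat =
   join ' x3 ',
      map "A".(length($_)-2),
         "-$sep-" =~ /(-+)/g;

my @headers = unpack($pat, $header);
while (my $line = <>) {
   # "A" strips trailing whitespace, so a blank cell unpacks to an empty string.
   my %row; @row{@headers} = unpack($pat, $line);

   # Do whatever here.
   print("Row id=$row{id} has no pending roles\n")
      if !length($row{pending_roles});
}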

Output:

Row id=10 has no pending roles
Row id=13 has no pending roles
Row id=14 has no pending roles
Row id=9 has no pending roles
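
To make the template-building step easier to follow, here is a minimal sketch that prints the unpack template derived from the separator line and applies it to one row from the question. The helper name template_from_separator is hypothetical and not part of the answer; the separator and the row are copied from the question's output.

```perl
use strict;
use warnings;

# Hypothetical helper: build an unpack template from the dashed separator.
# Each run of dashes covers one column plus its padding, so the padding
# (2 characters per column once a dash is added at each end) is subtracted.
sub template_from_separator {
    my ($sep) = @_;
    return join ' x3 ',
        map { 'A' . (length($_) - 2) }
        "-$sep-" =~ /(-+)/g;
}

my $sep  = '---|----------|------------------|---------|-------------|-------------------|-------|-----------------|-------';
my $line = '13 | discover | Untitled (9f:3d) | None    | 10.20.0.233 | c8:1f:66:ce:9f:3d |       |                 | True';

my $pat    = template_from_separator($sep);
my @fields = unpack($pat, $line);

print "template: $pat\n";
printf "%d fields, pending_roles='%s'\n", scalar(@fields), $fields[7];
```

Because the "A" format strips trailing spaces, the blank roles and pending_roles cells come back as empty strings, which is why a plain length() test is enough to detect them.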

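Answer 1 (score: 1)

Don't try to treat structured data as unstructured lines. You have pipe-delimited data, so parse it as pipe-delimited data, and then inspect what you have parsed.

Note that I do use a regex on each individual cell (/^\s*$/ to check whether it is all whitespace), but not on the line as a whole.

Here is an example: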

#!/usr/bin/perl

use strict;
use warnings;

while ( my $line = <DATA> ) {
    chomp $line;
    my @cells = split /\|/, $line, -1;
    my $ncells = scalar @cells;
    die "There should be 9 fields, but line $. has $ncells" unless $ncells == 9;
    for my $i ( 1 .. $ncells ) {
        if ( $cells[$i-1] =~ /^\s*$/ ) {
            print "Cell #$i on line $. is empty\n";
        }
    }
}

__DATA__
id | status   | name             | cluster | ip          | mac               | roles | pending_roles   | online
---|----------|------------------|---------|-------------|-------------------|-------|-----------------|-------
11 | discover | Untitled (9a:3a) | 12      | 10.20.0.144 | c8:1f:66:ce:9a:3a |       | cinder          | True
12 | discover | Untitled (9f:8d) | 12      | 10.20.0.186 | c8:1f:66:ce:9f:8d |       | cinder, compute | True
10 | discover | Untitled (c7:f3) | None    | 10.20.0.214 | c8:1f:66:ce:c7:f3 |       |                 | True
13 | discover | Untitled (9f:3d) | None    | 10.20.0.233 | c8:1f:66:ce:9f:3d |       |                 | True
8  | discover | Untitled (74:8e) | 12      | 10.20.0.184 | c8:1f:66:ce:74:8e |       | controller      | True
14 | discover | Untitled (75:4b) | None    | 10.20.0.185 | c8:1f:66:ce:75:4b |       |                 | True
9  | discover | Untitled (76:23) | None    | 10.20.0.213 | c8:1f:66:ce:76:23 |       |                 | True
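
As a possible extension of the same idea, the sketch below trims each cell after splitting and keys the values by header name, so empty cells can be reported by column name rather than by position. The helper parse_cells is hypothetical, and the two sample lines are shortened versions of the question's header and one of its rows.

```perl
use strict;
use warnings;

# Hypothetical helper: split one pipe-delimited line into trimmed cells.
# The -1 limit makes split keep trailing empty fields.
sub parse_cells {
    my ($line) = @_;
    chomp $line;
    my @cells = split /\|/, $line, -1;
    s/^\s+|\s+$//g for @cells;    # strip the padding around each cell
    return @cells;
}

my @headers = parse_cells('id | status | name | cluster | ip | mac | roles | pending_roles | online');

my %row;
@row{@headers} = parse_cells('10 | discover | Untitled (c7:f3) | None | 10.20.0.214 | c8:1f:66:ce:c7:f3 | | | True');

# After trimming, a blank cell is simply the empty string.
my @empty = grep { $row{$_} eq '' } @headers;
print "Row id=$row{id} has empty cells: @empty\n";
```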

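Answer 2 (score: 0)

If you must use a regex, try to keep it as small as possible. This also assumes that your data contains no literal | characters inside a cell.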

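my $r = 0;
foreach my $row (@rows) {    # @rows is assumed to hold the data lines
    my $c = 0;
    print "Row $r\n";
    # The * must sit inside the group; with ([^|])* only the last
    # character of each cell would end up in $1.
    while ($row =~ /([^|]*)(\||$)/g) {
        my ($col, $end) = ($1, $2);
        print "    $c: $col\t";
        if ($col =~ /^\s*$/) { print "whitespace only!" }
        print "\n";
        $c++;
        last if $end eq '';  # end of line reached; avoid an extra empty match
    }
    $r++;
}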
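
For comparison with the single large pattern in the question: one likely reason it misses those rows is that the cluster column is matched with (\d+), which cannot match None, and the rows with an empty pending_roles cell are exactly the ones whose cluster is None. The (.*?) groups do match blank cells. The sketch below is a variant of the question's pattern with that one group relaxed to (\S+); it is an illustration under that assumption, not a recommendation over splitting on the pipe.

```perl
use strict;
use warnings;

# Variant of the question's pattern: the fourth group (cluster) is (\S+)
# instead of (\d+), so it accepts both "12" and "None".
my $re = qr/(\d+)\s+\|\s+(\w+)\s+\|\s+\w+\s+\((\S+)\)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|(.*?)\|(.*?)\|\s+(\w+)/;

my $line = '13 | discover | Untitled (9f:3d) | None    | 10.20.0.233 | c8:1f:66:ce:9f:3d |       |                 | True';

my @f = $line =~ $re;
if (@f) {
    s/^\s+|\s+$//g for @f;    # blank cells are captured as spaces; trim them
    print "id=$f[0] cluster=$f[3] pending_roles='$f[7]' online=$f[8]\n";
}
else {
    print "no match\n";
}
```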