Converting formatted text to XML (or JSON) using Perl

Asked: 2015-09-11 11:38:41

Tags: perl

I need to parse formatted text into XML using Perl. The input is a .cfg file that contains the formatted text.

Part of the file::

system
    name "NILpv-BNG34"
    contact "Wayne Ritchie/NGN OPERATIONS 0800 4 NGNOP (0800 464 667)"
    location "NIL, Level 1, Tory Street, Wellington."
    clli-code "BNG_v15    "
    chassis-mode c
    dns
    exit
    persistence
        subscriber-mgmt
            location cf2:
        exit
    exit
    snmp
        streaming
            no shutdown
        exit
        packet-size 9216
    exit
    time
        ntp
            server 10.78.247.155 prefer
            no shutdown
        exit
        sntp
            shutdown
        exit
        dst-zone NZDT
            start last sunday september 02:00
            end first sunday april 03:00
        exit
        zone NZST 
    exit
    thresholds
        rmon
        exit
    exit
exit

Everything in the file is structured with 4-space indentation, and each block is closed by exit.

I thought of converting the text above to

<?xml version="1.0" encoding="UTF-8"?>
<system>
    <name>"WR-BNG01"</name>
    <contact>"NGN OPERATIONS 0800 4 NGNOP (0800 464667)"</contact>
    <location>"Whangarei Telephone exchange"</location>
    <clli-code>"BNG_v4 "</clli-code>
    <chassis-mode>c</chassis-mode>
    <dns />
    <persistence>
        <subscriber-mgmt>
            <location>cf2:</location>
        </subscriber-mgmt>
    </persistence>
    <snmp>
        <packet-size>9216</packet-size>
    </snmp>
    <time>
        <ntp>
            <server>10.72.14.17</server>
            <server>10.74.14.26 prefer</server>
            <no>shutdown</no>
        </ntp>
        <sntp>
            <shutdown />
        </sntp>
        <dst-zone_NZDT>
            <start>last sunday september 02:00</start>
            <end>first sunday april 03:00</end>
        </dst-zone_NZDT>
        <zone>NZST</zone>
    </time>
    <thresholds>
        <rmon />
    </thresholds>
</system>

I wrote a Perl script for this, but it does not work in some cases.

cpu-protection
            policy 1 create
            exit
            policy 254 create
            exit
            policy 255 create
            exit
        exit

Here, the XML becomes

<cpu-protection>
        <policy>1 create</policy>
        <policy>254 create</policy>
        <policy>255 create</policy>
</cpu-protection>

instead of

<cpu-protection>
        <policy_1_create></policy_1_create>
        <policy_254_create></policy_254_create>
        <policy_255_create></policy_255_create>
</cpu-protection>
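
The difference between the two outputs comes down to how each line is classified. A minimal sketch, assuming a line opens a container when the next line is indented deeper or is an exit at the same indentation, and is a leaf otherwise. The names indent_of, xml_escape and cfg_to_xml are hypothetical, and the sketch does not sanitise tag names (a container line containing quotes would still give an invalid XML name).

```perl
use strict;
use warnings;

# Hypothetical helper: number of leading whitespace characters on a line.
sub indent_of {
    my ($line) = @_;
    my ($lead) = $line =~ /^(\s*)/;
    return length $lead;
}

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Convert config lines to an XML string (assumed classification rule:
# container if the next line is deeper, or is "exit" at the same depth).
sub cfg_to_xml {
    my @lines = grep { /\S/ } @_;    # copy the input and drop blank lines
    s/\s+$// for @lines;             # strip newlines and trailing spaces
    my $xml = '';
    my @open;                        # names of containers still to be closed

    for my $i ( 0 .. $#lines ) {
        my $indent = indent_of( $lines[$i] );
        ( my $text = $lines[$i] ) =~ s/^\s+//;

        if ( $text eq 'exit' ) {
            # Close the innermost open container, if there is one.
            $xml .= '</' . pop(@open) . '>' if @open;
            next;
        }

        # Look one line ahead to classify the current line.
        my $is_block = 0;
        if ( $i < $#lines ) {
            my $next_indent = indent_of( $lines[ $i + 1 ] );
            ( my $next_text = $lines[ $i + 1 ] ) =~ s/^\s+//;
            $is_block = 1
              if $next_indent > $indent
              || ( $next_text eq 'exit' && $next_indent == $indent );
        }

        if ($is_block) {
            ( my $tag = $text ) =~ tr/ /_/;    # "policy 1 create" -> "policy_1_create"
            $xml .= "<$tag>";
            push @open, $tag;
        }
        else {
            my ( $name, $value ) = split ' ', $text, 2;
            $xml .= defined $value
              ? "<$name>" . xml_escape($value) . "</$name>"
              : "<$name/>";
        }
    }
    return $xml;
}

my @sample = (
    "cpu-protection\n",
    "            policy 1 create\n",
    "            exit\n",
    "            policy 254 create\n",
    "            exit\n",
    "        exit\n",
);
print cfg_to_xml(@sample), "\n";
# prints <cpu-protection><policy_1_create></policy_1_create><policy_254_create></policy_254_create></cpu-protection>
```

Because every exit simply closes the innermost open container, the stack replaces the separate previous/current indentation bookkeeping; only the one-line look-ahead is needed to tell "policy 1 create" (container) from "packet-size 9216" (leaf).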

My script (the part that converts the data to XML)::

    foreach my $i ( 1 .. $index ) {
    @grabbed = @grabbed_1 if $i == 1;
    @grabbed = @grabbed_2 if $i == 2;

    my $currentLine;

    my $previousLineSpaceLength = 0;
    my $currentLineSpaceLength  = 0;
    my $nextLineSpaceLength     = 0;

    my @exitTags;
    my $exitTag = '';

    my @tag;
    my $lineSplits;

    my $xmlString = '<?xml version="1.0" encoding="UTF-8"?>';
    foreach my $i ( 0 .. $#grabbed ) {

        $currentLine = $grabbed[$i];
        $currentLine =~ /^\s*/;
        $currentLineSpaceLength = $+[0];

        chomp($currentLine);
        $currentLine =~ s/^\s+//;
        $currentLine =~ s/\s+$//;

        #$currentLine =~ s/"//g;

        @tag = split ' ', $currentLine, 2;
        $lineSplits = scalar @tag;

        if ( $previousLineSpaceLength == 0 ) {

            $previousLineSpaceLength = $currentLineSpaceLength;

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

          #$xmlString               = $xmlString . '<' . $currentLine . '>';
          #$exitTag                 = '</' . $currentLine . '>';

        }
        elsif ($currentLineSpaceLength > $previousLineSpaceLength
            && $exitTag ne '' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            push @exitTags, $exitTag;

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

            #$xmlString = $xmlString . '<' . $currentLine . '>';
            #$exitTag   = '</' . $currentLine . '>';
        }
        elsif ($currentLineSpaceLength == $previousLineSpaceLength
            && $currentLine ne 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            if ( $exitTag ne 'exit' ) {
                $xmlString = $xmlString . $exitTag;

            }

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

            #$xmlString = $xmlString . '<' . $currentLine . '>';
            #$exitTag   = '</' . $currentLine . '>';

        }
        elsif ($currentLineSpaceLength == $previousLineSpaceLength
            && $currentLine eq 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            $xmlString               = $xmlString . $exitTag;
            $exitTag                 = $currentLine;

        }
        elsif ($currentLineSpaceLength < $previousLineSpaceLength
            && $currentLine eq 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            if ( $exitTag ne 'exit' ) {
                $xmlString = $xmlString . $exitTag;

            }
            $xmlString = $xmlString . pop @exitTags;
            $exitTag   = $currentLine;
        }

    }

    push @XMLStrings, $xmlString;

 }

Please help me with the corrections that are needed.

Full code::

#!/usr/bin/perl

use Data::Dumper;
use XML::XPath;
use XML::DOM;

my $file1 = "1.cfg";
my $file2 = "2.cfg";

my @configFiles = ( $file1, $file2 );

my $templateFile = "template.cfg";
my @configModules;
my %configurationsHash;
my $configName;

open( TEMPLATE, "<" . $templateFile ) or die "cannot open file";
while (<TEMPLATE>) {
    chomp($_);
    push @configModules, $_ if /^[[:alpha:]]/;
$configName = $_ if /^[[:alpha:]]/;
push @{ $configurationsHash{$configName} }, $_ if /^\//;
}
close(TEMPLATE);

foreach $configSubModule (@configModules) {
chomp($configSubModule);

my $index = 0;
my @grabbed_1;
my @grabbed_2;

foreach my $file (@configFiles) {
    my $last = 0, $end = 0;
    $index++;

    open( CONFIGFILE, "<" . $file );
    while (<CONFIGFILE>) {
        if (/$configSubModule/) {
            while (<CONFIGFILE>) {
                if (/#---/) {
                    $end = 1, last if $last;
                    $last = 1;
                }
                else {
                    push @grabbed_1, $_ if $index == 1;
                    push @grabbed_2, $_ if $index == 2;
                }
            }
        }
        last if $end;
    }
}

    my @grabbed;
    my @XMLStrings;

    foreach my $i ( 1 .. $index ) {
    @grabbed = @grabbed_1 if $i == 1;
    @grabbed = @grabbed_2 if $i == 2;

    my $currentLine;

    my $previousLineSpaceLength = 0;
    my $currentLineSpaceLength  = 0;
    my $nextLineSpaceLength     = 0;

    my @exitTags;
    my $exitTag = '';

    my @tag;
    my $lineSplits;

    my $xmlString = '<?xml version="1.0" encoding="UTF-8"?>';
    foreach my $i ( 0 .. $#grabbed ) {

        $currentLine = $grabbed[$i];
        $currentLine =~ /^\s*/;
        $currentLineSpaceLength = $+[0];

        chomp($currentLine);
        $currentLine =~ s/^\s+//;
        $currentLine =~ s/\s+$//;

        #$currentLine =~ s/"//g;

        @tag = split ' ', $currentLine, 2;
        $lineSplits = scalar @tag;

        if ( $previousLineSpaceLength == 0 ) {

            $previousLineSpaceLength = $currentLineSpaceLength;

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

          #$xmlString               = $xmlString . '<' . $currentLine . '>';
          #$exitTag                 = '</' . $currentLine . '>';

        }
        elsif ($currentLineSpaceLength > $previousLineSpaceLength
            && $exitTag ne '' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            push @exitTags, $exitTag;

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

            #$xmlString = $xmlString . '<' . $currentLine . '>';
            #$exitTag   = '</' . $currentLine . '>';
        }
        elsif ($currentLineSpaceLength == $previousLineSpaceLength
            && $currentLine ne 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            if ( $exitTag ne 'exit' ) {
                $xmlString = $xmlString . $exitTag;

            }

            if ( $lineSplits == 1 ) {
                $xmlString = $xmlString . '<' . $currentLine . '>';
                $exitTag   = '</' . $currentLine . '>';
            }
            elsif ( $lineSplits == 2 ) {

                $nextLine = $grabbed[ $i + 1 ];
                $nextLine =~ /^\s*/;
                $nextLineSpaceLength = $+[0];

                if ( $nextLineSpaceLength > $currentLineSpaceLength ) {
                    $xmlString =
                      $xmlString . '<' . $tag[0] . '_' . $tag[1] . '>';
                    $exitTag = '</' . $tag[0] . '_' . $tag[1] . '>';

                }
                else {
                    $xmlString = $xmlString . '<' . $tag[0] . '>' . $tag[1];
                    $exitTag   = '</' . $tag[0] . '>';
                }

            }

            #$xmlString = $xmlString . '<' . $currentLine . '>';
            #$exitTag   = '</' . $currentLine . '>';

        }
        elsif ($currentLineSpaceLength == $previousLineSpaceLength
            && $currentLine eq 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            $xmlString               = $xmlString . $exitTag;
            $exitTag                 = $currentLine;

        }
        elsif ($currentLineSpaceLength < $previousLineSpaceLength
            && $currentLine eq 'exit' )
        {
            $previousLineSpaceLength = $currentLineSpaceLength;
            if ( $exitTag ne 'exit' ) {
                $xmlString = $xmlString . $exitTag;

            }
            $xmlString = $xmlString . pop @exitTags;
            $exitTag   = $currentLine;
        }

    }

    push @XMLStrings, $xmlString;

}

print "\n", 'Configuration :: ', $configSubModule, "\n";

foreach my $path ( @{ $configurationsHash{$configSubModule} } ) {

    $path =~ s/\s+/_/g;

    print $path;
    my $XMLIndex = 0;

    my @XMLOne;
    my @XMLTwo;

    foreach my $xml (@XMLStrings) {

        $XMLIndex++;
        $xp = XML::XPath->new($xml);

        if ( $xp->find($path) ) {
            my $nodeset  = $xp->find($path);
            my @nodeList = $nodeset->get_nodelist;

            foreach my $node (@nodeList) {
                my @childNodes = $node->getChildNodes;
                foreach my $childNode (@childNodes) {
                    print "\n", $childNode->getName;
                }
            }
        }
        else {
            print $path, ' =>  Not Available in file : ' . $XMLIndex, "\n";
        }
        print "\n";
    }

}

}

I need to compare two .cfg files and show the differences.

The paths to compare are taken from another file (the template).

System Configuration
#/system
#/system/persistence
/system/snmp/streaming
/system/time
#/system/time/ntp
#/system/time/sntp
/system/time/dst-zone NZDT
#/system/thresholds

System Security Configuration
#/system
#/system/security
#/system/security/management-access-filter
#/system/security/management-access-filter/ip-filter
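
The template layout above can be read into a section-to-paths map. A minimal sketch, assuming (as the full script does) that a line starting with a letter names a section, a line starting with / is an active path, and anything else, including the #-prefixed paths, is skipped. The name parse_template is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: parse template text into a list of section names
# and a hash mapping each section name to its active paths.
sub parse_template {
    my ($text) = @_;
    my ( @sections, %paths_for, $current );
    for my $line ( split /\n/, $text ) {
        $line =~ s/\s+$//;                      # drop trailing whitespace
        if ( $line =~ /^[[:alpha:]]/ ) {        # section header
            $current = $line;
            push @sections, $current;
            $paths_for{$current} = [];
        }
        elsif ( $line =~ m{^/} && defined $current ) {    # active path
            push @{ $paths_for{$current} }, $line;
        }
        # blank lines and '#'-prefixed (disabled) paths fall through
    }
    return ( \@sections, \%paths_for );
}

my $template = <<'END';
System Configuration
#/system
/system/snmp/streaming
/system/time
/system/time/dst-zone NZDT

System Security Configuration
#/system/security
END

my ( $sections, $paths_for ) = parse_template($template);
for my $section (@$sections) {
    print "$section\n";
    print "  $_\n" for @{ $paths_for->{$section} };
}
```

Keeping the section order in an array and the paths in a hash mirrors @configModules and %configurationsHash in the script, but avoids the global $configName.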

3 answers:

Answer 0: (score: 2)

I suggest using XML::Writer. It keeps track of the stack of open tags that are still waiting to be closed.

It would look like this

use strict;
use warnings;

use XML::Writer;

my $writer = XML::Writer->new( DATA_MODE => 1, DATA_INDENT => '  ');

$writer->xmlDecl('UTF-8');

while ( <DATA> ) {

  # Capture the line without leading or trailing whitespace; skip blank lines
  next unless /^\s*(\S(?:.*\S)?)/;

  my $tag = $1 =~ tr/ /_/r;

  if ( $tag eq 'exit' ) {
    $writer->endTag;
  }
  else {
    $writer->startTag($tag);
  }
}

$writer->end;

__DATA__
cpu-protection
        policy 1 create
        exit
        policy 254 create
        exit
        policy 255 create
        exit
    exit

Output

<?xml version="1.0" encoding="UTF-8"?>

<cpu-protection>
  <policy_1_create></policy_1_create>
  <policy_254_create></policy_254_create>
  <policy_255_create></policy_255_create>
</cpu-protection>

Answer 1: (score: 0)

Another option, if you can rely on the data being well formed

#!/usr/bin/perl -l
use warnings;
use strict;

my @tags = ();

print qq[<?xml version="1.0" encoding="UTF-8"?>\n];

while (<DATA>) {
    chomp;
    s/^\s+//;

    if (@tags and $_ eq 'exit') {
        print sprintf("%s</%s>", (" " x (scalar(@tags)-1)), pop @tags);
    }
    else {
        tr/a-zA-Z0-9-/_/cs;
        print sprintf("%s<%s>", (" " x scalar(@tags)), $_);
        push @tags, $_;
    }
}

__DATA__
cpu-protection
        policy 1 create
        exit
        policy 254 create
        exit
        policy 255 create
        exit
    exit

Output

$ perl script3.pl
<?xml version="1.0" encoding="UTF-8"?>

<cpu-protection>
 <policy_1_create>
 </policy_1_create>
 <policy_254_create>
 </policy_254_create>
 <policy_255_create>
 </policy_255_create>
</cpu-protection>
$ 
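
If JSON is acceptable instead of XML, as the question title allows, a stack of hash references can build a nested structure that the core JSON::PP module then serialises. This is only a sketch under assumptions: a line is treated as a container when the next line is indented deeper or is an exit at the same indentation, a repeated key (such as two server lines) becomes an array, and a bare flag such as shutdown becomes true. The names add_entry and cfg_to_data are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;

# Store $value under $key; a repeated key is turned into an array of values.
sub add_entry {
    my ( $node, $key, $value ) = @_;
    if ( !exists $node->{$key} ) {
        $node->{$key} = $value;
    }
    elsif ( ref( $node->{$key} ) eq 'ARRAY' ) {
        push @{ $node->{$key} }, $value;
    }
    else {
        $node->{$key} = [ $node->{$key}, $value ];
    }
    return;
}

# Parse config text into nested hashes (assumed classification rule:
# container if the next line is deeper, or is "exit" at the same depth).
sub cfg_to_data {
    my ($cfg) = @_;
    my @lines = grep { /\S/ } split /\n/, $cfg;
    my $root  = {};
    my @stack = ($root);    # innermost open container is $stack[-1]

    for my $i ( 0 .. $#lines ) {
        my ( $lead, $text ) = $lines[$i] =~ /^(\s*)(.*?)\s*$/;
        if ( $text eq 'exit' ) {
            pop @stack if @stack > 1;    # never pop the root
            next;
        }
        my ( $nlead, $ntext ) = ( $lines[ $i + 1 ] // '' ) =~ /^(\s*)(.*?)\s*$/;
        my $is_block = length($nlead) > length($lead)
          || ( $ntext eq 'exit' && length($nlead) == length($lead) );

        if ($is_block) {
            my $child = {};
            add_entry( $stack[-1], $text, $child );
            push @stack, $child;
        }
        else {
            my ( $key, $value ) = split ' ', $text, 2;
            add_entry( $stack[-1], $key, $value // JSON::PP::true() );
        }
    }
    return $root;
}

my $config = <<'END';
time
    ntp
        server 10.72.14.17
        server 10.74.14.26 prefer
        no shutdown
    exit
    zone NZST
exit
END

# canonical sorts the keys so that the output is stable between runs
print JSON::PP->new->canonical->pretty->encode( cfg_to_data($config) );
```

Unlike the XML variants, container names keep their spaces here ("dst-zone NZDT" is a valid JSON key), so no underscore substitution is needed.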

Answer 2: (score: 0)

Convert the data to XML in order to compare the files and show the differences.
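
For the comparison step itself, an intermediate XML document is not strictly required. A minimal sketch, assuming each config is flattened to slash-separated paths (one per non-exit line) and the two path sets are compared. A line is assumed to open a container when the next line is indented deeper or is an exit at the same indentation. The names cfg_paths and diff_paths are hypothetical.

```perl
use strict;
use warnings;

# Flatten config text into slash-separated paths, one per non-exit line.
sub cfg_paths {
    my ($cfg) = @_;
    my @lines = grep { /\S/ } split /\n/, $cfg;
    my ( @stack, @paths );
    for my $i ( 0 .. $#lines ) {
        my ( $lead, $text ) = $lines[$i] =~ /^(\s*)(.*?)\s*$/;
        if ( $text eq 'exit' ) { pop @stack; next }

        push @paths, join '/', '', @stack, $text;

        # Look one line ahead: does this line open a container?
        my ( $nlead, $ntext ) = ( $lines[ $i + 1 ] // '' ) =~ /^(\s*)(.*?)\s*$/;
        push @stack, $text
          if length($nlead) > length($lead)
          || ( $ntext eq 'exit' && length($nlead) == length($lead) );
    }
    return @paths;
}

# Return the paths found only in the first config and only in the second.
sub diff_paths {
    my ( $first, $second ) = @_;
    my %in_first  = map { $_ => 1 } cfg_paths($first);
    my %in_second = map { $_ => 1 } cfg_paths($second);
    my @only_first  = grep { !$in_second{$_} } sort keys %in_first;
    my @only_second = grep { !$in_first{$_} } sort keys %in_second;
    return ( \@only_first, \@only_second );
}

my $cfg_one = <<'END';
system
    name "A"
    time
        zone NZST
    exit
exit
END

my $cfg_two = <<'END';
system
    name "B"
    time
        zone NZST
        dst-zone NZDT
            start last sunday september 02:00
        exit
    exit
exit
END

my ( $only_first, $only_second ) = diff_paths( $cfg_one, $cfg_two );
print "- $_\n" for @$only_first;
print "+ $_\n" for @$only_second;
# prints:
# - /system/name "A"
# + /system/name "B"
# + /system/time/dst-zone NZDT
# + /system/time/dst-zone NZDT/start last sunday september 02:00
```

Because the value is part of the path, a changed value shows up as one removed and one added line. Filtering the result by the template paths would be a simple prefix match on these strings.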
