How to search and replace an XML snippet using Perl code

Time: 2013-12-10 08:55:41

Tags: xml perl

I have this XML snippet in a file, and I want to comment it out using Perl.

<plugin id="InvalidObjectsCheck"
        description="Verifying No Database Invalid Objects Exist"
        invoke=""
        plugin.class="oracle.check.apps.InvalidObjectsCheckPlugin"
        class.path="$HC_LOCATION/lib/precheckplugin.jar;
                    $HC_LOCATION/lib/hccommon.jar;
                    $APPLICATIONS_BASE/fusionapps/applications/lcm/ad/java/adjava.jar"
        stoponerror="false"/>

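Unfortunately, the closing part of the tag (stoponerror="false"/>) is the same for the other plugin entries too. I have written a subroutine for this, but I don't know how to proceed once I have read the entire contents of the file.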

sub skip_lp_hc($skiplphc)
    {
        $hcfile = "${sharedStorageLoc}${fs}${podName}${fs}${releaseTo}${fs}fusionapps${fs}applications${fs}lcm${fs}hc${fs}config${fs}ga${fs}GeneralSystemHealthChecks.xml";
        if(! -e "$hcfile")
        {
            print "[ERROR]: Unable to locate the manifest file : GeneralSystemHealthChecks.xml\n";
            print "[INFO] : Located at ${sharedStorageLoc}${fs}${podName}${fs}${releaseTo}${fs}fusionapps${fs}applications${fs}lcm${fs}hc${fs}config${fs}ga\n";
        }
        if($skiplphc eq true)
        {
            print "[SUCCESS] : Able to access GeneralSystemHealthChecks.xml\n";
            print "[INFO] : ${sharedStorageLoc}${fs}${podName}${fs}${releaseTo}${fs}fusionapps${fs}applications${fs}lcm${fs}hc${fs}config${fs}ga\n";
            open (IN, "<$hcfile") or die "Cannot open the file : $hcfile\n";
            while (my $my_line = <IN> )
                {
                        chomp $my_line;
                        $my_line =~ s/^\s+//;
                        $my_line =~ s/\s+$//;
                        if ($my_line =~ m/<plugin id="InvalidObjectsCheck"\=(.*)/)
                        {
                          print "[INFO] : Found a match for <plugin id=InvalidObjectsCheck            after accessing GeneralSystemHealthChecks.xml\n";
                        }
                }
                close (IN);
        }
    }

2 answers:

Answer 0: (score: 1)

I'll let you deal with the crazy file name, and offer you a simple XML::Twig-based solution that keeps you out of parsing-XML-with-regexps hell:

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

XML::Twig->new( twig_roots => { plugin => \&plugin }, # process only plugin
                twig_print_outside_roots => 1,        # output rest unchanged
                pretty_print => 'cvs',                # closest style to input
              )
         ->parsefile_inplace( "my.xml");

# called for each plugin element
sub plugin
  { my( $t, $plugin)=@_;
    my $text= $plugin->sprint;
    $text=~ s{;   }{;\n}g;      # slightly massage the output to get proper format
    print "<!--$text\n  -->";   # output element commented out
  }

This way you don't need to worry about the specifics of the original XML, and you don't even need to read the entire file into memory.

Answer 1: (score: -1)

${sharedStorageLoc} is shell syntax; please use $sharedStorageLoc in a Perl file.

if($skiplphc eq true) is shell syntax; use if($skiplphc) instead.

sub skip_lp_hc($skiplphc) is not good Perl syntax either.

I think your real problem is your regex: it has an extra = and so it does not match.
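To make the point about the stray = concrete, here is a minimal sketch that runs both patterns against the first line of the element from the question. The variable names ($line, $original, $relaxed) are made up for this illustration and are not part of the original script.

```perl
use strict;
use warnings;

# First line of the element, as it appears in the question's XML.
my $line = '<plugin id="InvalidObjectsCheck"';

# Pattern from the question: it demands a literal "=" right after the closing quote.
my $original = qr/<plugin id="InvalidObjectsCheck"\=(.*)/;

# Relaxed pattern: no stray "=", flexible whitespace, either quote style.
my $relaxed = qr/<plugin\s+id=['"]InvalidObjectsCheck['"]/;

printf "original: %s\n", $line =~ $original ? 'match' : 'no match';
printf "relaxed:  %s\n", $line =~ $relaxed  ? 'match' : 'no match';
```

Since nothing in the XML follows the id attribute with an = sign, the first pattern can never match that line, while the second one does.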

I rewrote it a bit:

sub skip_lp_hc {
  my $skiplphc = shift;
  # Join the path pieces with $fs; fusionapps, applications, lcm, hc, config and ga
  # are literal directory names, not variables.
  my $hcdir  = join $fs, $sharedStorageLoc, $podName, $releaseTo, qw(fusionapps applications lcm hc config ga);
  my $hcfile = join $fs, $hcdir, "GeneralSystemHealthChecks.xml";
  if(! -e $hcfile) {
     print "[ERROR]: Unable to locate the manifest file : GeneralSystemHealthChecks.xml\n";
     print "[INFO] : Located at $hcdir\n";
  }
  if($skiplphc)  {
     print "[SUCCESS] : Able to access GeneralSystemHealthChecks.xml\n";
     print "[INFO] : $hcdir\n";
     open (IN, "<", $hcfile) or die "Cannot open the file : $hcfile, $!\n";
     while (my $my_line = <IN> )  {
         chomp $my_line;
         # Trim leading and trailing whitespace before matching
         $my_line =~ s/^\s+//s;
         $my_line =~ s/\s+$//s;
         if ($my_line =~ m/<plugin\s+id=['"]InvalidObjectsCheck/i){
             print "[INFO] : Found a match for <plugin id=InvalidObjectsCheck after accessing GeneralSystemHealthChecks.xml\n";
         }
     }
     close (IN);
  }
}
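
The rewrite above only reports the match. One possible way to go from there to actually commenting the element out, staying with the line-by-line approach, is sketched below. comment_out_plugin is a hypothetical helper, not part of the original script, and it assumes that the element opens at the start of a line and that the first "/>" after the start tag closes it. It is a sketch under those assumptions, not a general XML solution.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of the file and returns them with the
# InvalidObjectsCheck plugin element wrapped in an XML comment.
# Assumption: the first "/>" after the start tag ends the element.
sub comment_out_plugin {
    my @lines  = @_;
    my $inside = 0;    # true while we are between the start tag and its "/>"
    my @out;
    for my $line (@lines) {
        if (!$inside && $line =~ /^\s*<plugin\s+id=['"]InvalidObjectsCheck['"]/) {
            $inside = 1;
            $line =~ s/<plugin/<!-- <plugin/;    # open the comment
        }
        if ($inside && $line =~ s{/>}{/> -->}) { # close the comment at the element's end
            $inside = 0;
        }
        push @out, $line;
    }
    return @out;
}

# Small in-memory sample: only the second plugin should be commented out.
my @input = (
    qq{<plugin id="Other"\n},
    qq{        stoponerror="false"/>\n},
    qq{<plugin id="InvalidObjectsCheck"\n},
    qq{        invoke=""\n},
    qq{        stoponerror="false"/>\n},
);
print comment_out_plugin(@input);
```

Because the state flag is only set after the start tag has matched, the identical stoponerror="false"/> endings of other plugin entries are left alone. The result is only valid XML if the element contains no "--" sequence, so the XML::Twig approach from the other answer remains the more robust choice.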