Question

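I am a beginner in Perl. I have a text file whose contents look like the sample below. I need to extract the value from VALUE="<NEEDED VALUE>". For example, for SPINACH I should get just SALAD.

How can I use a Perl regex to get the value? I need to parse multiple lines to get it, i.e. the lines between each #ifonly --- #endifonly pair.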

while (<$file>)
{
   if (m/#ifonly .+ SPINACH .+ VALUE=(")([\w]*)(") .+ #endifonly/g)
   {
      my $chosen = $2;
   }
}

$ cat check.txt

#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes" 
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes" 
as df fg
#endifonly

Answer 1

use strict;
use warnings;
use 5.010;

while (<DATA>) {
   my $rc = /#ifonly .+ SPINACH/ .. (my ($value) = /VALUE="([^"]*)"/);
   next unless $rc =~ /E0$/;
   say $value;
}

__DATA__
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes" 
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes" 
as df fg
#endifonly

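This uses a little trick described by brian d foy. As he explains, it relies on the range operator in scalar context, better known as the flip-flop operator. While a range is open the operator returns a sequence number, and on the line that closes the range that number has "E0" appended, which is what the /E0$/ test picks out.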
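To make the sequence numbers visible, here is a minimal sketch that records what the flip-flop returns for each line. The helper name flipflop_trace and the simplified patterns are assumptions made for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the flip-flop's value for every line,
# so the sequence numbers and the "E0" suffix can be inspected.
sub flipflop_trace {
    my @lines = @_;
    my @trace;
    for (@lines) {
        # Scalar assignment puts '..' in scalar context (flip-flop mode).
        my $rc = /^#ifonly/ .. /^VALUE=/;
        push @trace, $rc;
    }
    return @trace;
}

my @sample = (
    '#ifonly APPLE CARROT SPINACH',
    'VALUE="SALAD" REQUIRED="yes"',
    'QW RETEWRT OIOUR',
    '#endifonly',
);

my @trace = flipflop_trace(@sample);
printf "%-30s => '%s'\n", $sample[$_], $trace[$_] for 0 .. $#sample;
```

The first line opens the range and yields 1, the VALUE line closes it and yields 2E0, and the remaining lines yield the empty string because no range is open.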

Answer 2

If your file is very large (or you want to read it line by line for some other reason), you can do it like this:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my ($file, $keyword);

# now get command line options (see Usage note below)
GetOptions(
            "f=s" => \$file,
            "k=s" => \$keyword,
          );

# if either the file or the keyword has not been provided, display a
# help text and exit
if (! $file || ! $keyword) {
   print STDERR<<EOF;

   Usage: script.pl -f filename -k keyword

EOF
   exit(1);
}

my $found;         # indicator that the keyword has been found
my $returned_word; # will store the word you want to retrieve

open my $fh, '<', $file or die "Cannot open file '$file': $!";
while (<$fh>) {
   if (/$keyword/) {
      $found = 1;
   }

   # the following condition is true for all lines from one that
   # starts with '#ifonly' to the next that starts with '#endifonly'
   # - but only if the keyword has been found! The parentheses matter:
   # '&&' binds tighter than '..', so without them $found would only
   # be tested as part of the end condition.
   if ((/^#ifonly/ .. /^#endifonly/) && $found) {
      if (/VALUE="(\w+)"/) {
         $returned_word = $1;
         print "looking for $keyword --> found $returned_word\n";

         last; # if you want the values from ALL blocks that contain
               # the keyword, remove the 'last' statement, as it makes
               # the script exit the while loop
      }
   }

   # forget the keyword once its block has ended
   $found = 0 if /^#endifonly/;
}
close $fh;

Answer 3

You can read the file contents into a string and then search that string for the pattern:

my $file;    
$file.=$_ while(<>);    
if($file =~ /#ifonly.+?\bSPINACH\b.+?VALUE="(\w*)".+?#endifonly/s) {
        print $1;
}

Your original regex needs a few tweaks:

You need to make the quantifiers non-greedy.
Use the s modifier so that . matches newlines.
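The effect of both tweaks can be checked with a small sketch. It runs the same pattern three ways against the sample data from the question; the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $text = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

# Greedy quantifiers with /s: .+ grabs as much as possible, so the
# capture ends up on the LAST VALUE in the whole string.
my ($greedy) = $text =~ /#ifonly.+\bSPINACH\b.+VALUE="(\w*)".+#endifonly/s;

# Non-greedy quantifiers with /s: .+? stops at the FIRST VALUE after SPINACH.
my ($lazy) = $text =~ /#ifonly.+?\bSPINACH\b.+?VALUE="(\w*)".+?#endifonly/s;

# Non-greedy but without /s: . cannot cross a newline, so nothing matches.
my ($no_s) = $text =~ /#ifonly.+?\bSPINACH\b.+?VALUE="(\w*)".+?#endifonly/;

print "greedy:     $greedy\n";
print "non-greedy: $lazy\n";
print "without /s: ", (defined $no_s ? $no_s : '(no match)'), "\n";
```

On this data the greedy pattern captures JUICE from the wrong block, the non-greedy pattern captures SALAD, and the pattern without the s modifier does not match at all.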


Answer 4

Here is another answer based on the flip-flop operator:

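use strict;
use warnings;
use 5.010;

# check.txt is the sample file from the question
open my $file, '<', 'check.txt' or die "Cannot open check.txt: $!";

while (<$file>)
{
  if ( (/^#ifonly.*\bSPINACH\b/ .. /^#endifonly/) &&
       (my ($chosen) = /^VALUE="(\w+)"/) )
  {
    say $chosen;
  }
}

close $file;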

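This solution applies the second test to every line in the range. The trick @Hugmeir used to exclude the start and end lines is not needed, because the "inner" regex /^VALUE="(\w+)"/ can never match them anyway. (I added the ^ anchor to all the regexes to be doubly sure.)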
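As a sketch of how the same test could be reused for any keyword, the loop can be wrapped in a sub that collects every matching value. The name values_for, the use of \Q...\E to quote the keyword, and feeding the lines from an array instead of a filehandle are all assumptions made for illustration. The flip-flop keeps its state between calls, so this assumes every #ifonly block in the input is closed by an #endifonly.

```perl
use strict;
use warnings;

# Hypothetical helper: return the VALUE of every block whose
# '#ifonly' line mentions $keyword as a whole word.
sub values_for {
    my ($keyword, @lines) = @_;
    my @values;
    for (@lines) {
        # The range test runs on every line; the VALUE test then
        # filters out the '#ifonly' and '#endifonly' lines by itself.
        if ( (/^#ifonly.*\b\Q$keyword\E\b/ .. /^#endifonly/)
             && (my ($chosen) = /^VALUE="(\w+)"/) )
        {
            push @values, $chosen;
        }
    }
    return @values;
}

my @lines = (
    '#ifonly APPLE CARROT SPINACH',
    'VALUE="SALAD" REQUIRED="yes"',
    'QW RETEWRT OIOUR',
    '#endifonly',
    '#ifonly APPLE MANGO ORANGE CARROT',
    'VALUE="JUICE" REQUIRED="yes"',
    'as df fg',
    '#endifonly',
);

print "SPINACH: @{[ values_for('SPINACH', @lines) ]}\n";
print "CARROT:  @{[ values_for('CARROT',  @lines) ]}\n";
```

With the sample data, SPINACH yields only SALAD, while CARROT appears in both blocks and yields SALAD and JUICE.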

Answer 5

These two lines from an answer posted two days ago

my $file;
$file.=$_ while(<>);

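are not efficient. Perl will probably read the file in large chunks, split those chunks into lines of text for <>, and then .= joins the lines back together into one big string. Slurping the file would be more efficient. The basic idiom is to change $/, the input record separator.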

undef $/;
$file = <>;

The module File::Slurp (see perldoc File::Slurp) might be even better.
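One caveat with a bare undef $/ is that it changes the separator for the rest of the program. A common variation, sketched below, localizes the change so it is undone automatically. The helper name slurp_handle and the in-memory filehandle standing in for a real file are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from an open filehandle at once.
sub slurp_handle {
    my ($fh) = @_;
    local $/;              # undefined only inside this sub, restored on return
    my $content = <$fh>;   # with $/ undefined, one read returns the whole file
    return $content;
}

# An in-memory filehandle stands in for a real file in this sketch.
my $data = qq{#ifonly APPLE CARROT SPINACH\nVALUE="SALAD" REQUIRED="yes"\n#endifonly\n};
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $content = slurp_handle($fh);
close $fh;

# The slurped string can then be searched with a multi-line regex.
my ($value) = $content =~ /#ifonly[^\n]*\bSPINACH\b.+?VALUE="(\w*)"/s;
print "$value\n";
```

Because $/ is restored when the sub returns, any later line-by-line reads in the same program keep working as usual.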

Parsing multiple lines with a Perl regex and extracting a value
