Question

My text file contains many entries:

[...]
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
State: 2,102,83,476,224
[...]

From the above I want to extract:

Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd

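It would be simple if EnterValues: followed every Solution:, but unfortunately it does not. Sometimes the next entry is Speed:, sometimes it is something else. I don't know how to construct the end of the regex (I assume it should be something like this: Solution:.*?(?<!~)\n).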

In my file, \n is the line separator.

Answer 1

As far as I can see, you first read the whole file into memory, which is not good practice. Try the flip-flop operator instead:

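while ( <$fh> ) {
   # true from the "Solution:" line through the first line that does not end in "~"
   if ( /Solution:/ ... !/~$/ ) {
      print;   # $_ already ends with "\n", so no extra newline is needed
   }
}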

I can't test it right now, but I think this should work.
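For anyone who wants to try the idea without a file on disk, here is a minimal sketch of the same flip-flop approach. The sample text, the in-memory filehandle, and the helper name `extract_solution` are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines from "Solution:" through the
# first following line that does not end with "~".
sub extract_solution {
    my ($fh) = @_;
    my @block;
    while ( my $line = <$fh> ) {
        # Flip-flop: turns on at a "Solution:" line, turns off after the
        # first later line that lacks a trailing "~".
        if ( $line =~ /^Solution:/ ... $line !~ /~$/ ) {
            push @block, $line;
        }
    }
    return @block;
}

# Sample data taken from the question, read through an in-memory filehandle.
my $sample = <<'END';
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
END

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print extract_solution($fh);
close $fh;
```

This relies on every line of the block except the last ending in "~", as in the sample data. If a file could end in the middle of a block, the flip-flop would still be "on" the next time the helper is called, so treat this as a starting point rather than a finished parser.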

Answer 2

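What you need is a "record separator" that can be a regex. Unfortunately, you cannot use $/ for that, because it cannot be a regex. However, you can read the whole file into a single string and split that string with a regex: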

use strict;
use warnings;
use Data::Dumper;

my $str = do { 
    local $/;   # disable input record separator
    <DATA>;     # slurp the file
};
my @lines = split /^(?=\pL+:)/m, $str;  # lines begin with letters + colon
print Dumper \@lines;

__DATA__
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
State: 2,102,83,476,224

Output:

$VAR1 = [
          'Wind: 83,476,224
',
          'Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
',
          'EnterValues: 656,136,1
',
          'Speed: 48,32
',
          'State: 2,102,83,476,224
'
        ];

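I assume you will do some kind of post-processing on these values, but I will leave that to you. One way to go from here is to split the values on newlines.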
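As one possible way to do that post-processing, the sketch below picks the record that starts with "Solution:" out of the split result and breaks it into individual lines. The helper name `solution_lines` and the sample string are assumptions made for illustration, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text into records, keep the one that
# starts with "Solution:", and return its lines without newlines.
sub solution_lines {
    my ($str) = @_;
    my @records = split /^(?=\pL+:)/m, $str;    # same split as above
    my ($solution) = grep { /^Solution:/ } @records;
    return () unless defined $solution;
    return split /\n/, $solution;
}

# Sample data taken from the question.
my $str = <<'END';
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
END

print "$_\n" for solution_lines($str);
```

If the file can contain more than one Solution: record, the `grep` could be kept in list context to return all of them instead of only the first.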

Answer 3

You can match from Solution: up to the next word that is followed by a colon:

my ($solution) = $text =~ /(Solution:.*?) \w+: /xs;
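For this to work, `$text` has to hold the whole file as a single string. Below is a minimal sketch under that assumption; the sample string and the helper name `solution_block` are mine. It also differs slightly from the pattern above: the closing label is anchored to the start of a line with `^` and `/m`, and `\z` is accepted as an alternative end, so that a Solution: block at the very end of the file is still captured.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything from "Solution:" up to (not
# including) the next line that starts with a word followed by a colon,
# or up to the end of the string if no such line follows.
sub solution_block {
    my ($text) = @_;
    my ($solution) = $text =~ /^(Solution:.*?)(?=^\w+:|\z)/ms;
    return $solution;    # undef when there is no Solution: line
}

# Sample data taken from the question.
my $text = <<'END';
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
END

my $block = solution_block($text);
print defined $block ? $block : "no Solution block found\n";
```

The `/s` modifier lets `.` cross newlines, and the lazy `.*?` stops at the first label it finds, so data lines such as "sdasd,cccc,~" are kept because they contain no colon after the leading word.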
