Find a pattern and get the specific content without looping with 'while'

Date: 2017-12-01 10:00:13

Tags: perl

I have *TXT files (roughly 100 MB), and I need to match the keyword [Main_footnn] in the TXT file.

INPUT:

[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
 psychosocial support.[foot2],[foot7] Interpersonal skills build
 on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient
 to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
 skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
 psychosocial support.[foot12],[foot17] Interpersonal skills build
 on this basic communication skill.[foot2,90]
[END_PATIENTS]
  

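Note: the input above may not contain line breaks. The whole content may be on a single line, or there may be only a single newline between one patient block and the next.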

CODE:

while($content=~m/\[START\_PATIENTS\]((?:(?!\[END\_PATIENTS\]).)*)\[END\_PATIENTS\]/gs)
{
    my $fulcnt = $&; my $cont = $1;
    if($cont=~m/(\[MAIN\_foot\d+\])/i)
    {
        print "$fulcnt\n";
    }
}

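I want to find \[MAIN_foot\d+\] and get only that particular patient's content, instead of walking through the file and separating out every patient block.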

  

E.g. output: [START_PATIENTS] ... [MAIN_foot\d+] ... [END_PATIENTS] is the block I need to get.

OUTPUT:

[START_PATIENTS]
Basic communication skills in isolation are insufficient
 to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
 skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]

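2 answers:

Answer 0 (score: 1)

You can use paragraph mode. Setting $/ to the empty string makes each read return one block of text delimited by blank lines: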

use warnings;
use strict;

local $/ = "";

while(<DATA>)
{
    next unless (/(\[MAIN\_foot\d+\])/i);
    print ;
}


__DATA__
[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
 psychosocial support.[foot2],[foot7] Interpersonal skills build
 on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient
 to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
 skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
 psychosocial support.[foot12],[foot17] Interpersonal skills build
 on this basic communication skill.[foot2,90]
[END_PATIENTS]

If the content is already in a scalar:

foreach (split(/\[END_PATIENTS\]\n+\K/,$string))
{
    next unless (/(\[MAIN\_foot\d+\])/i);
    print ;

}
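
Paragraph mode relies on blank lines between the blocks, and the split pattern above relies on at least one newline after [END_PATIENTS]. The question notes that the input may have neither. A minimal sketch of one way around that, which is not from the original answer, is to set $/ to the end tag itself so that each read returns one patient block without slurping the whole file. The helper name main_foot_blocks and the sample data are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every patient block read from the filehandle
# that contains a [MAIN_footNN] marker.
sub main_foot_blocks {
    my ($fh) = @_;
    local $/ = "[END_PATIENTS]";    # each read ends at (and includes) the end tag
    my @hits;
    while (my $record = <$fh>) {
        # Ignore leftovers that do not hold a complete block (e.g. a final newline).
        next unless $record =~ /(\[START_PATIENTS\].*\[END_PATIENTS\])/s;
        my $block = $1;
        push @hits, $block if $block =~ /\[MAIN_foot\d+\]/i;
    }
    return @hits;
}

# Demo on an in-memory filehandle; the first two blocks share one line on purpose.
my $sample = "[START_PATIENTS]a.[foot2][END_PATIENTS]"
           . "[START_PATIENTS]b.[MAIN_foot2],[foot7][END_PATIENTS]\n"
           . "[START_PATIENTS]c.[foot12][END_PATIENTS]\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for main_foot_blocks($fh);
close $fh;
```

For a real file, open it with open my $fh, '<', $path and pass that handle instead. Only one block is held in memory at a time, which may matter for a 100 MB input.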

Answer 1 (score: 1)

You can also use Perl's range operator:

my @set;
my $wanted = 0;
while (<$fh>) {
    my $match = m/^\[START_PATIENTS\]$/ ... m/^\[END_PATIENTS\]$/;
    # print $match here to see what it contains
    if ($match) {
        push @set, $_;
        if (m/\[MAIN_foot\d+\]/) {
            $wanted = 1;
        }
        if ($match =~ m/E0$/) {
            # got end mark
            if ($wanted) {
                print for @set;
            }
            $wanted = 0;
            @set = ();
        }
    }
}

See perldoc perlop, section "Range Operators".
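
The test against E0 in the loop works because, in scalar context, the range operator returns the empty string outside a range, a sequence number starting at 1 inside it, and that number with "E0" appended on the line that closes the range. A small sketch to make those values visible follows; the helper name range_values and the sample lines are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: collects the range operator's return value for each line.
sub range_values {
    my @lines = @_;
    my @values;
    for (@lines) {
        # Scalar context turns "..." into the flip-flop form of the operator.
        my $match = /^\[START_PATIENTS\]$/ ... /^\[END_PATIENTS\]$/;
        push @values, $match;
    }
    return @values;
}

my @lines = (
    "junk before",
    "[START_PATIENTS]",
    "support.[MAIN_foot2],[foot7]",
    "[END_PATIENTS]",
    "junk after",
);

# Quote each value so the empty strings are visible in the output.
print join(",", map { "'$_'" } range_values(@lines)), "\n";
```

Lines outside a block produce an empty string, so the if ($match) test in the answer skips them, and the closing line is the only one whose value ends in E0.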