I have a *.TXT file (roughly 100 MB), and I need to match the keyword
[Main_footnn] (nn being one or more digits) in that TXT file.
INPUT:
[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
psychosocial support.[foot2],[foot7] Interpersonal skills build
on this basic communication skill.[foot2,4]
[END_PATIENTS]
[START_PATIENTS]
Basic communication skills in isolation are insufficient
to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]
[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
psychosocial support.[foot12],[foot17] Interpersonal skills build
on this basic communication skill.[foot2,90]
[END_PATIENTS]
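Note: the input above may contain no line breaks at all; the complete content from one patient block to the next may be on a single line or separated by a single line break.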
CODE:
while ($content =~ m/\[START\_PATIENTS\]((?:(?!\[END\_PATIENTS\]).)*)\[END\_PATIENTS\]/gs)
{
    my $fulcnt = $&; my $cont = $1;
    if ($cont =~ m/(\[MAIN\_foot\d+\])/i)
    {
        print "$fulcnt\n";
    }
}
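I want to find [MAIN_foot\d+] and get only the patient block that contains it, rather than separating out every patient block across the whole file.
E.g. the output I need here is: [START_PATIENTS] ... [MAIN_foot\d+] ... [END_PATIENTS]
OUTPUT: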
[START_PATIENTS]
Basic communication skills in isolation are insufficient
to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]
Answer 0 (score: 1)
You can use paragraph mode, which reads the input one blank-line-separated record at a time:
use warnings;
use strict;
local $/ = "";    # paragraph mode: records are separated by blank lines
while (<DATA>)
{
    next unless /\[MAIN_foot\d+\]/i;    # skip blocks without the marker
    print;
}
__DATA__
[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
psychosocial support.[foot2],[foot7] Interpersonal skills build
on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient
to create psychosocial support.[MAIN_foot2],[foot7] Interpersonal
skills build on this basic communication skill.[foot2,4]
[END_PATIENTS]

[START_PATIENTS]
Basic communication skills in isolation are insufficient to create
psychosocial support.[foot12],[foot17] Interpersonal skills build
on this basic communication skill.[foot2,90]
[END_PATIENTS]
If the content is already in a scalar (here $string), split it after each end tag instead:
foreach (split(/\[END_PATIENTS\]\n+\K/, $string))
{
    next unless /\[MAIN_foot\d+\]/i;
    print;
}
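Since the question notes that the blocks may not be separated by line breaks at all, another option is to set the input record separator to the closing tag itself, so each read returns one block and the 100 MB file is never held in memory at once. This is a different technique from paragraph mode, offered only as a minimal sketch: the helper name main_foot_records and the sample data are hypothetical, and it assumes blocks are never nested.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every [START_PATIENTS]...[END_PATIENTS]
# record read from $fh that contains a [MAIN_footN] marker.
sub main_foot_records {
    my ($fh) = @_;
    local $/ = "[END_PATIENTS]";    # one read = text up to and including the end tag
    my @hits;
    while (my $rec = <$fh>) {
        # Drop anything before the start tag (such as a newline left over
        # from the previous record); skip trailing text with no start tag.
        next unless $rec =~ s/\A.*?(?=\[START_PATIENTS\])//s;
        push @hits, $rec if $rec =~ /\[MAIN_foot\d+\]/i;
    }
    return @hits;
}

# Demo on an in-memory filehandle; the first two records share one line on purpose.
my $sample = "[START_PATIENTS]a.[foot2][END_PATIENTS]"
           . "[START_PATIENTS]b.[MAIN_foot2],[foot7][END_PATIENTS]\n"
           . "[START_PATIENTS]c.[foot12][END_PATIENTS]\n";
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
my @found = main_foot_records($in);
close $in;
print "$_\n" for @found;
```

For a real file, open it with open my $in, '<', $path and pass that handle in; only the current record is held in memory, plus whatever matches are collected.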
Answer 1 (score: 1)
You can also use Perl's range operator:
# $fh is assumed to be a filehandle already opened on the input file
my @set;
my $wanted = 0;
while (<$fh>) {
    my $match = m/^\[START_PATIENTS\]$/ ... m/^\[END_PATIENTS\]$/;
    # print $match here to see what it contains
    if ($match) {
        push @set, $_;
        if (m/\[MAIN_foot\d+\]/) {
            $wanted = 1;
        }
        if ($match =~ m/E0$/) {
            # got end mark
            if ($wanted) {
                print for @set;
            }
            $wanted = 0;
            @set = ();
        }
    }
}
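To see why the test against E0 detects the end mark, it can help to record what the range operator returns for each line. In scalar context it yields a sequence number that starts at 1 on the line matching the left operand, and the number for the last line of the range has "E0" appended. The sketch below uses made-up sample data and an in-memory filehandle; like the code above, it assumes the start and end tags sit on lines of their own.

```perl
use strict;
use warnings;

# Made-up sample data: two blocks, tags on their own lines.
my $sample = <<'EOT';
[START_PATIENTS]
text [foot2]
[END_PATIENTS]
[START_PATIENTS]
text [MAIN_foot2],[foot7]
[END_PATIENTS]
EOT

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @states;
while (<$fh>) {
    # Same range expression as in the answer; collect its value per line.
    my $match = m/^\[START_PATIENTS\]$/ ... m/^\[END_PATIENTS\]$/;
    push @states, $match;
}
close $fh;
print join(',', @states), "\n";    # prints 1,2,3E0,1,2,3E0
```

Lines outside any range would yield an empty string, which is why the outer if ($match) test skips them.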