In my Perl script I want to look for potential regex matches in certain files across many directories.
I have a hash
my %qc = ("QCNM Daily QC" => "GUN",
"Intrinsic Flood QA" => "PUN");
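which will grow considerably. In the directory $STUDY_DIR I want to go through all the image header files (image1.hd, image2.hd, etc.) and check whether any of the hash keys occur in their text. The image header files are just plain text files. For example, I want to search image1.hd for the text "QCNM Daily QC" or "Intrinsic Flood QA". If "QCNM Daily QC" is present, I want to set $study_type = "GUN"; likewise, if "Intrinsic Flood QA" matches, I want to set $study_type = "PUN". If no match is found, I want to move on to the next image file.
This is my code so far: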
# Loop through all images
for ( my $i = 1; $i <= $num_images; $i++ ) {
    # Check image is of type described in %qc
    # We are only interested in manipulating these files
    my $match = 0;     # matched qc key to image header
    my $study_type;    # key value for when hash key is found in image header (eg PUN)
    # reset the internal iterator so a prior each() doesn't affect the loop
    keys %qc;
    while ( my ( $k, $v ) = each %qc ) {
        my @match = grep {/$k/} glob("$STUDY_DIR/image${i}.hd");
        # PSEUDOCODE: set $match = 1 if a match is found, then break out of the loop
    }
    next if $match == 0;    # Not a QC image we are interested in, skip to next image
}
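What I am struggling with is iterating over each hash key and checking whether that text exists in image.hd. If it does, I want to set $match = 1 and $study_type = $qc{key} and exit the loop. If it does not, I want to move on to the next potential match. The hash keys are mutually exclusive: a text file may contain none of the keys, but at most one key can occur in any given file.
The solution must be in Perl, because I have some additional Perl commands to run on the matching files.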
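Answer 0 (score: 2)
You need to actually load the file contents to test whether they contain a given string.
I suggest building a regular expression from the hash keys to compare against.
The following prints the first matching value in each file and then moves on to the next file. Note that I use natsort from Sort::Key::Natural to process the files in natural order, but that is just a style preference.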
use strict;
use warnings;
use autodie;
use Sort::Key::Natural qw(natsort);
my $STUDY_DIR = '...';
my %qc = (
    "QCNM Daily QC"      => "GUN",
    "Intrinsic Flood QA" => "PUN"
);
# Alternation of all keys, escaped with quotemeta, longest key first
my $qc_re = '(?:' . join('|', map quotemeta, sort {length $b <=> length $a} keys %qc) . ')';
FILE:
for my $file ( natsort glob("$STUDY_DIR/image*.hd") ) {
    open my $fh, '<', $file;    # autodie reports a failed open
    while (<$fh>) {
        if (/($qc_re)/) {
            print "$qc{$1} - $file\n";
            next FILE;          # first match found, go to the next file
        }
    }
}
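To show why the keys are escaped with quotemeta and sorted longest-first before being joined, here is a minimal sketch that applies the same construction to in-memory strings instead of files. The "QCNM Daily" => "XXX" entry, the helper name study_type_for_line, and the sample lines are all hypothetical, added only to illustrate the case where one key is a prefix of another.

```perl
use strict;
use warnings;

# Hypothetical table: the "QCNM Daily" entry is invented to show a key
# that is a prefix of another key.
my %qc = (
    "QCNM Daily QC"      => "GUN",
    "QCNM Daily"         => "XXX",
    "Intrinsic Flood QA" => "PUN",
);

# Longest keys first, so the alternation tries "QCNM Daily QC" before
# "QCNM Daily"; quotemeta escapes any regex metacharacters in the keys.
my $qc_re = '(?:' . join( '|', map quotemeta, sort { length $b <=> length $a } keys %qc ) . ')';

# Return the study type for a line of text, or undef if no key occurs in it.
sub study_type_for_line {
    my ($line) = @_;
    return $line =~ /($qc_re)/ ? $qc{$1} : undef;
}

# Sample lines are made up for illustration.
for my $line ( "Study: QCNM Daily QC", "Study: Intrinsic Flood QA", "Study: Bone Scan" ) {
    my $type = study_type_for_line($line);
    print defined $type ? "$type - $line\n" : "no match - $line\n";
}
```

Without the length sort, the order of the alternatives would depend on hash ordering, and the shorter "QCNM Daily" could win over "QCNM Daily QC" on the same line.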
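Answer 1 (score: 1)
The following solution, although somewhat clunky, seems to work. I am sure a competent Perl programmer could halve the number of lines.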
# Loop through all images
for ( my $i = 1; $i <= $num_images; $i++ ) {
    # Check image is of type described in %qc
    # We are only interested in moving these files to QC filestore
    my $match        = 0;     # matched qc key to image header
    my $study_type   = "";    # key value for when hash key is found in image header (eg PUN)
    my $image_header = "$STUDY_DIR/image${i}.hd";
    # reset the internal iterator so a prior each() doesn't affect the loop
    keys %qc;
    while ( my ( $k, $v ) = each %qc ) {
        open my $FH, '<', $image_header or die "Could not open $image_header: $!";
        # \Q...\E treats the key as literal text rather than as a pattern
        my @lines = grep { /\Q$k\E/ } <$FH>;
        close $FH;
        # If we get a match (one or more lines), update required fields and stop
        if (@lines) {
            $match      = 1;
            $study_type = $v;
            last;
        }
    }
    print "$match, $study_type\n";
    next if $match == 0;    # Not a QC image we are interested in, skip to next image
}
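As a sketch of how the loop above might be shortened, the header can be read once and each line tested against all keys, instead of reopening the file for every key. The helper name find_study_type is hypothetical, and the temporary directory with made-up header contents stands in for the real $STUDY_DIR so that the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my %qc = (
    "QCNM Daily QC"      => "GUN",
    "Intrinsic Flood QA" => "PUN",
);

# Hypothetical helper: return the study type for the first key found in
# the file, or undef if the file contains none of the keys.
sub find_study_type {
    my ( $file, $qc ) = @_;
    open my $fh, '<', $file or die "Could not open $file: $!";
    while ( my $line = <$fh> ) {
        for my $key ( keys %$qc ) {
            # index() does a literal substring search, so no escaping is needed
            return $qc->{$key} if index( $line, $key ) >= 0;
        }
    }
    return;    # no key found; $fh is closed when it goes out of scope
}

# Demo data in a temporary directory; the file contents are made up.
my $STUDY_DIR  = tempdir( CLEANUP => 1 );
my @contents   = ( "type := QCNM Daily QC\n", "type := Bone Scan\n", "type := Intrinsic Flood QA\n" );
my $num_images = @contents;
for my $i ( 1 .. $num_images ) {
    open my $out, '>', "$STUDY_DIR/image${i}.hd" or die "Could not write image${i}.hd: $!";
    print {$out} $contents[ $i - 1 ];
    close $out;
}

for my $i ( 1 .. $num_images ) {
    my $study_type = find_study_type( "$STUDY_DIR/image${i}.hd", \%qc );
    next unless defined $study_type;    # not a QC image, skip to the next one
    print "image${i}.hd: $study_type\n";
}
```

Returning undef for "no match" removes the need for a separate $match flag, since the caller can test the study type directly.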
Answer 2 (score: 0)
You can use any from List::MoreUtils. It stops at the first successful match.
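use List::MoreUtils qw(any);

# Inside the loop over %qc: read the header's lines, since glob() only returns file names
open my $fh, '<', "$STUDY_DIR/image${i}.hd" or die "Could not open image${i}.hd: $!";
my @lines = <$fh>;
close $fh;
if ( any { /\Q$k\E/ } @lines ) {
    $match      = 1;
    $study_type = $qc{$k};
    last;
}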
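If only core modules are available, first from List::Util offers a similar short-circuit: it returns the first list element for which the block is true, or undef if there is none. The following is a minimal sketch, assuming the header's lines are already in memory; the helper name first_matching_key and the sample @lines are invented for illustration.

```perl
use strict;
use warnings;
use List::Util qw(first);

my %qc = (
    "QCNM Daily QC"      => "GUN",
    "Intrinsic Flood QA" => "PUN",
);

# Stand-in for the lines of one image header (made-up content).
my @lines = ( "patient := anonymous\n", "study := Intrinsic Flood QA\n" );

# Return the first key of %$qc that occurs in the given lines, or undef.
sub first_matching_key {
    my ( $qc, @text ) = @_;
    my $joined = join '', @text;
    # Sorting the keys only makes the search order deterministic.
    return first { index( $joined, $_ ) >= 0 } sort keys %$qc;
}

my $key = first_matching_key( \%qc, @lines );
if ( defined $key ) {
    my $study_type = $qc{$key};
    print "match: $key => $study_type\n";
}
else {
    print "no match\n";
}
```

Here the search runs over the keys rather than over the lines, so the matching key itself is returned and the study type can be looked up without a second pass.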