Search multiple .png file names using a list of search patterns and copy the results to a new folder

Date: 2013-12-03 18:19:28

Tags: linux perl find

I have a file named search.txt that contains multiple search patterns.

Example "search.txt" (more than 300 entries in total):

A28
A32
A3C
A46
A50
A5A
898
8A2
8AC
8B6
8C0

Example files in the folder I want to search (more than 5000 in total):

 1_0_1_4AB_3_56_300000_0_0_0.png
 1_0_1_5A0_20_56_300000_0_0_0.png
 1_0_1_A28_22_56_300000_0_0_0.png
 1_0_1_A32_22_56_300000_0_0_0.png
 1_0_1_A96_23_56_300000_0_0_0.png
 1_0_1_898_21_56_300000_0_0_0.png

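I need to check the fourth field of every .png file name (fields are separated by "_") against all entries in search.txt. I have previously used a similar Perl script: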

match4th.pl

#!/usr/bin/perl -w
use strict;
my $pat = qr/$ARGV[0]/;
while (<STDIN>) {
    my (undef, undef, undef, $fourth) = split /_/;
    print if defined($fourth) && $fourth =~ $pat;
}

Then I would use something like this to run the script and move the matching files to a new location:

cd /png_folder
find . -name '*.png' | perl match4th.pl '/tmp/search.txt' | xargs mv -t /tmp/results

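The part I am unsure about is how to make the find command use all entries in /tmp/search.txt instead of writing every pattern into the find command. I would also prefer to copy the files rather than move them.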

3 Answers:

Answer 0: (score: 2)

You can use the search.txt file directly as a pattern list for grep:

find . -name '*.png' | grep -f search.txt | xargs ...
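
As a rough illustration of what this matches, here is a small sandbox sketch. The scratch directory and the second file name are made up for the example; that file carries 898 only in its seventh field. Because grep looks for each pattern anywhere in the path, both files are listed.

```shell
# Sandbox sketch: plain `grep -f` matches a pattern anywhere in the path.
work=$(mktemp -d)                       # hypothetical scratch directory
cd "$work" || exit 1
printf '%s\n' A28 898 > search.txt      # two patterns from the question
touch 1_0_1_A28_22_56_300000_0_0_0.png  # 4th field is A28: wanted
touch 1_0_1_4AB_3_56_898000_0_0_0.png   # 898 only in the 7th field: not wanted

# -F treats every line of search.txt as a fixed string rather than a regex
loose=$(find . -name '*.png' | grep -F -f search.txt | sort)
printf '%s\n' "$loose"                  # lists both files
```

With file names like the ones in the question this may be good enough, but any pattern that also occurs in another field will produce false positives.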

Or, if you want the patterns to be stricter, you can do this:

find . -name '*.png' | grep -f <(sed 's/^/[0-9]_[0-9]_[0-9]_/' search.txt)

Even stricter:

find . -name '*.png' | grep -f <(sed 's?^?/[0-9]_[0-9]_[0-9]_?' search.txt)

Stricter still:

find . -name '*.png' | grep -f <(sed 's?.*?/[0-9]_[0-9]_[0-9]_&_?' search.txt)

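In the last line, the whole line of search.txt is matched (.*), and in the replacement we prefix the pattern /[0-9]_[0-9]_[0-9]_, followed by the matched string (&), followed by a _. For example, if search.txt contains the letter A as a pattern, this generates the pattern /[0-9]_[0-9]_[0-9]_A_ for that line, which matches your files properly because it requires the _A_ in that position.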
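
To make the transformation concrete, the sketch below prints the patterns that the final sed command generates and then filters a few sample paths with them. It writes the patterns to a temporary file instead of using the bash-only <( ) process substitution, and the scratch directory and the second path are made up for the example. Note that [0-9] matches a single digit, so this assumes the first three fields are one digit each, as in the question's samples.

```shell
# Sketch: show what the final sed command generates, then filter with it.
work=$(mktemp -d)                       # hypothetical scratch directory
cd "$work" || exit 1
printf '%s\n' A28 898 > search.txt

# Every line X of search.txt becomes /[0-9]_[0-9]_[0-9]_X_
sed 's?.*?/[0-9]_[0-9]_[0-9]_&_?' search.txt > strict.txt
cat strict.txt

# Paths in the form printed by `find . -name '*.png'`
printf '%s\n' \
    ./1_0_1_A28_22_56_300000_0_0_0.png \
    ./1_0_1_4AB_3_56_898000_0_0_0.png \
    ./1_0_1_898_21_56_300000_0_0_0.png > files.txt

strict=$(grep -f strict.txt files.txt)
printf '%s\n' "$strict"                 # the 4AB file is no longer listed
```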

If the output looks good, you can pipe it to xargs to copy the matching files, like this:

... | xargs cp -t /tmp/results
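
Put together, the whole pipeline might look like the following sandbox sketch. All paths are hypothetical stand-ins for /png_folder and /tmp/results, and the sketch uses xargs -I{} with plain cp so that it does not depend on the GNU-only cp -t option.

```shell
# Sketch: copy (not move) the matching files inside a scratch directory.
work=$(mktemp -d)                       # hypothetical scratch directory
mkdir "$work/png_folder" "$work/results"
cd "$work/png_folder" || exit 1
printf '%s\n' A28 898 > "$work/search.txt"
touch 1_0_1_A28_22_56_300000_0_0_0.png 1_0_1_4AB_3_56_300000_0_0_0.png

# Build the strict patterns once, then reuse them
sed 's?.*?/[0-9]_[0-9]_[0-9]_&_?' "$work/search.txt" > "$work/strict.txt"

# One cp per matching path; the source files stay where they are
find . -name '*.png' | grep -f "$work/strict.txt" |
    xargs -I{} cp {} "$work/results"

ls "$work/results"
```

On a GNU system, xargs cp -t followed by the target directory does the same job with fewer cp invocations.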

Answer 1: (score: 1)

The most efficient solution should be:

use strict;
use warnings;
use File::Basename; # with no_chdir, $_ holds the full path, so basename() is needed
use File::Find;
use File::Copy;     # copy and move will work as shell's cp and mv

my ( $fn, $dir, $target ) = @ARGV; # script arguments

# check parameters
( stat($dir)    && -d _ ) or die "Not a dir $dir";
( stat($target) && -d _ ) or die "Not a dir $target";

# construct regexp for matching files
# use quotemeta to sanitize data read from $fn file 
my $re = join '|', map quotemeta, do {
    # open file
    open( my $fh, '<', $fn ) or die "$fn: $!";
    my @p = <$fh>;            # read all patterns
    close($fh);
    chomp @p;                 # remove end of line from patterns
    @p;                       # return of do statement
};
$re = qr/$re/;                # precompile regexp
# Perl compiles the alternation into a trie (for up to ten thousand patterns), so a match should be O(1)

sub wanted {
    my $fourth;
    lstat($_)                 # initialize special _ term
        && (
           -d _               # a directory: nothing to do, find() descends into it by itself
        || -f _               # otherwise if is file
        && /\.png$/           # is filename in $_ ending .png
        # split by '_' to five pieces max and get fourth part (index 3) 
        && defined( $fourth = ( split '_', basename($_), 5 )[3] ) # check if defined 
        && $fourth =~ /^$re$/ # match regexp
        && do { move( $_, $target ) or die "$_: $!" } # then move using File::Copy::move
        );                    # replace move with copy to copy the files instead
}

# do not change directory so $target can be relative and move will still work well
find( { wanted => \&wanted, no_chdir => 1 }, $dir );

Usage

perl find_and_move.pl /tmp/search.txt . /tmp/results
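
For comparison, the core idea of the script (join all patterns into one anchored alternation and test only the fourth field of the base name) can be sketched in shell with awk. This is not the author's code: it skips the quotemeta step and therefore assumes the patterns are plain alphanumerics, and the scratch directory, the sub_dir path, and the A280 file are made up for the example.

```shell
# Sketch: one anchored alternation, applied to the 4th field of the base name.
work=$(mktemp -d)                       # hypothetical scratch directory
cd "$work" || exit 1
printf '%s\n' A28 898 > search.txt

# Join the patterns with | and anchor the group: ^(A28|898)$
re="^($(paste -s -d '|' search.txt))\$"
echo "$re"

printf '%s\n' \
    ./1_0_1_A28_22_56_300000_0_0_0.png \
    ./sub_dir/1_0_1_898_21_56_300000_0_0_0.png \
    ./1_0_1_A280_22_56_300000_0_0_0.png > files.txt

picked=$(awk -v re="$re" '{
    name = $0
    sub(/.*\//, "", name)               # drop the directory part, like basename()
    n = split(name, part, "_")          # part[4] is the fourth field
    if (n >= 4 && part[4] ~ re) print
}' files.txt)
printf '%s\n' "$picked"                 # the A280 file is not listed
```

Stripping the directory first matters as soon as a directory name contains "_", which is the reason the Perl script calls basename().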

Answer 2: (score: 0)

You are using my $pat = qr/$ARGV[0]/;, but $ARGV[0] is /tmp/search.txt. You need to actually read the file.

#!/usr/bin/perl -w
use strict;

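my $re = do {
   my $qfn = shift(@ARGV);                    # first argument: the pattern file
   open(my $fh, '<', $qfn) or die "$qfn: $!";
   chomp( my @pats = <$fh> );
   my $pat = join '|', map quotemeta, @pats;  # escape each pattern, then join
   qr/^(?:$pat)\z/                            # group so the anchors cover every alternative
};

while (<>) {                                  # @ARGV is now empty, so this reads STDIN
    my $tag = (split /_/)[3];                 # fourth "_"-separated field
    next if !defined($tag);
    print if $tag =~ $re;                     # test the tag, not the whole line
}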
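
One detail worth checking in any regex assembled by joining alternatives with |: alternation has the lowest precedence, so without a group a leading ^ applies only to the first alternative and a trailing end anchor only to the last. The shell sketch below uses grep -E and made-up tags to show the difference; the same precedence rule holds for Perl regexes.

```shell
# Candidate fourth fields, one per line (made-up values)
tags() { printf '%s\n' A28 A280 XA32X 7898 898 4AB; }

# Ungrouped: parsed as (^A28) or (A32) or (898$)
ungrouped=$(tags | grep -E '^A28|A32|898$' | paste -s -d ' ' -)

# Grouped: the whole tag must equal one of the alternatives
grouped=$(tags | grep -E '^(A28|A32|898)$' | paste -s -d ' ' -)

echo "ungrouped: $ungrouped"            # A28 A280 XA32X 7898 898
echo "grouped:   $grouped"              # A28 898
```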