Perl:在多个文本文件中查找并替换特定字符串

时间:2013-11-15 07:05:48

标签: regex perl shell replace

我需要获取给定目录中的所有.config文件,并且在每个文件中我需要搜索特定字符串并根据文件替换另一个。

例如,如果我在给定目录中有3个文件:

 for  my_foo.config - string to search "fooCommon >" replace with "~ /fooCommon[\/ >"
 for  my_bar.config - string to search "barCommon >" replace with "~ /barCommon[\/ >"
 for  my_file.config - string to search "someCommon >" replace with "~ /someCommon[\/ >"

请告诉我如何在Perl中完成此操作?

以下是我在shell脚本中尝试的代码:

OLD="\/fooCommon >"
NEW="~ \"\/fooCommon[^\/]*\" >"
DPATH="/myhome/aru/conf/host*.conf"
BPATH="/myhome/aru/conf/bakup"
TFILE="/myhome/aru/out.tmp.$$"
[ ! -d $BPATH ] && mkdir -p $BPATH || :
for f in $DPATH
do
  if [ -f $f -a -r $f ]; then
   /bin/cp -f $f $BPATH
   echo sed \"s\/$OLD\/$NEW\/g\"
   sed "s/$OLD/$NEW/g" "$f" > $TFILE && mv $TFILE "$f"
  else
   echo "Error: Cannot read $f"

fi
done
/bin/rm $TFILE

7 个答案:

答案 0 :(得分:23)

如果您使用的是Unix平台,可以在命令行上使用Perl进行操作;无需编写脚本。

perl -i -p -e 's/old/new/g;' *.config

为了更安全,您可能希望将该命令与备份选项一起使用。

perl -i.bak  -p -e 's/old/new/g;' *.config

答案 1 :(得分:10)

这里的Perl只是修改文件...我不明白为什么要在perl中编写它,如果你能做到这么简单:

find . -maxdepth 1 -type f -name '*.conf' | \
    xargs perl -i.bak -pe 's/localhost/example.com/;'

答案 2 :(得分:2)

如果你真的只需要用perl做这个,我不建议这样做,因为已经发布了优秀而简单的答案,这里有:

#!/usr/bin/perl

# take the directory to be processed from first command line argument
opendir($dh, $ARGV[0]);
# take only relevant files ie. "*.config"
@cfgs = grep { /\.config$/ } readdir($dh);
# loop through files
foreach(@cfgs) {
  # generate source string from the filename
  ($s) = ($_ =~ /.*_(\w+)\.config.*/);
  $s = "${s}Common";
  # generate replacement string from the filename
  $r = "~ /${s}[/ >";
  # move original file to a backup
  rename("${ARGV[0]}${_}", "${ARGV[0]}${_}.bak");
  # open backup file for reading
  open(I, "< ${ARGV[0]}${_}.bak");
  # open a new file, with original name for writing
  open(O, "> ${ARGV[0]}${_}");
  # go through the file, replacing strings
  while(<I>) { $_ =~ s/$s/$r/g; print O $_; }
  # close files
  close(I);
  close(O);
}

# end of file.

请注意,使用简单的find和/或shell通配符执行此操作要简单得多。但是把它作为一个关于如何用perl处理文件的小教程。

答案 3 :(得分:1)

虽然可以从命令行完成,但有时您只需要一个易于使用的脚本来提供更有用的输出。考虑到这一点,这里有一个perl解决方案,对于遇到这个问题的任何人都有友好的输出。

#!/usr/bin/env perl5.8.3

# subst [-v] [-f] "re/string to find" "string to replace" -- list of files
#  optional -v flag shows each line with replacement, must be 1st arg to script
#  optional -f flag says to disable regexp functionality and make the strings match exactly
#  replacement string may include back references ($1, $2, etc) to items in "string to find" if they are surrounded by grouping parenthesis

use strict;
use warnings;
use List::Util;
use IO::File;
use Fcntl;
use Getopt::Long qw(GetOptions);

my $verbose = 0;
my $fixed   = 0;

GetOptions("v" => \$verbose,
           "f" => \$fixed);

my $find    = shift @ARGV;
my $replace = shift @ARGV;

die "Error: missing 1st arg, string to find\n"         if not defined $find;
die "Error: missing 2nd arg, string to replace with\n" if not defined $replace;
die "No files were specified\n"                        if @ARGV == 0;

# open a temp file for writing changes to
my $TEMP = IO::File->new_tmpfile;
if (not defined $TEMP)
{
    print STDERR "ERROR: failed to create temp file: $!\n";
    exit 1;
}

# Fix max file name width for printing
my $fwidth = List::Util::max map { length $_ } @ARGV;

# Process each file
my $unchanged = 0;
my $changed   = 0;
foreach my $file (@ARGV)
{
    if (open(my $FILE, '<', $file))
    {
        # Reset temp file
        seek $TEMP, 0, SEEK_SET or die "ERROR: seek in temp file failed: $!";
        truncate $TEMP, 0       or die "ERROR: truncate of temp file failed: $!";

        # go through the file, replacing strings
        my $changes = 0;
        while(defined(my $line = <$FILE>))
        {
            if ($line =~ m/$find/g)
            {
                print "-" . $line if $verbose;
                print "\n" if $verbose and $line !~ m/\n$/;

                if ($fixed)
                {
                    my $index = index($line, $find);
                    substr($line, $index, length($find)) = $replace;
                }
                else
                {
                    $line =~ s/$find/replacebackrefs($replace)/eg;
                }

                $changes++;
                print "+" . $line if $verbose;
                print "\n" if $verbose and $line !~ m/\n$/;
            }

            print $TEMP $line;
        }
        close $FILE;

        if ($changes == 0)
        {
            $unchanged++;
            unlink("/tmp/subst$$");
            next;
        }

        # Move new contents into old file
        $changed++;
        printf "%*s - %3d changes\n", -$fwidth, $file, $changes;

        seek $TEMP, 0, SEEK_SET or die "ERROR: rewind of temp file failed: $!";
        open $FILE, '>', $file or die "ERROR: failed to re-write $file: $!\n";
        while (<$TEMP>) { print $FILE $_ }
        close $FILE;

        print "\n" if $verbose;
    }
    else
    {
        print STDERR "Error opening $file: $!\n";
    }
}

close $TEMP;

print "\n";
print "$changed files changed, $unchanged files unchanged\n";

exit 0;

sub replacebackrefs
{
    # 1st/only argument is the text matched
    my $matchedtext = shift @_;

    my @backref;
    # @- is a dynamic variable that holds the offsets of submatches in
    # the currently active dynamic scope (i.e. within each regexp
    # match), corresponding to grouping parentheses. We use the count
    # of entrees in @- to determine how many matches there were and
    # store them into an array. Note that @- index [0] is not
    # interesting to us because it has a special meaning (see man
    # perlvar for @-)\, and that backrefs start with $1 not $0.
    # We cannot do the actual replacement within this loop.
    do
    {
        no strict 'refs'; # turn of warnings of dynamic variables
        foreach my $matchnum (1 .. $#-)
        {
            $backref[$matchnum] = ${$matchnum}; # i.e. $1 or $2 ...
        }
    } while(0);

    # now actually replace each back reference in the matched text
    # with the saved submatches.
    $matchedtext =~ s/\$(\d+)/$backref[$1]/g;

    # return a scalar string to actually use as the replacement text,
    # with all the backreferences in the matched text replaced with
    # their submatch text.
    return $matchedtext;
}

答案 4 :(得分:0)

也许以下内容会有所帮助:

use strict;
use warnings;

my %replacements =
  map { chomp; my @x = split /\|/; $x[0] => [ $x[1], $x[2] ] } <DATA>;

local $^I = '.bak';

for my $file (<*.config>) {
    push @ARGV, $file;

    while (<>) {
        s/\b\Q$replacements{$file}[0]/$replacements{$file}[1]/g;
        print;
    }
}

__DATA__
my_foo.config|fooCommon >|~ /fooCommon[/ >
my_bar.config|barCommon >|~ /barCommon[/ >
my_file.config|someCommon >|~ /someCommon[/ >

数组哈希(HoA)由split | - 分隔的DATA行构建,其中键是文件名,值是对两个元素的匿名数组的引用用于替换文件。 local $^I = '.bak'表示法会创建原始文件的备份。

您可能需要调整替换。例如,使用\b中的s/\b\Q$replacements{$file}[0]/$replacements{$file}[1]/g;在替换中观察到字边界。您可能需要也可能不需要(或想要)。

我建议首先在一个'scratch'文件中尝试它,以确保在完全实现之前获得所需的结果 - 即使原始文件已备份。

答案 5 :(得分:0)

您的脚本是一个很好的尝试。

它包含一些冗余:

  • cp $f
  • 没用
  • $TFILE也没用(只需将sed输出直接写入目标文件)

您可以在没有目录路径的情况下从$NEW的值构造$f和目标文件名,您可以按如下方式获取:

bf=`basename "$f"`

答案 6 :(得分:0)

对于那些想要递归替换目录及其子目录的所有文本文件中的字符串的人来说,这个单行代码可能有用:

grep -r OLD_STRING * | cut -d':' -f1 | uniq | xargs perl -i -pe 's/OLD_STRING/NEW_STRING/g;'