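Modifying lines read from a file with a Perl script

Date: 2016-05-03 18:14:29

Tags: regex perl scripting

I have a file that I pass to my Perl script as ARGV[0]. This file contains a list of files; I read each listed file and modify its lines. I want to modify the files in place rather than write new files.

The code looks like this: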

 use strict;
 use warnings;
 use FindBin;
 use English;
 use File::Path;

 my $list=$ARGV[0];
 open(my $WAY,'<:encoding(UTF-8)',$list) or die("could not open list file");
 foreach my $file(<$WAY>){
 chomp($file);
 open(my $ASCII,'<:encoding(UTF-8)',$file) or die("could not open list file");
 foreach my $line (<$ASCII>){
 chomp($line);
 #####Here I do important stuff as per business requirement
 ## Now a array @coloumns stores all the values by which i need to 
 ### replace this read line.
 ##This array element needs to be joined by ',' so basically i want
 ##to replace read line in-place by join(",",@coloumns,"\n");
 }
 }

How can I achieve this?

3 answers:

Answer 0: (score: 3)

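TLDR: aet's comment is probably your best bet. Text files are generally not suited to true in-place editing.

The rope you're looking for:

If you open the file in +< (read and write) mode instead of just < (read) mode, you can use tell and seek to move around the file and write to your heart's content.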
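As a minimal sketch of that mode, assuming a throwaway file called demo.txt (a hypothetical name) and a replacement with exactly the same byte length as the text it overwrites:

```perl
use strict;
use warnings;

# Hypothetical scratch file, created only for this demonstration.
my $path = 'demo.txt';
open(my $out, '>', $path) or die "cannot create '$path': $!";
print {$out} "one\ntwo\nthree\n";
close($out) or die "cannot close '$path': $!";

# '+<' opens an existing file for both reading and writing.
open(my $fh, '+<', $path) or die "cannot open '$path': $!";

my $start = tell($fh);    # byte offset where the next line begins
while (my $line = <$fh>) {
    if ($line eq "two\n") {
        # Jump back to the start of this line and overwrite it with
        # text of exactly the same length (4 bytes, newline included).
        seek($fh, $start, 0) or die "seek failed: $!";
        print {$fh} "TWO\n";
        last;
    }
    $start = tell($fh);
}
close($fh) or die "cannot close '$path': $!";

# Read the result back to see what is now on disk.
open(my $in, '<', $path) or die "cannot reopen '$path': $!";
my $result = do { local $/; <$in> };
close($in);
print $result;
```

This only works cleanly because "two" and "TWO" occupy the same number of bytes. No encoding layer is used here, so the offsets reported by tell are plain byte offsets.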

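Why this is a bad idea:

It is rare that a change you want to make to a text file leaves the number of bytes unchanged. If your new text is longer, as it is when you add commas, you will overwrite the data that follows the original text. If it is shorter, some of the old bytes will be left behind.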
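To make the longer-text case concrete, here is a small sketch (clobber_demo.txt is a hypothetical scratch file) that overwrites a 4-byte line with a 5-byte replacement:

```perl
use strict;
use warnings;

# Hypothetical scratch file holding two 4-byte lines.
my $path = 'clobber_demo.txt';
open(my $out, '>', $path) or die "cannot create '$path': $!";
print {$out} "a b\nc d\n";
close($out) or die "cannot close '$path': $!";

# Open for read/write; the file position starts at byte 0.
open(my $fh, '+<', $path) or die "cannot open '$path': $!";
print {$fh} "a, b\n";    # 5 bytes written over a 4-byte line
close($fh) or die "cannot close '$path': $!";

# Read the damaged file back.
open(my $in, '<', $path) or die "cannot reopen '$path': $!";
my $damaged = do { local $/; <$in> };
close($in);
print $damaged;
```

The extra byte lands on the first character of the second line, so the file now holds "a, b" followed by " d": the "c" is gone.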

This is why programs that "change" text files actually rewrite them.

Even Perl's -i command-line switch actually uses the temp file technique recommended by Craig Estey.
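For reference, the behaviour of -i can also be turned on from inside a script by setting the special variable $^I and letting the <> operator walk @ARGV. A minimal sketch, assuming a hypothetical scratch file inplace_demo.txt and a whitespace-to-comma transformation standing in for the real business logic:

```perl
use strict;
use warnings;

# Hypothetical scratch file, created only for this demonstration.
my $path = 'inplace_demo.txt';
open(my $out, '>', $path) or die "cannot create '$path': $!";
print {$out} "one two three\n1 2 3\n";
close($out) or die "cannot close '$path': $!";

{
    # $^I holds the backup extension of the -i switch; a defined value
    # enables in-place editing for files read through <>.
    local $^I   = '.bak';
    local @ARGV = ($path);

    while (my $line = <>) {
        chomp($line);
        my @columns = split ' ', $line;    # stand-in for the real logic
        # While in-place editing is active, a bare print writes to the
        # replacement file rather than to the terminal.
        print join(',', @columns), "\n";
    }
}

# Read the rewritten file back.
open(my $in, '<', $path) or die "cannot reopen '$path': $!";
my $rewritten = do { local $/; <$in> };
close($in);
print $rewritten;
```

The original content is kept in inplace_demo.txt.bak because a backup extension was given.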

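Answer 1: (score: 2)

I've included two approaches below: one using a temp file and one using an array.

However, you really want the temp file approach because it is atomic. With the array approach, if the system crashes during the write-back, your file will be corrupted. With the temp file approach the final rename is atomic, so if a crash occurs your file will be intact: you get either the old contents or the new, never a mix.

So, what is your objection to temp files?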

use strict;
use warnings;
use FindBin;
use English;
use File::Path;

my $list=$ARGV[0];
open(my $WAY,'<:encoding(UTF-8)',$list) or
    die("could not open list file -- '$list'\n");
foreach my $file (<$WAY>) {
    chomp($file);
    dotmp($file);
}
close($WAY);

sub dotmp
{
    my($file) = @_;
    my($oline);

    open(my $INPUT,'<:encoding(UTF-8)',$file) or
        die("could not open input file -- '$file'\n");

    my($tmp) = $file . ".TMP";
    open(my $OUTPUT,'>:encoding(UTF-8)',$tmp) or
        die("could not open output file\n");

    while (my $line = <$INPUT>) {    # read one line at a time
        chomp($line);
        my @coloumns;    # placeholder: to be filled by the business logic below

        #####Here I do important stuff as per business requirement
        ## Now a array @coloumns stores all the values by which i need to
        ### replace this read line.
        ##This array element needs to be joined by ',' so basically i want
        ##to replace read line in-place by join(",",@coloumns,"\n");

        $oline = join(",",@coloumns);

        print($OUTPUT $oline,"\n");
    }

    close($INPUT);
    close($OUTPUT);

    # NOTE: this is _atomic_ -- even if the system crashes, you'll either get
    # the whole contents before or after but _never_ a partial mashup
    rename($tmp,$file) or
        die("unable to rename '$file' -- $!\n");
}

sub doarray
{
    my($file) = @_;
    my($oline);
    my(@array);

    open(my $INPUT,'<:encoding(UTF-8)',$file) or
        die("could not open input file -- '$file'\n");

    while (my $line = <$INPUT>) {    # read one line at a time
        chomp($line);
        my @coloumns;    # placeholder: to be filled by the business logic below

        #####Here I do important stuff as per business requirement
        ## Now a array @coloumns stores all the values by which i need to
        ### replace this read line.
        ##This array element needs to be joined by ',' so basically i want
        ##to replace read line in-place by join(",",@coloumns,"\n");

        $oline = join(",",@coloumns);

        push(@array,$oline);
    }

    close($INPUT);

    open(my $OUTPUT,'>:encoding(UTF-8)',$file) or
        die("could not open output file\n");

    # NOTE: if the system crashes while doing this, the file will be corrupted
    foreach $oline (@array) {
        print($OUTPUT $oline,"\n");
    }

    close($OUTPUT);
}
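One possible variation on dotmp: a fixed .TMP suffix can collide with an existing file, so the sketch below uses the core File::Temp module to create a uniquely named temp file in the same directory as the target, which keeps the rename on one filesystem. The helper name rewrite_file, its callback argument, and the file atomic_demo.txt are hypothetical, introduced only for this example.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: pass every line of $file through $transform,
# then replace the original with the result via rename().
sub rewrite_file {
    my ($file, $transform) = @_;

    open(my $in, '<:encoding(UTF-8)', $file)
        or die "could not open input file '$file': $!\n";

    # Create the temp file next to the target so that rename()
    # never has to cross a filesystem boundary.
    my ($out, $tmp) = tempfile('rewriteXXXXXX', DIR => dirname($file));
    binmode($out, ':encoding(UTF-8)');

    while (my $line = <$in>) {
        chomp($line);
        print {$out} $transform->($line), "\n";
    }

    close($in);
    close($out) or die "could not write '$tmp': $!\n";

    rename($tmp, $file)
        or die "unable to rename '$tmp' to '$file': $!\n";
}

# Usage: turn whitespace-separated fields into comma-separated ones.
my $path = 'atomic_demo.txt';
open(my $fh, '>', $path) or die "cannot create '$path': $!";
print {$fh} "one two three\n";
close($fh) or die "cannot close '$path': $!";

rewrite_file($path, sub { join(',', split(' ', $_[0])) });

open(my $check, '<', $path) or die "cannot reopen '$path': $!";
my $after = do { local $/; <$check> };
close($check);
print $after;
```

Note that the replacement file gets the permissions File::Temp assigns to it rather than those of the original, so a real script may want to copy the mode across before the rename.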

Answer 2: (score: 0)

Tie::File may be a bit of overkill, but it will do what you want. To build @columns, I simply split the original line on whitespace.

use warnings;
use strict;

use Tie::File;

my $list = $ARGV[0];

open my $way, '<:encoding(UTF-8)', $list or die $!;

while (my $file = <$way>){
    chomp $file;
    tie my @contents, 'Tie::File', $file or die $!;

    for (@contents){
        my @columns = split /\s+/, $_;
        s/.*/join ', ', @columns/e;
    }

    untie @contents;
}

Sample file named in the argument file:

one two three
1 2 3
a b c

Output:

one, two, three
1, 2, 3
a, b, c