Replacing two fields in Perl

Date: 2009-05-14 05:59:36

Tags: perl file

I have a text file like this:

File 1
-------
ABC 123
DEF 456
GHI 111

I also have another file:

File 2
------
stringaa ttt stringbb yyy

Output
-----
stringaa ABC stringbb 123
stringaa DEF stringbb 456
stringaa GHI stringbb 111

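I want to read File 1 and use its values to fill in the template line in File 2, so that the output above is generated. Any ideas?

5 answers:

Answer 0 (score: 1)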

use strict;
use warnings;

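my ($file1, $file2) = @ARGV;

# Read the template line; File2 should have one line only.
open my $template_fh, '<', $file2 or die "Can't open $file2: $!\n";
my $template = <$template_fh>;
close $template_fh;

die "$file2 is empty\n" unless defined $template;
chomp $template;
die "$file2 in unexpected format for second file '$template'\n"
    unless $template =~ /(\w+)\s+\w+\s+(\w+)/;
my ($stra, $strb) = ($1, $2);

# Substitute each data line into the template and print it.
open my $data_fh, '<', $file1 or die "Can't open $file1: $!\n";
while (<$data_fh>)
{
    s/(\w+)\s+(\d+)/$stra $1 $strb $2/;
    print;
}
close $data_fh;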

Answer 1 (score: 1)

I'm not sure this is what you want (see the comments), but here is one way to get that output:

vinko@parrot:~$ more file1.txt
ABC 123
DEF 456
GHI 111
vinko@parrot:~$ more file2.txt
stringaa ttt stringbb yyy
vinko@parrot:~$ more concat.pl
use strict;
use warnings;

open (F1,"<","file1.txt") or die $!;
open (F2,"<","file2.txt") or die $!;

while (<F2>) {
        my ($field1, $field2, $field3, $field4) = split /\s/;
        while (<F1>) {
                my ($innerfield1, $innerfield2) = split /\s/;
                print "$field1 $innerfield1 $field3 $innerfield2\n";
        }
}
close F1;
close F2;
vinko@parrot:~$ perl concat.pl
stringaa ABC stringbb 123
stringaa DEF stringbb 456
stringaa GHI stringbb 111

Answer 2 (score: 1)

Try this:

my $file1 = shift @ARGV;
my $file2 = shift @ARGV;

open F2, $file2 or die $!;
chomp(my $template = <F2>);
my @fields = split/\s+/,$template;
close F2;

open F1, $file1 or die $!;
while (<F1>) {
    chomp;
    my ($val1, $val2) = split /\s+/;
    print join("\t",$fields[0],$val1,$fields[2],$val2),"\n";

}
close F1;

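Answer 3 (score: 1)

This code is more verbose than the other suggestions posted here.

But it has several advantages:

  • Comments.
  • It uses lexical filehandles and 3-argument open().
  • Variable names are descriptive, rather than file1 and file2.
  • More flexible
    • Easy to add/modify replacement fields.
    • Easy to process multiple data files in one script.
    • Easy to apply the same data to multiple specifications.
  • It does not split or modify the specification, other than to perform the replacements.

Although this has no bearing on whether it is a good design for real use, the code also demonstrates a couple of useful techniques.

  • It generates a closure to handle the formatting.
  • It uses atomic exception handling instead of the flawed eval {}; if ($@) { ...handle exception... } idiom.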

#!/usr/bin/perl

use strict;
use warnings;

# Supply test data - remove from real code.
my $test_data = <<'END';
ABC 123
DEF 456
GHI 111
JKL
MNO 999 888
END

my $test_spec = <<'END';
stringaa ttt stringbb yyy
END

# Use test data if no files specified.
# works because you can open() a scalar ref as a file.
# remove from real code -> should display usage information and die.
my $file_data = shift @ARGV || \$test_data;
my $file_spec = shift @ARGV || \$test_spec;

# List of tokens to replace in spec file.
# Real code should probably take list of tokens as argument.
my @replace = qw( ttt yyy );

my $spec   = Read_Spec_From_File( $file_spec );
my $format = Make_Formatter( $spec, @replace );
Print_Formatted_Data( $format, $file_data );

exit;

# -----------------------------------------------------------


# Read a specification from a file.
sub Read_Spec_From_File {
    my $file = shift;   # path to file

    open( my $fh, '<', $file )
        or die "Unable to open format specification file '$file' - $!\n";

    my $spec;

    local $_;
    while( <$fh> ) {

        die "Specification file format error - too many lines.\n"
            if defined $spec;

        $spec = $_;
    }

    die "Specification file format error - no specification.\n"
        unless defined $spec;


    return $spec;
}

# Create a formatting function that can be used to apply data to a
# specification.
#
# Formatting function takes a list of data values to apply to replacement
# tokens.
#
# Given spec 'word aaa blah bbb cheese ccc bar aaa'
# With token list is 'aaa', 'bbb', 'ccc',
# and data 111, 222, 333
# The result is 'word 111 blah 222 cheese 333 bar 111'
# 
sub Make_Formatter {
    my $spec = shift;
    my @replacement_tokens = @_;

    # formatter expects a list of data values.
    return sub {
        my $new_line = $spec;

        die "More data than tokens\n" 
            if @_ > @replacement_tokens;

        for ( 0..$#replacement_tokens ) {

            my $token = $replacement_tokens[$_];
            my $value = $_[$_];


            # Refuse to format a line that has no value for this token.
            die "No data for '$token'\n"
                unless defined $value;

            # \Q...\E makes the token match literally, not as a regex.
            $new_line =~ s/\Q$token\E/$value/g;

        }

        return $new_line;
    };
}

# Process a data file and print a set of formatted data.
sub Print_Formatted_Data {
    my $format    = shift; # Formatter function
    my $data_file = shift; # Path to data file.

    open( my $data_fh, '<', $data_file )
        or die "Unable to open data file '$data_file' - $!\n";

    while ( my $raw_data = <$data_fh> ) { 
        my @data_set  = split /\s+/, $raw_data;

        eval { 
            my $formatted = $format->(@data_set);

            print $formatted;
            1;
        }
        or do {
            warn "Error processing line $. of '$data_file' - $@";
        };

    }
}
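
The "atomic" exception handling mentioned in the list above can be isolated in a small sketch. The helper name try_format, the @errors array and the sample inputs are hypothetical, chosen only for illustration. The idea is that the eval block ends in a true value, so the "or do" branch runs exactly when the block died, rather than relying on $@ still being set afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the formatted string, or undef after
# recording the failure in @errors.
my @errors;

sub try_format {
    my @fields = @_;
    my $result;

    eval {
        die "Expected exactly 2 fields\n" unless @fields == 2;
        $result = "stringaa $fields[0] stringbb $fields[1]";
        1;    # make the eval block return true on success
    }
    or do {
        # Reached only when the eval block died.
        my $error = $@ || "Unknown error\n";
        push @errors, $error;
    };

    return $result;
}

my $ok  = try_format( 'ABC', '123' );
my $bad = try_format( 'JKL' );

print "$ok\n";
print "failed: $errors[0]" unless defined $bad;
```

The fallback to "Unknown error" covers the case where something (for example a destructor) cleared $@ between the die and the handler, which is the weakness of testing $@ directly.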

Answer 4 (score: 0)

Hopefully this works for you.

#! /usr/bin/env perl
use strict;
use warnings;
use 5.010;
use autodie;

my($in_file,$filter,$out_file);

if( @ARGV == 0 ){
  die "Must have filter at least\n";
}elsif( @ARGV == 1 ){
  ($filter) = @ARGV;
}elsif( @ARGV == 2 ){
  ($in_file,$filter) = @ARGV;
}else{
  ($in_file,$filter,$out_file) = @ARGV;
}


{
  # autodie checks open() for errors
  # so we don't have to
  my($IN,$OUT);
  if( defined $in_file ){
    open $IN,  '<', $in_file;
  }else{
    $IN = *STDIN{IO};
  }
  if( defined $out_file ){
    open $OUT, '>', $out_file;
  }else{
    $OUT = *STDOUT{IO};
  }

  ProcessFiles($IN,$OUT,$filter);

  close $OUT;
  close $IN;
}

sub ProcessFilter{
  my($filter,$str) = @_;

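  # split ' ' already skips leading whitespace and empty fields,
  # and unlike grep {$_} it keeps a field whose value is "0".
  my @elem = split ' ', $str;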

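  # Replace ${N} or $N with the N-th field; braces are escaped so that
  # newer perls do not reject the unescaped literal "{".
  $filter =~ s/\$(?|(?:\{(\d+)\})|(\d+))/ $elem[$1-1] /eg;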

  return $filter;
}
sub ProcessFiles{
  my($IN,$OUT,$filter) = @_;

  while( my $line = <$IN> ){
    chomp $line;
    next unless $line;
    $line = ProcessFilter($filter,$line);
    say {$OUT} $line;
  }
}

Call it in one of the following ways:

perl program.pl <input-file> 'filter string' <output-file>
perl program.pl <input-file> 'filter string' # sends to STDOUT
perl program.pl 'filter string' # receives from STDIN, sends to STDOUT

If it is called like this:

program.pl FILE1 'stringaa ${1} stringbb $2'

it reads FILE1 and outputs:

stringaa ABC stringbb 123
stringaa DEF stringbb 456
stringaa GHI stringbb 111
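
As a rough illustration of why that call produces those lines, the sketch below applies the same kind of placeholder expansion to a single input line. The function name expand_placeholders is hypothetical, and this is a simplified stand-in for ProcessFilter, not the answer's own code; it assumes perl 5.10 or later for the // operator. On the command line the filter is wrapped in single quotes so that the shell does not try to expand $2 itself.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for ProcessFilter.
sub expand_placeholders {
    my ( $filter, $line ) = @_;

    # Split on whitespace; field 1 is $fields[0], and so on.
    my @fields = split ' ', $line;

    # Replace ${N} or $N with the N-th field, or with an empty string
    # when the line has no such field.
    $filter =~ s!\$(?:\{(\d+)\}|(\d+))!$fields[ ( $1 // $2 ) - 1 ] // ''!ge;

    return $filter;
}

my $out = expand_placeholders( 'stringaa ${1} stringbb $2', 'DEF 456' );
print "$out\n";
```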