提取在perl中以管道分隔的特定多行记录

时间:2013-03-20 01:33:51

标签: perl perl-module

我有一个看起来像

的文件
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

我想按国家/地区分开记录。我已将每一行存储到数组变量@fields

my @fields = split(/\|/, $_ );

$fields[3]作为我对其进行排序的基础。我希望它分成2个输出文本文件

输出文本文件1:

NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

输出文本文件2

NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE

将所有内容从JPN输出到输出文本1&非JPN国家输出文本文件2

这是试图解决的代码

use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my $count;
;
my ($line, $i);

my $filename = 'data.txt';
open(my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";


open(OUTPUTA, ">", 'JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";
open(OUTPUTB, ">", 'Non-JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";

my $fh;
while (<$input_fh>) {

    chomp;
   my @fields = split /\|/;


   if ($fields[0] eq 'NAME') {
    for ($i=1; $i < @fields; $i++) {
        if ($fields[3] eq 'JPN') {
           $fh = $_;
            print OUTPUTA $fh;
        }
        else {
           $fh = $_;
            print OUTPUTB $fh;
        }
    }

}   
}

close(OUTPUTA);
close(OUTPUTB)

仍然没有运气:(

4 个答案:

答案 0 :(得分:1)

你没有说出你需要什么帮助,所以我假设它正在提出一个算法。这是一个很好的:

  1. 打开要阅读的文件。
  2. 打开JPN条目的文件。
  3. 打开非JPN条目的文件。
  4. 虽然不是eof,
    1. 阅读一行。
    2. 解析这条线。
    3. 如果是记录的第一行,
      1. 如果该人的国家是JPN,
        1. 将当前文件句柄设置为JPN条目的文件句柄。
      2. 否则,
        1. 将当前文件句柄设置为非JPN条目的文件句柄。
    4. 将该行打印到当前文件句柄。

  5. my $jpn_qfn   = '...';
    my $other_qfn = '...';
    
    open(my $jpn_fh,   '>', $jpn_qfn)
       or die("Can't create $jpn_qfn: $!\n");
    open(my $other_fh, '>', $other_qfn)
       or die("Can't create $other_qfn: $!\n");
    
    my $fh;
    while (<>) {
       chomp;
       my @fields = split /\|/;
       if ($fields[0] eq 'NAME') {
          $fh = $fields[3] eq 'JPN' ? $jpn_fh : $other_fh;
       }
    
       say $fh $_;
    }   
    

答案 1 :(得分:1)

这是我认为ikegami说的方式,但我以前从未尝试过这种方式(虽然它给出了正确的结果)。

#!/usr/bin/perl
use strict;
use warnings;

open my $jpn_fh, ">", 'o33.txt' or die $!;
open my $other_fh, ">", 'o44.txt' or die $!;

my $fh;
while (<DATA>) {
    if (/^NAME/) {
        if (/JPN$/) {
            $fh = $jpn_fh;  
        }
        else {
            $fh = $other_fh;
        }
    }
    print $fh $_;
}   

close $jpn_fh or die $!;
close $other_fh or die $!;

__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

答案 2 :(得分:0)

#!/usr/bin/env perl

use 5.012;
use autodie;
use strict;
use warnings;

# store per country output filehandles
my %output;

# since this is just an example, read from __DATA__ section

while (my $line = <DATA>) {
    # split the fields
    my @cells = split /[|]/, $line;

    # if first field is NAME, this is a new record
    if ($cells[0] eq 'NAME') {
        # get the country code, strip trailing whitespace
        (my $country = $cells[3]) =~ s/\s+\z//;

        # if we haven't created and output file for this
        # country, yet, do so
        unless (defined $output{$country}) {
            open my $fh, '>', "$country.out";
            $output{$country} = $fh;
        }
        my $out = $output{$country};

        # output this and the next two lines to
        # country specific output file
        print $out $line, scalar <DATA>, scalar <DATA>;
    }
}

close $_ for values %output;

__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

答案 3 :(得分:0)

感谢您的帮助堆 我能够在perl中解决这个问题, 非常感谢

#!/usr/local/bin/perl

use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my ($rec_type, $country);

my $filename = 'data.txt';


open (my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";


open  my $OUTPUTA, ">", 'o33.txt' or die $!;
open  my $OUTPUTB, ">", 'o44.txt' or die $!;

my $Combline;
while (<$input_fh>) {

    $_ = _trim($_); 
    @fields = split (/\|/, $_); 
    $rec_type = $fields[0];
    $country = $fields[3];

        if ($rec_type eq 'NAME') {          
            if ($country eq 'JPN') {                            
                *Combline = $OUTPUTA;
            }           
            else {                              
                *Combline = $OUTPUTB;
            }
        }       
   print  Combline;
}   

close $OUTPUTA or die $!;
close $OUTPUTB or die $!;

sub _trim {
    my $word = shift;
    if ( $word ) {      
        $word =~ s/\s*\|/\|/g;      #remove trailing spaces
        $word =~ s/"//g;        #remove double quotes
    }
    return $word;
}