Question

If an id is repeated, I want to append the app1 and app2 values and print that id only once.

Input:

id|Name|app1|app2
1|abc|234|231|
2|xyz|123|215|
1|abc|265|321|
3|asd|213|235|

Expected output:

id|Name|app1|app2
1|abc|234,265|231,321|
2|xyz|123|215|
3|asd|213|235|

Answer 1

This should do the trick:

%out
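A minimal sketch of one way a hash named %out, keyed by id, could solve this. The helper name merge_rows, the @order array, and the inner hash layout are assumptions made for illustration, not the answerer's code:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: merge pipe-delimited rows that share an id.
# Takes the data lines (header excluded) and returns the merged lines.
sub merge_rows {
    my @lines = @_;
    my %out;      # id => { name => ..., app1 => [...], app2 => [...] }
    my @order;    # ids in order of first appearance

    for my $line (@lines) {
        chomp $line;
        # split drops the empty field after the trailing '|'
        my ( $id, $name, $app1, $app2 ) = split /\|/, $line;
        next unless defined $app2;    # skip blank or short lines
        push @order, $id unless exists $out{$id};
        $out{$id}{name} = $name;
        push @{ $out{$id}{app1} }, $app1;
        push @{ $out{$id}{app2} }, $app2;
    }

    # Rebuild one line per id; the final '' restores the trailing '|'.
    return map {
        join( '|', $_, $out{$_}{name},
            join( ',', @{ $out{$_}{app1} } ),
            join( ',', @{ $out{$_}{app2} } ),
            '' )
    } @order;
}

my @input = (
    "1|abc|234|231|\n",
    "2|xyz|123|215|\n",
    "1|abc|265|321|\n",
    "3|asd|213|235|\n",
);

print "id|Name|app1|app2\n";
print "$_\n" for merge_rows(@input);
```

The sketch keeps ids in the order they first appear instead of sorting them, and it assumes every row has exactly four fields.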

Edit

To see what %out contains (in case it is not clear), you can load

use Data::Dumper;

and print it via

print Dumper(\%out);

Answer 2

I'd tackle it like this:

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use 5.14.0;

my %stuff;

#extract the header row.
#use the regex to remove the linefeed, because
#we can't chomp it inline like this. 
#works since perl 5.14
#otherwise we could just chomp (@header) later. 
my ( $id, @header ) = split( /\|/, <DATA> =~ s/\n//r );

while (<DATA>) {

    #turn this row into a hash of key-values.
    my %row;
    ( $id, @row{@header} ) = split(/\|/);
    #print for diag 
    print Dumper \%row;

    #iterate each key, and append its value to the list in %stuff.
    foreach my $key ( keys %row ) {
        push( @{ $stuff{$id}{$key} }, $row{$key} );
    }
}

#print for diag    
print Dumper \%stuff;

print join ("|", "id", @header ),"\n";

#iterate ids in the hash
foreach my $id ( sort keys %stuff ) {

    #join this record by '|'.
    print join('|',
        $id,
        #turn inner arrays into comma separated via map.
        map {
            my %seen;
            #use grep to remove dupes - e.g. "abc,abc" -> "abc"
            join( ",", grep !$seen{$_}++, @$_ )
        } @{ $stuff{$id} }{@header}
        ),
        "\n";
}

__DATA__
id|Name|app1|app2
1|abc|234|231|
2|xyz|123|215|
1|abc|265|321|
3|asd|213|235|

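This may be overkill for your application, but it should handle arbitrary column headers and any number of duplicates. I coalesce repeated values, so the two abc entries don't end up as abc,abc.

The output (apart from the diagnostic Dumper prints) is: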

id|Name|app1|app2
1|abc|234,265|231,321
2|xyz|123|215
3|asd|213|235
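The de-duplication step relies on the common grep-with-%seen idiom. A small standalone sketch of it, where the helper name uniq_join is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: join values with commas, keeping only the first
# occurrence of each value and preserving the original order.
sub uniq_join {
    my %seen;
    # $seen{$_}++ is 0 (false) the first time a value is met, so
    # !$seen{$_}++ is true only for first occurrences.
    return join ',', grep { !$seen{$_}++ } @_;
}

print uniq_join(qw(abc abc)), "\n";    # abc
print uniq_join(qw(234 265)), "\n";    # 234,265
```

Recent versions of List::Util also export a uniq function that does the same filtering.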

Answer 3

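Another way of doing it that doesn't use hashes (in case you want to be more memory efficient); my contribution is the part below the open calls: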

#!/usr/bin/perl
use strict;
use warnings;
my $basedir = 'E:\Perl\Input\\';
my $file ='doctor.txt';
open(OUTFILE, '>', 'E:\Perl\Output\DoctorOpFile.csv') || die $!;
select(OUTFILE);
open(FH, '<', join('', $basedir, $file)) || die $!;

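#print the header row unchanged.
print(scalar(<FH>));
my @lastobj = (undef);
#sort the rows numerically by id so that duplicates are adjacent.
#the pipe must be escaped: split('|') would split between every character.
foreach my $obj (sort {$a->[0] <=> $b->[0]}
                 map {chomp;[split(/\|/)]} <FH>) {
    #$obj is an array reference, so its elements are read via ->.
    if(defined($lastobj[0]) &&
       $obj->[0] eq $lastobj[0])
      {@lastobj = (@{$obj}[0..1],
                   $lastobj[2].','.$obj->[2],
                   $lastobj[3].','.$obj->[3])}
    else
      {
        #flush the previous record, unless this is the first row.
        if(defined($lastobj[0]))
          {print(join('|',@lastobj),"|\n")}
        @lastobj = @{$obj}[0..3];
      }
}
#flush the last record, if there was any data at all.
print(join('|',@lastobj),"|\n") if defined($lastobj[0]);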

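Note that split, without its third argument, discards trailing empty elements, which is why you have to add the final bar yourself. If you don't chomp, you won't need to supply the bar or the trailing newline, but you will have to keep track of $obj->[4].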
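The split behaviour described above can be checked with a short sketch. The sample line is taken from the question's data; the variable names are illustrative only. The last part shows a related pitfall: the pipe has to be escaped in the pattern.

```perl
use strict;
use warnings;

my $line = '1|abc|234|231|';

# Default: trailing empty fields are dropped, so the final '|' vanishes.
my @default = split /\|/, $line;
print scalar(@default), "\n";      # 4

# A negative limit (the third argument) keeps trailing empty fields.
my @keep = split /\|/, $line, -1;
print scalar(@keep), "\n";         # 5
print join( '|', @keep ), "\n";    # 1|abc|234|231|

# An unescaped pipe is a regex alternation of two empty patterns,
# so it matches between every character.
my @chars = split '|', '1|a';
print scalar(@chars), "\n";        # 3
```

With a limit of -1 the trailing bar survives a split and join round trip, so it no longer has to be appended by hand.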

Perl: need to append two columns if IDs are repeated
