Repeated values in a column

Time: 2018-05-31 15:00:35

Tags: perl

I have a raw file that contains the following columns:

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,C,Sell,0.25,2000
02-May-2018,JPM,Sell,0.25,3000
02-May-2018,WFC,Sell,0.25,5000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,GOOG,Sell,0.25,8000
02-May-2018,GOOG,Sell,0.25,9000
02-May-2018,C,Sell,0.25,2000
02-May-2018,AAPL,Sell,0.25,3000

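I am trying to print the original line whenever the value in its second column occurs more than 2 times in the file. For example, AAPL occurs more than 2 times, so the desired result is: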

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000

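So far I have written the following, but it prints the results multiple times, which is wrong. Can you help me work out what I am doing wrong?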

open (FILE, "<$TMPFILE") or die "Could not open $TMPFILE";
open (OUT, ">$TMPFILE1") or die "Could not open $TMPFILE1";
%count = ();
@symbol = ();
while ($line = <FILE>)
{
        chomp $line;
        (@data) = split(/,/,$line);
         $count{$data[1]}++;
        @keys = sort {$count{$a} cmp $count{$b}} keys %count;
        for my $key (@keys)
        {
        if ( $count{$key} > 2 )
        {
            print "$line\n";
        }
     }
}

4 answers:

Answer 0 (score: 0)

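I would do something like this: store the lines you have already seen in a 'buffer', and once the condition is hit, print the buffer out before carrying on printing as normal: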

#!/usr/bin/env perl

use strict;
use warnings;

my %buffer; 
my %count_of;

while ( my $line = <> ) {  
   my ( $date, $ticker, @values ) = split /,/, $line; 
   #increment the count
   $count_of{$ticker}++;

   if ( $count_of{$ticker} < 3 ) { 
       #count limit not hit, so stash the current line in the buffer. 
       $buffer{$ticker} .= $line; 
       next;
   }
   #print the buffer if the count has been hit
   if ( $count_of{$ticker} == 3 ) {
       print $buffer{$ticker};
   }
   #only gets to here once the limit is hit, so just print normally.
   print $line;
}

With the input data, this outputs:

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000

Answer 1 (score: 0)

Simple answer:

push @{ $lines{(split",")[1]} }, $_ while <>;
print @{ $lines{$_} } for grep @{ $lines{$_} } > 2, sort keys %lines;

perl program.pl inputfile > outputfile
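
If the two-liner is hard to follow, here is a longer sketch of the same idea: group the lines by their second column, then print every group that has more than two lines. The sub name `repeated_lines` and the in-memory sample rows (copied from the question) are assumptions made for illustration; the original reads from the files named on the command line instead. As in the two-liner, groups come out in sorted key order rather than file order.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): returns the lines
# whose second comma-separated field occurs more than $threshold times.
sub repeated_lines {
    my ($threshold, @lines) = @_;

    my %lines_for;    # second-column value => array ref of lines, in input order
    for my $line (@lines) {
        my $key = (split /,/, $line)[1];
        next unless defined $key;    # skip lines that have no second column
        push @{ $lines_for{$key} }, $line;
    }

    # Collect only the groups with more than $threshold lines, by sorted key
    my @out;
    for my $key (sort keys %lines_for) {
        push @out, @{ $lines_for{$key} } if @{ $lines_for{$key} } > $threshold;
    }
    return @out;
}

# Sample rows taken from the question (newline-terminated, as if read from a file)
my @input = map { "$_\n" } (
    '02-May-2018,AAPL,Sell,0.25,1000',
    '02-May-2018,C,Sell,0.25,2000',
    '02-May-2018,AAPL,Sell,0.25,7000',
    '02-May-2018,GOOG,Sell,0.25,8000',
    '02-May-2018,AAPL,Sell,0.25,3000',
);

# Prints the three AAPL lines
print repeated_lines(2, @input);
```

Holding every line in memory is fine for small files; for very large inputs the two-pass approach may be the better fit.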

Answer 2 (score: -1)

You need to read the input file twice, because you do not know the final counts until you have reached the end of the file:

use strict;
use warnings 'all';

my ($TMPFILE, $TMPFILE1) = qw/ infile outfile /;

my %counts;

{
    open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";

    while ( <$fh> ) {
        my @fields = split /,/;
        ++$counts{$fields[1]};
    }
}

open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";
open my $out_fh, '>', $TMPFILE1 or die "Could not open $TMPFILE1: $!";

while ( <$fh> ) {
    my @fields = split /,/;
    print $out_fh $_ if $counts{$fields[1]} > 2;
}

Output

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000

Answer 3 (score: -1)

This should work:

use strict;
use warnings;

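# $TMPFILE and $TMPFILE1 must be declared under "use strict";
# the file names below are placeholders
my ($TMPFILE, $TMPFILE1) = qw/ infile outfile /;

open (FILE, "<", $TMPFILE) or die "Could not open $TMPFILE: $!";
open (OUT, ">", $TMPFILE1) or die "Could not open $TMPFILE1: $!";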

my %data;
while ( my $line = <FILE> ) {
    chomp $line;
    my @line = split /,/, $line;
    push(@{$data{$line[1]}}, $line); 
}

foreach my $key (keys %data) {
    if(@{$data{$key}} > 2) {
        print OUT "$_\n" foreach @{$data{$key}};
    }
}