Question

I have a raw file with the following columns:

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,C,Sell,0.25,2000
02-May-2018,JPM,Sell,0.25,3000
02-May-2018,WFC,Sell,0.25,5000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,GOOG,Sell,0.25,8000
02-May-2018,GOOG,Sell,0.25,9000
02-May-2018,C,Sell,0.25,2000
02-May-2018,AAPL,Sell,0.25,3000

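I am trying to print the original lines whose second-column value appears more than 2 times. For example, since AAPL appears more than 2 times, the desired output would be: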

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000

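So far I have written the following, but it prints the results multiple times, which is wrong. Can you help me figure out what I am doing wrong?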

open (FILE, "<$TMPFILE") or die "Could not open $TMPFILE";
open (OUT, ">$TMPFILE1") or die "Could not open $TMPFILE1";
%count = ();
@symbol = ();
while ($line = <FILE>)
{
        chomp $line;
        (@data) = split(/,/,$line);
         $count{$data[1]}++;
        @keys = sort {$count{$a} cmp $count{$b}} keys %count;
        for my $key (@keys)
        {
        if ( $count{$key} > 2 )
        {
            print "$line\n";
        }
     }
}

Answer 1

I'd do something like this: store the lines you've seen in a 'buffer', and print them out once the condition is hit (before carrying on printing):

#!/usr/bin/env perl

use strict;
use warnings;

my %buffer; 
my %count_of;

while ( my $line = <> ) {  
   my ( $date, $ticker, @values ) = split /,/, $line; 
   #increment the count
   $count_of{$ticker}++;

   if ( $count_of{$ticker} < 3 ) { 
       #count limit not hit, so stash the current line in the buffer. 
       $buffer{$ticker} .= $line; 
       next;
   }
   #print the buffer if the count has been hit
   if ( $count_of{$ticker} == 3 ) {
       print $buffer{$ticker};
   }
   #only gets to here once the limit is hit, so just print normally.
   print $line;
}

With the input data, this outputs:

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000

Answer 2

Simple answer:

push @{ $lines{(split",")[1]} }, $_ while <>;
print @{ $lines{$_} } for grep @{ $lines{$_} } > 2, sort keys %lines;

perl program.pl inputfile > outputfile
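
Note that the two-liner above groups lines by ticker and prints the groups in sorted key order, so when several tickers qualify the output is not in the original file order. A minimal sketch that keeps the original order instead, assuming the whole input fits in memory. The helper names `ticker_of` and `lines_with_repeated_ticker` and the `$min` parameter are hypothetical and not part of the answer; the sample data is taken from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: second comma-separated field of a line ('' if missing).
sub ticker_of {
    my ($line) = @_;
    my @fields = split /,/, $line;
    return $fields[1] // '';
}

# Hypothetical helper: lines whose ticker occurs more than $min times,
# returned in their original order.
sub lines_with_repeated_ticker {
    my ($min, @lines) = @_;
    my %count;
    $count{ ticker_of($_) }++ for @lines;                   # first pass: count tickers
    return grep { $count{ ticker_of($_) } > $min } @lines;  # second pass: filter
}

# Sample rows from the question, split into lines (newlines kept).
my @sample = split /^/m, <<'END';
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,C,Sell,0.25,2000
02-May-2018,JPM,Sell,0.25,3000
02-May-2018,WFC,Sell,0.25,5000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,GOOG,Sell,0.25,8000
02-May-2018,GOOG,Sell,0.25,9000
02-May-2018,C,Sell,0.25,2000
02-May-2018,AAPL,Sell,0.25,3000
END

my @kept = lines_with_repeated_ticker(2, @sample);
print @kept;   # prints the three AAPL lines
```

To run this against a real file, `@sample` could be replaced with `<>` so the lines come from the file named on the command line.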

Answer 3

You need to read the input file twice, because you don't know the final counts until you have reached the end of the file

use strict;
use warnings 'all';

my ($TMPFILE, $TMPFILE1) = qw/ infile outfile /;

my %counts;

{
    open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";

    while ( <$fh> ) {
        my @fields = split /,/;
        ++$counts{$fields[1]};
    }
}

open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";
open my $out_fh, '>', $TMPFILE1 or die "Could not open $TMPFILE1: $!";

while ( <$fh> ) {
    my @fields = split /,/;
    print $out_fh $_ if $counts{$fields[1]} > 2;
}

Output

02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000
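
The second pass does not strictly require opening the file a second time: on a seekable handle (a regular file, not a pipe) the same handle can be rewound with `seek`. A minimal sketch of that variant, not the answer's own code. It reads the question's sample data through an in-memory filehandle purely so the example is self-contained; a real script would open `$TMPFILE` instead:

```perl
use strict;
use warnings;

# Sample rows from the question, held in a string for illustration only.
my $data = <<'END';
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,C,Sell,0.25,2000
02-May-2018,JPM,Sell,0.25,3000
02-May-2018,WFC,Sell,0.25,5000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,GOOG,Sell,0.25,8000
02-May-2018,GOOG,Sell,0.25,9000
02-May-2018,C,Sell,0.25,2000
02-May-2018,AAPL,Sell,0.25,3000
END

# In-memory filehandle (assumption for the demo); it supports seek like a file.
open my $fh, '<', \$data or die "Could not open in-memory data: $!";

# Pass 1: count how often each second-column value occurs.
my %counts;
while ( my $line = <$fh> ) {
    my @fields = split /,/, $line;
    ++$counts{ $fields[1] };
}

# Rewind to the start of the input instead of reopening it.
seek $fh, 0, 0 or die "Could not rewind: $!";

# Pass 2: keep the lines whose value was seen more than twice.
my @kept;
while ( my $line = <$fh> ) {
    my @fields = split /,/, $line;
    push @kept, $line if $counts{ $fields[1] } > 2;
}

print @kept;   # prints the three AAPL lines, in file order
```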

Answer 4

This should work:

use strict;
use warnings;

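# Placeholder file names; replace with your own input and output paths.
my ($TMPFILE, $TMPFILE1) = qw/ infile outfile /;

open (FILE, "<$TMPFILE") or die "Could not open $TMPFILE";
open (OUT, ">$TMPFILE1") or die "Could not open $TMPFILE1";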

my %data;
while ( my $line = <FILE> ) {
    chomp $line;
    my @line = split /,/, $line;
    push(@{$data{$line[1]}}, $line); 
}

foreach my $key (keys %data) {
    if(@{$data{$key}} > 2) {
        print OUT "$_\n" foreach @{$data{$key}};   # write to the output file opened above
    }
}
