Question

I have a problem I need to solve.

I have two files.

File A (col1, col2, col3):

201843,12345,30
201844,33333,10

File B (col1, col2, col3, col4, col5, col6):

201843,12345,1,2,0,5
201843,12345,2,4,0,5
201843,12345,3,4,2,5
201843,12345,4,4,5,5
201844,33333,1,0,0,10
201844,33333,2,0,0,10
201844,33333,3,0,9,10
201844,33333,4,0,9,10
201844,33333,5,0,10,10

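I need to count how many rows in File B match File A, subject to two conditions. Condition 1: my key is col1 plus col2, and it must match between the two files. Condition 2: col5 in File B must be greater than zero.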

So the result for each row of File B should look like this, with the count added as a new column in the last position:

201843,12345,3,4,2,5,2
201843,12345,4,4,5,5,2
201844,33333,3,0,9,10,3
201844,33333,4,0,9,10,3
201844,33333,5,0,10,10,3

But this is the result I get, which is not what I want:

201843,12345,3,4,2,5,5
201843,12345,4,4,5,5,5
201844,33333,3,0,9,10,5
201844,33333,4,0,9,10,5
201844,33333,5,0,10,10,5

I used this script:

#!/usr/bin/perl

use strict;
use warnings;
$|=1;

my $FILEA = $ARGV[0];
my $FILEB = $ARGV[1];

open ( FA, '<', $FILEA ) || die ( "File $FILEA Could not be found!" );
open ( FB, '<', $FILEB ) || die ( "File $FILEB Could not be found!" );


my %hash;
while ( <FA> ){
        chomp;
        my($col1, $col2, $col3) = split ",";
        $hash{$col1,$col2}=$col3;

}

my $count=0;
while ( <FB> ){
        chomp;
        my($cl1, $cl2, $cl3, $cl4, $cl5, $cl6) = split ",";
        if(exists($hash{$cl1,$cl2}) and ($cl5 > 0)){
        $count++;
        }
        if ($cl5 > 0){
                print join(",",$$cl1, $cl2, $cl3, $cl4, $cl5, $cl6,$count);
        }
}

Answer 1

Something like this:


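To get the correct count, you have to read every line that counts toward a given prefix before printing any line with that prefix. If the input file is known to be sorted, you could write a smarter implementation that takes advantage of that, rather than having to read the whole file before producing any output.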
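The sorted-input idea can be sketched as follows. This is only a minimal sketch under stated assumptions: File B is assumed to be already grouped by col1,col2 (as in the question's sample), and the names `count_grouped`, `$keys`, and `$lines` are hypothetical, not taken from the original post. It buffers only the rows of the current key and emits them with the group's final count as soon as the key changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post).
# $keys:  hash ref keyed by "col1,col2", built from File A.
# $lines: array ref of File B rows, ASSUMED grouped by col1,col2.
sub count_grouped {
    my ($keys, $lines) = @_;
    my (@out, @group);
    my $current = '';

    # Emit the buffered rows of the finished group with the group's size.
    my $flush = sub {
        my $n = @group;
        push @out, "$_,$n" for @group;
        @group = ();
    };

    for my $line (@$lines) {
        my @cols = split /,/, $line;
        my $key  = "$cols[0],$cols[1]";

        # A change of key means the previous group is complete.
        if ($key ne $current) {
            $flush->();
            $current = $key;
        }

        next unless exists($keys->{$key}) && $cols[4] > 0;
        push @group, $line;
    }
    $flush->();    # flush the last group
    return @out;
}

# Sample data from the question.
my %keys  = ('201843,12345' => 30, '201844,33333' => 10);
my @lines = (
    '201843,12345,1,2,0,5',
    '201843,12345,2,4,0,5',
    '201843,12345,3,4,2,5',
    '201843,12345,4,4,5,5',
    '201844,33333,1,0,0,10',
    '201844,33333,2,0,0,10',
    '201844,33333,3,0,9,10',
    '201844,33333,4,0,9,10',
    '201844,33333,5,0,10,10',
);
print "$_\n" for count_grouped(\%keys, \@lines);
```

The trade-off is that memory use is bounded by the largest group rather than by the whole file, but the result is only correct if rows with the same key really are adjacent in the input.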

Answer 2

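Your code as posted does not produce any output. It fails with an error (note the `$$cl1` in the print statement). However, once you fix that and add the missing newline, you get the following: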

201843,12345,3,4,2,5,1
201843,12345,4,4,5,5,2
201844,33333,3,0,9,10,3
201844,33333,4,0,9,10,4
201844,33333,5,0,10,10,5

Note that the counter increases on every line; it does not stay at 5 as in your output.

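This is an important clue as to what is going wrong here. Or rather, the two things that are going wrong. First, you can't have just one counter; you need to keep a separate counter for each col1/col2 combination. Second, you can't start printing any output until you have processed both files, because until you have seen all of FileB you don't know what values those counters will reach.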

Here is how I would rewrite your code. It could probably be simplified, but I'm afraid I don't have the time.

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';
open my $fh_a, '<', 'FileA' or die $!;

# Phase 1: Read FileA into a hash
# This is very similar to your existing code
my %file_a;

while (<$fh_a>) {
  chomp;
  my @cols = split /,/;
  $file_a{"$cols[0],$cols[1]"} = $cols[2];
}

# Phase 2: Process FileB
# Store data in two variables.
# %counts contains the current value of the various counters.
# @output contains one array ref for each line you want to output.
# The sub-arrays all contain two elements.
# The first element is the input line from FileB.
# The second element is the key you need to get the correct count
# for this line.
open my $fh_b, '<', 'FileB' or die $!;

my %counts;
my @output;

while (<$fh_b>) {
  chomp;
  my @cols = split /,/;

  next unless exists $file_a{"$cols[0],$cols[1]"};
  next unless $cols[4] > 0;

  ++$counts{"$cols[0],$cols[1]"};
  push @output, [$_, "$cols[0],$cols[1]"];
}

# Phase 3: Produce output.
# Walk the @output array and display a) the original line
# from FileB and b) the final counter value for lines of that type.
for (@output) {
  say join ',', $_->[0], $counts{$_->[1]};
}
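
A possible variation on the code above, offered only as a sketch: instead of keeping every matching line in memory, read FileB twice, counting on the first pass and printing on the second. This assumes FileB is a seekable handle (a regular file, not a pipe). The helper name `annotate_two_pass` and its parameters are hypothetical, and the usage part reads the question's sample data through an in-memory filehandle so that it runs without any files on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer).
# $keys: hash ref keyed by "col1,col2", built from FileA.
# $fh:   seekable read handle on FileB.
sub annotate_two_pass {
    my ($keys, $fh) = @_;

    # Returns the "col1,col2" key if the row qualifies, undef otherwise.
    my $match = sub {
        my @cols = split /,/, $_[0];
        my $key  = "$cols[0],$cols[1]";
        return unless exists($keys->{$key}) && $cols[4] > 0;
        return $key;
    };

    # Pass 1: count the qualifying rows for each key.
    my %counts;
    while (my $line = <$fh>) {
        chomp $line;
        my $key = $match->($line);
        $counts{$key}++ if defined $key;
    }

    # Pass 2: rewind, then emit each qualifying row with its final count.
    seek($fh, 0, 0) or die "Cannot rewind: $!";
    my @out;
    while (my $line = <$fh>) {
        chomp $line;
        my $key = $match->($line);
        push @out, "$line,$counts{$key}" if defined $key;
    }
    return @out;
}

# Sample data from the question, read through an in-memory filehandle.
my %keys   = ('201843,12345' => 30, '201844,33333' => 10);
my $file_b = join "\n",
    '201843,12345,1,2,0,5',
    '201843,12345,2,4,0,5',
    '201843,12345,3,4,2,5',
    '201843,12345,4,4,5,5',
    '201844,33333,1,0,0,10',
    '201844,33333,2,0,0,10',
    '201844,33333,3,0,9,10',
    '201844,33333,4,0,9,10',
    '201844,33333,5,0,10,10',
    '';
open my $fh, '<', \$file_b or die $!;
print "$_\n" for annotate_two_pass(\%keys, $fh);
```

Unlike a group-by-group approach, this does not require FileB to be sorted; the cost is reading the file twice.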

Counting values between two files in Perl
