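Creating a CSV file with Perl

Time: 2014-09-25 10:07:50

Tags: perl csv

I have multiple CSV files and want to create a master file that contains each unique entry, along with which files that entry appears in. I can't figure out how to create the columns.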

File 1

fragment
accb
bbc
ccd

File 2

fragment
ccd
llk
kks

Output

fragment    file 1    file 2
accb        1         0
bbc         1         0
ccd         1         1
llk         0         1
kks         0         1


use strict;
use warnings;
use feature qw(say);
use autodie;

use constant {
    FILE_1 => "file1.csv",
    FILE_2 => "file2.csv",
};

my %hash;
#
# Load the Hash with value from File #1
#
open my $file1_fh, "<", FILE_1;
while ( my $value = <$file1_fh> ) {
    chomp $value;
    $hash{$value} = 1;
}
close $file1_fh;
#
# Add File #2 to the Hash
#
open my $file2_fh, "<", FILE_2;
while ( my $value = <$file2_fh> ) {
    chomp $value;
    $hash{$value} = 1;    #If that value was in "File #1", it will be "replaced"
}
close $file2_fh;

for my $value ( sort keys %hash ) {
    say $value;
}

1 answer:

Answer 0 (score: 0)

In this case, encoding the information in the hash values is a good approach. One way to do it is as follows:

my %hash;
#
# Load the Hash with value from File #1
#
open my $file1_fh, "<", FILE_1;
while ( my $value = <$file1_fh> ) {
    chomp $value;
    $hash{$value}++;
}
close $file1_fh;
#
# Add File #2 to the Hash
#
open my $file2_fh, "<", FILE_2;
while ( my $value = <$file2_fh> ) {
    chomp $value;
    $hash{$value} += 10;   # if the key already exists, the value will now be 11
                           # if it did not exist, the value will be 10
}
close $file2_fh;

# Columns are: entry, file 1 flag, file 2 flag
for my $k ( sort keys %hash ) {
    if ( $hash{$k} == 1 ) {        # only in file 1
        say "$k\t1\t0";
    }
    elsif ( $hash{$k} == 10 ) {    # only in file 2
        say "$k\t0\t1";
    }
    else {                         # in both file 1 and file 2
        say "$k\t1\t1";
    }
}

You can extend this approach to more files by using 100, 1000, 10000, and so on, provided each entry appears at most once per file.
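Because the addition-based encoding counts repeats, an entry that occurs twice in the same file would be misclassified. A possible variant, sketched here as an assumption rather than as part of the original answer, gives each file one bit (1, 2, 4, ...) and combines them with bitwise OR, which is unaffected by repeats. The sub names `presence_mask` and `mask_to_flags` are hypothetical, and the sketch treats every non-blank line as an entry, header lines included.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of file names and returns a hash ref
# mapping each entry to a bit mask (bit 0 = first file, bit 1 = second, ...).
sub presence_mask {
    my @files = @_;
    my %mask;
    for my $i ( 0 .. $#files ) {
        open my $fh, "<", $files[$i] or die "Could not open $files[$i]: $!";
        while ( my $value = <$fh> ) {
            chomp $value;
            next if $value eq "";    # skip blank lines
            # OR-ing the same bit twice changes nothing, so repeats are harmless
            $mask{$value} = ( $mask{$value} // 0 ) | ( 1 << $i );
        }
        close $fh;
    }
    return \%mask;
}

# Turn a mask back into a list of 0/1 flags, one per file.
sub mask_to_flags {
    my ( $mask, $n_files ) = @_;
    return map { ( $mask >> $_ ) & 1 } 0 .. $n_files - 1;
}

# Example use (file names are placeholders):
#   my @files = ( "file1.csv", "file2.csv" );
#   my $mask  = presence_mask(@files);
#   print join( "\t", $_, mask_to_flags( $mask->{$_}, scalar @files ) ), "\n"
#       for sort keys %$mask;
```

The number of files is limited by the integer width of the Perl build, so for a large set of files the nested-hash structure is likely the better fit.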

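Another possibility is to build a more complex data structure that records the names of the files in which each entry appears, for example: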

for my $file (@array_of_files) {
    # report the file name, not the (undefined) filehandle, on failure
    open my $f, "<", $file or die "Could not open $file: $!";
    while (my $l = <$f>) {
        chomp($l);
        $hash{$l}{$file}++; # store the file name
    }
    close $f;
}

This is useful if you have a large number of files or want the hash data to be more descriptive and easier to understand.
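To get from that nested hash to the table in the question, one option is to emit one 0/1 column per file. The following is a minimal sketch under stated assumptions: the sub names `load_presence` and `presence_rows` are hypothetical, the first line of each file is assumed to be a column header (such as `fragment`) and is skipped, and rows are tab-separated.

```perl
use strict;
use warnings;

# Hypothetical helper: reads each file and records, per entry,
# which files contain it.
sub load_presence {
    my @files = @_;
    my %seen;
    for my $file (@files) {
        open my $fh, "<", $file or die "Could not open $file: $!";
        my $header = <$fh>;    # assumption: first line is a column header
        while ( my $line = <$fh> ) {
            chomp $line;
            next if $line eq "";    # skip blank lines
            $seen{$line}{$file}++;
        }
        close $fh;
    }
    return \%seen;
}

# Returns the table as a list of tab-separated rows, header row first.
sub presence_rows {
    my ( $seen, @files ) = @_;
    my @rows = ( join "\t", "fragment", @files );
    for my $entry ( sort keys %$seen ) {
        push @rows, join "\t", $entry,
            map { exists $seen->{$entry}{$_} ? 1 : 0 } @files;
    }
    return @rows;
}

# Example use (file names are placeholders):
#   my @files = ( "file1.csv", "file2.csv" );
#   print "$_\n" for presence_rows( load_presence(@files), @files );
```

Passing the same file list to both subs keeps the column order under the caller's control, and adding a file only adds a column.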