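File 1:
col1 col2 col3 col4 ..... col15
File 2:
col1 col2 col3 col4 ..... col15
File 3:
col1 col2 col3 col4 ..... col15
Every column in each file contains data.
I need to compare the first four columns of the three files and output the rows of File 3 that are common to all three, together with col5 from File 1.
Output:
File 3 (col1 col2 col3 col4 ..... col15) + File 1 (col5)
My code: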
#!/usr/bin/perl -w
use strict;
use warnings;
my $file1 = $ARGV[0];
my $file2 = $ARGV[1];
my $file3 = $ARGV[2];
if($file1 eq "" || $file2 eq "" || $file3 eq "")
{
    print "Incomplete parameters!\n";
    exit;
}
open(FILE1, $file1);
open(FILE2, $file2);
open(FILE3, $file3);
open my $f, '>', "output.txt" or die "Cannot open output.txt: $!";
my @arr1=<FILE1>;
my @arr2=<FILE2>;
my @arr3=<FILE3>;
close FILE1;
close FILE2;
close FILE3;
my %chash;
for (@arr1)
{
    chomp;
    my($col1,$col2,$col3,$col4,$col5,$rest)=split(/\t/);
    my $ckey="$col1$col2$col3$col4";
    $chash{$ckey}=1;
}
for (@arr2)
{
    chomp;
    my($hit1,$hit2,$hit3,$hit4,$hit5,$rest)=split(/\t/);
    my $ckey="$hit1$hit2$hit3$hit4";
    $chash{$ckey}++;
}
for (@arr3)
{
    chomp;
    my($val1,$val2,$val3,$val4,$rest)=split(/\t/);
    my $ckey="$val1$val2$val3$val4";
    $chash{$ckey}++;
    if($chash{$ckey} == 3)
    {
        # this key has been seen in both previous files
        print $f "$_\n";
    }
}
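This code outputs only the common rows. Can anybody help me extract col5 from File 1 along with the common rows of File 3?
Answer 0 (score: 0)
$col5 is declared with my inside the first loop, so it is out of scope by the time the print statement is reached. Process the files in reverse order instead, so that $col5 is still in scope when the print statement runs.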
for (@arr3)
{
    chomp;
    my($val1,$val2,$val3,$val4,$rest)=split(/\t/);
    my $ckey="$val1$val2$val3$val4";
    $chash{$ckey}=1;
}
for (@arr2)
{
    chomp;
    my($hit1,$hit2,$hit3,$hit4,$rest)=split(/\t/); # you don't need $hit5 here
    my $ckey="$hit1$hit2$hit3$hit4";
    $chash{$ckey}++;
}
for (@arr1)
{
    chomp;
    my($col1,$col2,$col3,$col4,$col5,$rest)=split(/\t/);
    my $ckey="$col1$col2$col3$col4";
    $chash{$ckey}++;
    if($chash{$ckey} == 3)
    {
        # this key has been seen in both previous files
        print $f "$_, $col5\n"; # $col5 is in scope; note that $_ is the File 1 line here
    }
}
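
Since the last loop above runs over File 1, the line it prints is the File 1 row, not the File 3 row the question asks for. One possible alternative, offered only as a sketch, keeps the original file order and stores col5 from File 1 in a hash keyed by the first four columns, so the value is still available while File 3 is being read. The helper names key_of, common_rows and read_lines are made up for this illustration. The "\x1F" key separator and the tab before the appended col5 are assumptions, and if a key occurs more than once in File 1 the last col5 wins.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup key from the first four tab-separated columns.
# "\x1F" is an assumed separator that should not occur in the data; it keeps
# ("ab", "c") and ("a", "bc") from collapsing into the same key.
sub key_of {
    my ($line) = @_;
    my @cols = split /\t/, $line, -1;
    return join "\x1F", map { defined $_ ? $_ : '' } @cols[0 .. 3];
}

# Hypothetical helper: takes three array refs of chomped lines and returns the
# File 3 lines whose key occurs in all three files, each followed by File 1 col5.
sub common_rows {
    my ($rows1, $rows2, $rows3) = @_;

    my %col5_of;    # key => col5 from File 1 (last occurrence wins)
    for my $line (@$rows1) {
        my @cols = split /\t/, $line, -1;
        $col5_of{ key_of($line) } = defined $cols[4] ? $cols[4] : '';
    }

    my %in_file2;   # keys present in File 2
    $in_file2{ key_of($_) } = 1 for @$rows2;

    my @out;
    for my $line (@$rows3) {
        my $key = key_of($line);
        next unless exists $col5_of{$key} && $in_file2{$key};
        push @out, "$line\t$col5_of{$key}";
    }
    return @out;
}

# Read a file into an array ref of lines with trailing newlines removed.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return \@lines;
}

# Only touch the filesystem when three file names are given.
if (@ARGV == 3) {
    my ($rows1, $rows2, $rows3) = map { read_lines($_) } @ARGV;
    open my $out, '>', 'output.txt' or die "Cannot open output.txt: $!";
    print {$out} "$_\n" for common_rows($rows1, $rows2, $rows3);
    close $out;
}
```

Run with the three file names as arguments, this would write each matching File 3 row plus the File 1 col5 value to output.txt. Storing the value in the hash, rather than relying on a loop variable, means the order in which the files are processed no longer matters.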