This is a common question, but my case is a little different. I have 10 files and I want to extract the lines they have in common. I found:
perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2 file3 file4
Or, on Linux:
comm [-1] [-2] [-3] file1 file2
But what if the files have 3 (or more) columns and I want to compare only the first 2 (or more) columns, ignoring the last one?
file1:
Col1 col2 col3
A    1    0
A    2    1

file2:
Col1 col2 col3
A    2    0.5
A    1    10
B    1    10

Desired output:
Col1 col2 file1 file2
A    1    0     10
A    2    1     0.5
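So if I have 10 files, the output should have 10 extra columns, one per file. Can this also be done as a Perl one-liner (by modifying the one above), or what else could we do?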
Answer 0 (score: 1)
use strict;
use warnings;
use Array::Utils qw(intersect);

my $first_file = shift(@ARGV);
my @common_lines = ();

# Grab the first two columns of every line in the first file.
# Note: this assumes the columns are tab-separated.
open(my $read, "<", $first_file) or die $!;
while (<$read>)
{
    chomp;
    my @arr = split /\t/;
    @arr = @arr[0,1];    # Only take the first two columns.
    push @common_lines, join("\t", @arr);
}
close($read);

foreach my $file (@ARGV)
{
    my @matched_lines = ();
    open($read, "<", $file) or die $!;
    while (<$read>)
    {
        chomp;
        my @arr = split /\t/;
        @arr = @arr[0,1];
        my $to_check = join("\t", @arr);
        # If $to_check is in @common_lines, put it in @matched_lines.
        if (grep { $_ eq $to_check } @common_lines)
        {
            push @matched_lines, $to_check;
        }
    }
    close($read);

    # Take out elements of @common_lines that aren't in @matched_lines.
    @common_lines = intersect(@common_lines, @matched_lines);
    unless (@common_lines)
    {
        print "No lines are common amongst the files!\n";
        last;    # Nothing left to match, so stop reading further files.
    }
}

# Print the key columns that appear in every file.
foreach (@common_lines)
{
    print "$_\n";
}
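
The script above prints only the shared key columns. To also get one value column per file, as in the desired output, something along the following lines might work. This is only a sketch and not the answerer's code: `merge_common` is a hypothetical helper name, and it assumes that every file starts with a header row, that fields are separated by whitespace, and that each file has a single value column after the key columns. If a key is repeated within one file, only its first value is kept.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# merge_common is a hypothetical helper, not part of the answer above.
# $n_keys: number of leading columns that form the comparison key.
# Returns the output lines (header first), tab-separated.
sub merge_common {
    my ($n_keys, @files) = @_;
    my %values;      # key => { file index => value column(s) }
    my @order;       # keys in the order they appear in the first file
    my @key_names;   # key column names, taken from the first file's header

    for my $i (0 .. $#files) {
        open(my $fh, '<', $files[$i]) or die "Cannot open $files[$i]: $!";
        my $header = <$fh>;                      # assumption: one header row per file
        if ($i == 0 && defined $header) {
            my @names = split ' ', $header;
            @key_names = @names[0 .. $n_keys - 1];
        }
        while (my $line = <$fh>) {
            my @cols = split ' ', $line;         # assumption: whitespace-separated
            next if @cols <= $n_keys;            # skip blank or short lines
            my $key = join "\t", @cols[0 .. $n_keys - 1];
            if ($i == 0) {
                push @order, $key unless exists $values{$key};
            }
            elsif (!exists $values{$key}) {
                next;                            # key absent from the first file
            }
            # Keep only the first value seen for a key within one file.
            $values{$key}{$i} = join "\t", @cols[$n_keys .. $#cols]
                unless exists $values{$key}{$i};
        }
        close($fh);
    }

    my @out = (join "\t", @key_names, @files);
    for my $key (@order) {
        # A key is common only if every file contributed a value for it.
        next unless keys(%{ $values{$key} }) == @files;
        push @out, join "\t", $key, map { $values{$key}{$_} } 0 .. $#files;
    }
    return @out;
}

# Run as: perl merge_common.pl file1 file2 ... file10
unless (caller) {
    print "$_\n" for merge_common(2, @ARGV);
}
```

The header row of the output uses the file names exactly as given on the command line. Storing the keys in a hash also avoids rescanning the whole list with grep for every input line, which may matter once there are 10 large files.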