Matching columns across different files

Time: 2012-10-26 12:54:52

Tags: perl

I have a file, file1, that looks like this:

file1

List      ID 
1         NM_00012  
2         NM_00013   
2         NM_00013
3         NM_00021  
3         NM_00021
4         NM_000254
5         NM_000765

and a second file, file2, that looks like this:

file2

List      Count 
1         Gene1 
2         Gene2
2         Gene2
3         Gene3 
3         Gene3 
4         Gene4
5         Gene5

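I want the following output, where each ID from file1 is paired with the file2 entry that has the same List number and duplicates are removed:

file3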

List       Count 
NM_00012   Gene1    
NM_00013   Gene2        
NM_00021   Gene3                
NM_000254  Gene4        
NM_000765  Gene5

Can anyone help me? I am new to Perl.

Thanks in advance!

2 answers:

Answer 0: (score: 1)

Well, this is simple and straightforward to implement:

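use strict;
use warnings;

# Lexical filehandles, three-argument open, and explicit error checks.
open my $fh1, '<', 'file1.txt'  or die "Cannot open file1.txt: $!";
open my $fh2, '<', 'file2.txt'  or die "Cannot open file2.txt: $!";
open my $out, '>', 'output.txt' or die "Cannot open output.txt: $!";

my %file1content = ProcessFile($fh1);
my %file2content = ProcessFile($fh2);

# Build a hash that maps column 1 (List) to column 2 for one file.
sub ProcessFile {
    my $fh = shift;
    my %ret;
    # Read line by line so that a blank line does not end the loop early.
    while (my $line = <$fh>) {
        my @arr = split ' ', $line;
        next unless @arr == 2;             # skip blank or malformed lines
        next unless $arr[0] =~ /^\d+$/;    # skip the header line
        $ret{$arr[0]} = $arr[1];
    }
    return %ret;
}

# Sort the List numbers numerically so that 10 comes after 9.
foreach my $key (sort { $a <=> $b } keys %file1content) {
    next unless exists $file2content{$key};    # no matching row in file2
    print {$out} "$file1content{$key}\t$file2content{$key}\n";
}
close $out;
close $fh1;
close $fh2;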

Answer 1: (score: 0)

You could do something like this (untested):

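use strict;
use warnings;

my (%hash1, %hash2);

# First pass: map each List number in file2 to its gene name.
open my $in2, '<', 'file2.txt' or die "Cannot open file2.txt: $!";
while (<$in2>) {
  chomp;
  my ($list, $count) = split ' ';
  next unless defined $count;        # skip blank lines
  $hash1{$list} = $count;
}
close $in2;

# Second pass: print each ID from file1 once, with its gene name.
# The two header lines are matched like data, so the first line printed is "ID Count".
open my $in1, '<', 'file1.txt' or die "Cannot open file1.txt: $!";
while (<$in1>) {
  chomp;
  my ($list, $ID) = split ' ';
  next unless defined $ID && exists $hash1{$list};
  if (! exists $hash2{$ID}) {
    print "$ID $hash1{$list}\n";
    $hash2{$ID} = 1;
  }
}
close $in1;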
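
Both answers rely on the same idea: index one file by its List number in a hash, then look up each row of the other file. Here is a minimal sketch of that lookup as a reusable routine. It assumes the data fits in memory and that List values are plain integers. The name `join_on_list` and the inline sample lines are illustrative only and do not come from either answer.

```perl
use strict;
use warnings;

# Hypothetical helper: joins two sets of lines on their first column and
# returns "ID<TAB>Gene" rows, one per distinct ID, in file1 order.
sub join_on_list {
    my ($id_lines, $gene_lines) = @_;

    # Map each List number to its gene name (second file).
    my %gene_for;
    for my $line (@$gene_lines) {
        my ($list, $gene) = split ' ', $line;
        next unless defined $gene && $list =~ /^\d+$/;   # skip header and blank lines
        $gene_for{$list} = $gene;
    }

    # Walk the first file in order and emit each ID only once.
    my (%seen, @rows);
    for my $line (@$id_lines) {
        my ($list, $id) = split ' ', $line;
        next unless defined $id && $list =~ /^\d+$/;     # skip header and blank lines
        next if $seen{$id}++;                            # duplicate ID
        next unless exists $gene_for{$list};             # no match in the second file
        push @rows, "$id\t$gene_for{$list}";
    }
    return @rows;
}

# Sample lines copied from the question (shortened).
my @file1 = ("List      ID", "1         NM_00012", "2         NM_00013", "2         NM_00013");
my @file2 = ("List      Count", "1         Gene1", "2         Gene2", "2         Gene2");

print "$_\n" for join_on_list(\@file1, \@file2);
```

To use it with real files, read each file into an array first (for example `my @file1 = <$fh1>;`) and pass references to both arrays. Because the routine walks file1 in its original order, the output needs no sorting.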