Question

I have the following situation. I have a file that looks like this:

file1

List      ID 
1         NM_00012  
2         NM_00013   
2         NM_00013
3         NM_00021  
3         NM_00021
4         NM_000254
5         NM_000765

And a second file that looks like this:

file2

List      Count 
1         Gene1 
2         Gene2
2         Gene2
3         Gene3 
3         Gene3 
4         Gene4
5         Gene5

I would like the following output:

file3

List       Count 
NM_00012   Gene1    
NM_00013   Gene2        
NM_00021   Gene3                
NM_000254  Gene4        
NM_000765  Gene5

Can anyone help me? I am new to Perl.

Thanks in advance!

Answer 1

Well, this is fairly simple and straightforward to implement:

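use strict;
use warnings;

# Open both inputs and the output, failing loudly if any open fails.
open my $fh1, '<', 'file1.txt'  or die "Cannot open file1.txt: $!";
open my $fh2, '<', 'file2.txt'  or die "Cannot open file2.txt: $!";
open my $out, '>', 'output.txt' or die "Cannot open output.txt: $!";

my %file1content = ProcessFile($fh1);
my %file2content = ProcessFile($fh2);

# Read a two-column file into a hash keyed by the first column.
sub ProcessFile {
    my $fh = shift;
    my %ret;
    while (my $line = <$fh>) {
        my @arr = split ' ', $line;         # split on whitespace, ignoring leading blanks
        next unless @arr == 2;              # skip blank or malformed lines
        next unless $arr[0] =~ /^\d+$/;     # skip the header row
        $ret{$arr[0]} = $arr[1];
    }
    return %ret;
}

# Print the ID and gene name for each list number, in numeric order.
foreach my $key (sort { $a <=> $b } keys %file1content) {
    print $out $file1content{$key}, "\t", $file2content{$key}, "\n";
}
close $out;
close $fh1;
close $fh2;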
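
The repeated rows in the input disappear because a hash keeps only one value per key. A minimal sketch of that step, reading from an in-memory string instead of a file so it runs on its own (the helper name parse_pairs and the sample data are made up for illustration, not part of the answer above):

```perl
use strict;
use warnings;

# Hypothetical helper: parse two-column text into a hash keyed by column 1.
sub parse_pairs {
    my ($text) = @_;
    my %pairs;
    # Opening a reference to a scalar gives an in-memory filehandle.
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    while (my $line = <$fh>) {
        my @cols = split ' ', $line;
        # Skip the header row and any blank or malformed line.
        next unless @cols == 2 && $cols[0] =~ /^\d+$/;
        # A repeated key simply overwrites itself with the same value.
        $pairs{ $cols[0] } = $cols[1];
    }
    close $fh;
    return %pairs;
}

my $sample = "List ID\n1 NM_00012\n2 NM_00013\n2 NM_00013\n3 NM_00021\n";
my %ids = parse_pairs($sample);
print join(",", map { "$_=$ids{$_}" } sort { $a <=> $b } keys %ids), "\n";
# prints: 1=NM_00012,2=NM_00013,3=NM_00021
```

Note that this only works if every list number always maps to the same ID; if the same number could carry different IDs, the last one read would win.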

Answer 2

You could do something like this (untested):

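use strict;
use warnings;

my (%hash1, %hash2, $list, $count, $ID);

# Map each list number in file2 to its gene name.
open my $fh, '<', 'file2.txt' or die "Cannot open file2.txt: $!";
while (<$fh>) {
  chomp;
  next unless /\S/;               # skip blank lines
  ($list, $count) = split /\s+/;
  $hash1{$list} = $count;
}
close $fh;

# Print each ID from file1 once, next to the gene name for its list number.
open $fh, '<', 'file1.txt' or die "Cannot open file1.txt: $!";
while (<$fh>) {
  chomp;
  next unless /\S/;               # skip blank lines
  ($list, $ID) = split /\s+/;
  if (!exists $hash2{$ID}) {
    print "$ID $hash1{$list}\n";
    $hash2{$ID} = 1;
  }
}
close $fh;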
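
For trying the idea without creating files, here is a rough sketch of the same lookup-then-deduplicate logic as a function over strings. The name join_by_list and the shortened sample data are assumptions made for the example; rows are kept in the order they first appear in the first input.

```perl
use strict;
use warnings;

# Hypothetical helper: pair each ID in $ids_text with the name that shares
# its list number in $names_text, keeping first-seen order and dropping repeats.
sub join_by_list {
    my ($ids_text, $names_text) = @_;

    # First pass: list number -> name.
    my %name_for;
    for my $line (split /\n/, $names_text) {
        my ($list, $name) = split ' ', $line;
        next unless defined $name;          # skip blank or one-column lines
        $name_for{$list} = $name;
    }

    # Second pass: emit each ID once, with the name for its list number.
    my (%seen, @rows);
    for my $line (split /\n/, $ids_text) {
        my ($list, $id) = split ' ', $line;
        next unless defined $id;            # skip blank or one-column lines
        next if $seen{$id}++;               # already printed this ID
        next unless exists $name_for{$list};
        push @rows, "$id\t$name_for{$list}";
    }
    return @rows;
}

my $file1 = "List ID\n1 NM_00012\n2 NM_00013\n2 NM_00013\n";
my $file2 = "List Count\n1 Gene1\n2 Gene2\n2 Gene2\n";
print "$_\n" for join_by_list($file1, $file2);
```

Because the header rows go through the same lookup ("List" maps to "Count"), the first row returned is the header pair ID and Count, followed by one row per distinct ID.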

Matching columns across different files
