I have a txt file whose data looks like this (TEST):
chr1_10524
chr1_10525
chr1_10562
chr1_8383722
chr1_201327234
chr2_123123
And another txt file whose data looks like this (DATABASE):
chrom chromStart chromEnd name
chr1 67071812 67170812 13_Heterochrom/lo
chr1 201326377 201330777 13_Heterochrom/lo
chr1 8383613 8389213 12_Repressed
chr2 120000 130000 1_Active Promoter
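I would like an output file in which each TEST entry is matched to the DATABASE row on the same chromosome whose chromStart to chromEnd range contains its position, giving something like this: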
chr1_8383722 12_Repressed
chr1_201327234 13_Heterochrom/lo
chr2_123123 1_Active Promoter
Can this be done in Perl? Thanks!
Answer 0 (score: 1)
Try this:
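#!/usr/bin/perl
use warnings;
use strict;

open(my $db, "<", "database.txt") or die "Cannot open < database.txt: $!";
open(my $tst, "<", "test.txt") or die "Cannot open < test.txt: $!";

# Load every DATABASE row as [chrom, chromStart, chromEnd, name].
my @database;
while (<$db>) {
    chomp;
    next if $. == 1;    # skip the header line
    # Limit the split to 4 fields so that names containing spaces
    # (e.g. "1_Active Promoter") stay intact.
    my @fields = split ' ', $_, 4;
    next unless @fields == 4;
    push @database, \@fields;
}
close($db);

while (my $line = <$tst>) {
    chomp($line);
    my ($chr, $pos) = split /_/, $line;
    next unless defined $pos;    # skip blank or malformed lines
    # There is no unique key to look up, so scan every entry and report
    # each interval that contains the position.
    foreach my $entry (@database) {
        if ($chr eq $entry->[0] && $entry->[1] <= $pos && $pos <= $entry->[2]) {
            print "$line $entry->[3]\n";
        }
    }
}
close($tst);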
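If DATABASE is large, one possible variation is to group the intervals by chromosome first, so that each TEST position is compared only with intervals on its own chromosome. The following is a minimal sketch of that idea, not part of the answer above: the sub names `build_index` and `annotate` are hypothetical, and the data is inlined from the question instead of being read from files.

```perl
use strict;
use warnings;

# Group interval records by chromosome.
# Each record is [chrom, chromStart, chromEnd, name].
sub build_index {
    my @records = @_;
    my %by_chrom;
    push @{ $by_chrom{ $_->[0] } }, $_ for @records;
    return \%by_chrom;
}

# Return the names of all intervals that contain a "chrN_position" key.
sub annotate {
    my ( $index, $key ) = @_;
    my ( $chr, $pos ) = split /_/, $key;
    return () unless defined $pos && exists $index->{$chr};
    return map { $_->[3] }
        grep { $_->[1] <= $pos && $pos <= $_->[2] } @{ $index->{$chr} };
}

# Sample rows copied from the DATABASE example in the question.
my $index = build_index(
    [ 'chr1', 67071812,  67170812,  '13_Heterochrom/lo' ],
    [ 'chr1', 201326377, 201330777, '13_Heterochrom/lo' ],
    [ 'chr1', 8383613,   8389213,   '12_Repressed' ],
    [ 'chr2', 120000,    130000,    '1_Active Promoter' ],
);

for my $key (qw(chr1_10524 chr1_8383722 chr1_201327234 chr2_123123)) {
    print "$key $_\n" for annotate( $index, $key );
}
```

With the sample data this prints the three matching lines requested in the question. Each chromosome's list is still scanned linearly, so this only narrows the search rather than replacing it.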
Answer 1 (score: 0)
Perhaps the following will help:
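use strict;
use warnings;

my %positions;    # chromosome => list of positions taken from TEST

# First file on the command line (TEST): collect positions per chromosome.
while (<>) {
    chomp;
    my ( $chr, $pos ) = split /_/;
    push @{ $positions{$chr} }, $pos if defined $pos;
    last if eof;    # stop at the end of TEST; the next loop reads DATABASE
}

# Second file (DATABASE): print each TEST position that falls inside the
# interval on the current row. Output follows the row order of DATABASE.
while (<>) {
    chomp;
    my ( $chr, $start, $end, $name ) = split ' ', $_, 4;
    next unless defined $name && $start =~ /^\d+$/;    # skip header and blank lines
    for my $pos ( @{ $positions{$chr} || [] } ) {
        print "${chr}_$pos $name\n" if $start <= $pos && $pos <= $end;
    }
}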
Command-line usage: perl script.pl TEST DATABASE [>outFile]
The last, optional argument (typed without the square brackets) redirects the output to a file.
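The script reads both files through the `<>` operator and relies on `last if eof` to stop the first loop at the end of the first file. The sketch below illustrates just that idiom in isolation. The helper names `write_temp` and `read_two_files` are hypothetical, and temporary files stand in for TEST and DATABASE.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the given lines to a temporary file and return its path.
sub write_temp {
    my @lines = @_;
    my ( $fh, $path ) = tempfile( UNLINK => 1 );
    print {$fh} "$_\n" for @lines;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Read the named files with two consecutive <> loops.
# Returns two array refs: lines of the first file, lines of the rest.
sub read_two_files {
    local @ARGV = @_;
    my ( @first, @second );
    while (<>) {
        chomp;
        push @first, $_;
        last if eof;    # eof without parentheses: end of the current file
    }
    while (<>) {        # continues with the next file in @ARGV
        chomp;
        push @second, $_;
    }
    return ( \@first, \@second );
}

my $test_file = write_temp( 'chr1_10524', 'chr2_123123' );
my $db_file   = write_temp(
    'chrom chromStart chromEnd name',
    'chr2 120000 130000 1_Active Promoter',
);

my ( $test, $db ) = read_two_files( $test_file, $db_file );
print scalar(@$test), " TEST lines, ", scalar(@$db), " DATABASE lines\n";
# prints "2 TEST lines, 2 DATABASE lines"
```

Note that `eof` without parentheses tests the file currently being read, whereas `eof()` with empty parentheses tests for the end of all files named on the command line.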