I have a file like the one shown below.
chr10 299448 299468 SRR048973.1457734 255 + 3
chr10 299448 299468 SRR048973.2114188 255 + 3
chr10 299448 299468 SRR048973.4148128 255 + 3
chr10 299945 299971 SRR048973.566192 255 + 6
chr10 299959 299982 SRR048973.762883 255 + 6
chr10 299968 299985 SRR048973.1595367 255 + 6
chr10 299968 299985 SRR048973.2828877 255 + 6
chr10 299968 299985 SRR048973.3711952 255 + 6
chr10 299968 299985 SRR048973.3821978 255 + 6
chr10 300073 300095 SRR048973.975870 255 + 1
chr10 300109 300134 SRR048973.1500469 255 + 1
chr10 300185 300209 SRR048973.655183 255 + 8
chr10 300185 300209 SRR048973.933425 255 + 8
chr10 300185 300209 SRR048973.963046 255 + 8
chr10 300185 300209 SRR048973.3506573 255 + 8
chr10 300185 300209 SRR048973.3627590 255 + 8
chr10 300186 300209 SRR048973.1133369 255 + 8
chr10 300186 300209 SRR048973.2178421 255 + 8
chr10 300186 300209 SRR048973.4047933 255 + 8
chr10 300401 300426 SRR048973.918503 255 + 2
chr10 300401 300426 SRR048973.2870188 255 + 2
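Look at the last column: whenever it is >= 5, I want to count the rows for as long as the column stays at 5 or above, until it falls back below 5. For the sample file, the output I want is:
chr10 299945 299985 6
chr10 300185 300209 8
299945 comes from column 2, where the first 6 begins, and 299985 comes from column 3, where the last 6 ends. The same applies to 8.
I want to do this in Perl.
I tried writing a Perl script, but I can't figure out how to get the coordinates right.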
#!/usr/bin/perl -w
use strict;
use warnings;
open F, '/user/tmp/output.bed' or die $!;
my $i = 0;
while (<F>) {
    chomp;
    my @s = split;
    if ($s[6] >= 5) {
        $i++;
    } else {
        if ($s[6] < 5) {
            $i = 0;
        }
    }
}
How should I do this?
Thanks in advance.
Regards
Answer 0 (score: 1)
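use strict;
use warnings;
my @last;    # chrom, start, end, value of the run currently open
while (<DATA>) {
    my @cols = split ' ';
    # Scalar-context flip-flop: switches on at the first row >= 5 and
    # off at the first row < 5, or at end of file.
    if (my $range = $cols[-1] >= 5 .. $cols[-1] < 5 || eof) {
        @last = @cols[0..2,-1] if $range == 1;    # first row of the run
        # Extend the end while still inside the run; this also covers a
        # run that is closed by end of file rather than by a low row.
        $last[2] = $cols[2] if $cols[-1] >= 5;
        print "@last\n" if $range =~ /E/;         # last row carries an "E0" suffix
    }
}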
__DATA__
chr10 299448 299468 SRR048973.1457734 255 + 3
chr10 299448 299468 SRR048973.2114188 255 + 3
chr10 299448 299468 SRR048973.4148128 255 + 3
chr10 299945 299971 SRR048973.566192 255 + 6
chr10 299959 299982 SRR048973.762883 255 + 6
chr10 299968 299985 SRR048973.1595367 255 + 6
chr10 299968 299985 SRR048973.2828877 255 + 6
chr10 299968 299985 SRR048973.3711952 255 + 6
chr10 299968 299985 SRR048973.3821978 255 + 6
chr10 300073 300095 SRR048973.975870 255 + 1
chr10 300109 300134 SRR048973.1500469 255 + 1
chr10 300185 300209 SRR048973.655183 255 + 8
chr10 300185 300209 SRR048973.933425 255 + 8
chr10 300185 300209 SRR048973.963046 255 + 8
chr10 300185 300209 SRR048973.3506573 255 + 8
chr10 300185 300209 SRR048973.3627590 255 + 8
chr10 300186 300209 SRR048973.1133369 255 + 8
chr10 300186 300209 SRR048973.2178421 255 + 8
chr10 300186 300209 SRR048973.4047933 255 + 8
chr10 300401 300426 SRR048973.918503 255 + 2
chr10 300401 300426 SRR048973.2870188 255 + 2
Output:
chr10 299945 299985 6
chr10 300185 300209 8
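To make the behaviour of the scalar-context `..` (flip-flop) operator used above easier to see, here is a small sketch that records what the operator returns for each value in a list. The helper name `flipflop_trace` is made up for illustration, and an index check stands in for `eof` because the values come from an array rather than a filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer): returns the flip-flop's
# value for each input, with '-' where the operator is false.
sub flipflop_trace {
    my ($threshold, @values) = @_;
    my @trace;
    for my $i (0 .. $#values) {
        my $v = $values[$i];
        # Scalar assignment forces scalar context, so '..' is a flip-flop here.
        # The last index plays the role of eof and always closes an open range.
        my $state = ($v >= $threshold) .. ($v < $threshold || $i == $#values);
        push @trace, $state eq '' ? '-' : $state;
    }
    return @trace;
}

my @trace = flipflop_trace(5, 3, 6, 6, 1, 8, 8);
print "@trace\n";    # prints: - 1 2 3E0 1 2E0
```

The value 1 marks the first row of a run, which is why the answer tests `$range == 1`, and the `E0` suffix marks the row that closes the run, which is why it tests `$range =~ /E/`.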
Answer 1 (score: 0)
Do you need to count? Your output doesn't seem to include a count...
Using your code example:
#!/usr/bin/perl -w
use strict;
use warnings;
open F, '/user/tmp/output.bed' or die $!;
my $i = 0;
my $wasTheLastGreaterThan5 = 0;
while (<F>) {
    chomp;
    my @s = split;
    if (($s[6] >= 5) && !$wasTheLastGreaterThan5) {
        # Switched from smaller than 5 to 5 or greater, do something here.
        $wasTheLastGreaterThan5 = 1;
    } elsif (($s[6] < 5) && $wasTheLastGreaterThan5) {
        # Switched from 5 or greater to smaller than 5, do something here.
        $wasTheLastGreaterThan5 = 0;
    }
    else {
        # Did not switch, if you need to count, you could do so here.
    }
}
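One way the "do something here" placeholders could be filled in is sketched below. It is only an illustration under a few assumptions: the helper name `merge_runs` is invented, a run is defined purely by the last column being at or above the threshold (so adjacent runs with different values or chromosomes would be merged), and the end coordinate is taken from the last row of the run rather than from the maximum of column 3.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse consecutive rows whose last column is
# >= $threshold into one "chrom start end value" line per run.
sub merge_runs {
    my ($threshold, @lines) = @_;
    my (@out, @run);    # @run holds (chrom, start, end, value) of the open run
    for my $line (@lines) {
        my @s = split ' ', $line;
        next unless @s;                         # skip blank lines
        if ($s[-1] >= $threshold) {
            if (@run) { $run[2] = $s[2] }       # still inside a run: extend its end
            else      { @run = @s[0 .. 2, -1] } # switched from low to high: open a run
        }
        elsif (@run) {                          # switched from high to low: close the run
            push @out, "@run";
            @run = ();
        }
    }
    push @out, "@run" if @run;                  # input ended while a run was open
    return @out;
}

# A shortened copy of the sample data from the question.
my @sample = (
    'chr10 299448 299468 SRR048973.1457734 255 + 3',
    'chr10 299945 299971 SRR048973.566192 255 + 6',
    'chr10 299959 299982 SRR048973.762883 255 + 6',
    'chr10 299968 299985 SRR048973.1595367 255 + 6',
    'chr10 300073 300095 SRR048973.975870 255 + 1',
    'chr10 300185 300209 SRR048973.655183 255 + 8',
    'chr10 300186 300209 SRR048973.1133369 255 + 8',
    'chr10 300401 300426 SRR048973.918503 255 + 2',
);
print "$_\n" for merge_runs(5, @sample);
```

Keeping the open run in an array doubles as the state flag: an empty `@run` means the previous row was below the threshold, so no separate boolean is needed.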