Perl script to extract data

Time: 2014-07-31 06:57:07

Tags: perl

I have a file like the one shown below.

chr10   299448  299468  SRR048973.1457734       255     +       3
chr10   299448  299468  SRR048973.2114188       255     +       3
chr10   299448  299468  SRR048973.4148128       255     +       3
chr10   299945  299971  SRR048973.566192        255     +       6
chr10   299959  299982  SRR048973.762883        255     +       6
chr10   299968  299985  SRR048973.1595367       255     +       6
chr10   299968  299985  SRR048973.2828877       255     +       6
chr10   299968  299985  SRR048973.3711952       255     +       6
chr10   299968  299985  SRR048973.3821978       255     +       6
chr10   300073  300095  SRR048973.975870        255     +       1
chr10   300109  300134  SRR048973.1500469       255     +       1
chr10   300185  300209  SRR048973.655183        255     +       8
chr10   300185  300209  SRR048973.933425        255     +       8
chr10   300185  300209  SRR048973.963046        255     +       8
chr10   300185  300209  SRR048973.3506573       255     +       8
chr10   300185  300209  SRR048973.3627590       255     +       8
chr10   300186  300209  SRR048973.1133369       255     +       8
chr10   300186  300209  SRR048973.2178421       255     +       8
chr10   300186  300209  SRR048973.4047933       255     +       8
chr10   300401  300426  SRR048973.918503        255     +       2
chr10   300401  300426  SRR048973.2870188       255     +       2

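Look at the last column. When it is >= 5, I want to count the consecutive rows for as long as that column stays >= 5, until it drops back to < 5. For the sample file, the output I want is: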
chr10   299945  299985   6
chr10   300185  300209   8

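299945 comes from column 2 of the row where the run of 6s starts, and 299985 comes from column 3 of the row where the run of 6s ends. The same applies to the 8s.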

I want to do this in Perl.

I tried writing a Perl script, but I can't work out how to get the coordinates correctly.

#!/usr/bin/perl -w
use strict;
use warnings;

open(F, '<', '/user/tmp/output.bed') or die $!;

my $i=0;
while(<F>){
        chomp;
        my @s = split;
        if($s[6] >= 5){
                $i++;
        }else{
                if($s[6] < 5){
                $i = 0;
                }
        }

}

How should I do this?

Thanks in advance.

Regards

2 Answers:

Answer 0: (score: 1)

Use the range operator (..), which acts as a flip-flop in scalar context:

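use strict;
use warnings;

my @last;

while (<DATA>) {
    my @cols = split ' ';
    # Flip-flop: switches on at the first row >= 5 and off at the
    # first row < 5 (or at end of input).
    if (my $range = $cols[-1] >= 5 .. $cols[-1] < 5 || eof) {
        # First row of a run: remember chromosome, start, end and value.
        @last = @cols[0..2,-1] if $range == 1;
        # Extend the end coordinate only while the row is still >= 5, so a
        # run that reaches end of input keeps its final row's end.
        $last[2] = $cols[2] if $cols[-1] >= 5;
        # The sequence number carries an "E0" suffix on the row that closes the run.
        print "@last\n" if $range =~ /E/;
    }
}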


__DATA__
chr10   299448  299468  SRR048973.1457734       255     +       3
chr10   299448  299468  SRR048973.2114188       255     +       3
chr10   299448  299468  SRR048973.4148128       255     +       3
chr10   299945  299971  SRR048973.566192        255     +       6
chr10   299959  299982  SRR048973.762883        255     +       6
chr10   299968  299985  SRR048973.1595367       255     +       6
chr10   299968  299985  SRR048973.2828877       255     +       6
chr10   299968  299985  SRR048973.3711952       255     +       6
chr10   299968  299985  SRR048973.3821978       255     +       6
chr10   300073  300095  SRR048973.975870        255     +       1
chr10   300109  300134  SRR048973.1500469       255     +       1
chr10   300185  300209  SRR048973.655183        255     +       8
chr10   300185  300209  SRR048973.933425        255     +       8
chr10   300185  300209  SRR048973.963046        255     +       8
chr10   300185  300209  SRR048973.3506573       255     +       8
chr10   300185  300209  SRR048973.3627590       255     +       8
chr10   300186  300209  SRR048973.1133369       255     +       8
chr10   300186  300209  SRR048973.2178421       255     +       8
chr10   300186  300209  SRR048973.4047933       255     +       8
chr10   300401  300426  SRR048973.918503        255     +       2
chr10   300401  300426  SRR048973.2870188       255     +       2

Output:

chr10 299945 299985 6
chr10 300185 300209 8
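
For readers who have not met the scalar-context range operator before, here is a minimal sketch of the values it returns. The @values list is a made-up stand-in for the last column of the sample file, not part of the original answer. Each true result is a sequence number counted from the start of the run, and the row that ends the run gets an "E0" suffix, which is what the /E/ match above relies on.

```perl
use strict;
use warnings;

# Hypothetical last-column values, shaped like the sample data.
my @values = (3, 6, 6, 1, 8, 2);

my @seen;
for my $v (@values) {
    # In scalar context ".." is a flip-flop, not a list constructor:
    # it turns on when the left test is true and off when the right test is true.
    my $range = ($v >= 5) .. ($v < 5);
    # False is returned as the empty string; show it as "-" for readability.
    push @seen, $range eq '' ? '-' : $range;
}
print "@seen\n";
```

Note that the row which closes a run is itself a row below 5, so its own coordinates must not be copied into the result.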

Answer 1: (score: 0)

Do you need to count? Your output doesn't seem to include a count...

Using your code sample:

#!/usr/bin/perl -w
use strict;
use warnings;

open(F, '<', '/user/tmp/output.bed') or die $!;

my $i=0;
my $wasTheLastGreaterThan5 = 0;
while(<F>){
    chomp;
    my @s = split;
    if(($s[6] >= 5) && !$wasTheLastGreaterThan5){
        # Switched from smaller to greater than 5, do something here.

        $wasTheLastGreaterThan5 = 1;

    }elsif(($s[6] < 5) && $wasTheLastGreaterThan5){
        # switched from greater to smaller, do something here.

        $wasTheLastGreaterThan5 = 0;

    }
    else {
        # Did not switch, if you need to count, you could do so here.
    }
}
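
One way the "do something here" placeholders might be filled in is sketched below. This is an illustration under assumptions rather than the answerer's code: merge_runs is a hypothetical helper, the rows are assumed to be sorted so that the last row of a run carries the end coordinate, and a run is reported with the value of its first row even if later rows differ. The sample is an abbreviated subset of the rows in the question.

```perl
use strict;
use warnings;

# Collapse consecutive rows whose last column is >= $threshold into
# "chrom start end value" records. merge_runs is a hypothetical helper.
sub merge_runs {
    my ($threshold, @lines) = @_;
    my (@out, @run);    # @run holds the currently open run, if any
    for my $line (@lines) {
        my @s = split ' ', $line;
        next unless @s;    # skip blank lines
        if ($s[-1] >= $threshold) {
            if (@run) {
                $run[2] = $s[2];          # still inside a run: extend the end
            } else {
                @run = @s[0 .. 2, -1];    # switched from below to at/above the threshold
            }
        } elsif (@run) {
            push @out, "@run";            # switched back below: close the run
            @run = ();
        }
    }
    push @out, "@run" if @run;            # run still open at end of input
    return @out;
}

# Abbreviated subset of the rows from the question.
my @sample = (
    "chr10 299448 299468 SRR048973.4148128 255 + 3",
    "chr10 299945 299971 SRR048973.566192 255 + 6",
    "chr10 299968 299985 SRR048973.3821978 255 + 6",
    "chr10 300073 300095 SRR048973.975870 255 + 1",
    "chr10 300185 300209 SRR048973.655183 255 + 8",
    "chr10 300401 300426 SRR048973.918503 255 + 2",
);
print "$_\n" for merge_runs(5, @sample);
```

Using the @run array itself as the "inside a run" flag removes the need for a separate boolean, and the final push covers a file that ends while the column is still at or above the threshold.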