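Getting 15 characters from a string that contains fewer than 15 characters in Perl

Asked: 2014-03-15 22:06:08

Tags: perl

I have a sequence and a number giving the position of a residue (character) in it. I want to take 7 residues from each side of that residue. This is the code that does it: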

my $seq = substr($sequence, $location-8, 14);

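This grabs 7 residues from each side of the residue. However, some sequences have fewer than 7 residues on one or both sides. When that happens, I get this error message: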

substr outside of string at test9.pl line 52 (#1) (W substr)(F) You tried to reference a substr() that pointed outside of a string.  That is, the absolute value of the offset was larger than the length of the string.

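How can I fill the empty positions with another letter (for example X)?

For example, given the sequence ABCDEFGH with $location pointing at D, I need 7 on each side, so the result would be: XXXXABCDEFGHXXX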

2 answers:

Answer 0 (score: 2)

Expanding on my comment: I would create a my_substr function that encapsulates the padding and the position shift.

my $sequence = "ABCDEFGH";
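my $location = 3;    # 0-based index of 'D'

sub my_substr {
    my ($seq, $location, $pad_length) = @_;
    my $pad = "X" x $pad_length;
    # Padding shifts every index right by $pad_length, so the window that
    # starts at $location in the padded string is centered on the residue.
    return substr("$pad$seq$pad", $location, 2 * $pad_length + 1);
}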

print my_substr($sequence, $location, 7) . "\n";

Output:

XXXXABCDEFGHXXX
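
The padding works because prepending $pad_length characters moves every original index right by $pad_length, so a window of 2 * $pad_length + 1 characters starting at $location in the padded string has the original residue in its middle. A minimal check of that, assuming $location is a 0-based index that lies inside the sequence (the loop and the middle-character comparison are illustrative additions, not part of the answer):

```perl
use strict;
use warnings;

# Same function as in the answer: pad both ends, then take a fixed-width window.
sub my_substr {
    my ($seq, $location, $pad_length) = @_;
    my $pad = "X" x $pad_length;
    return substr("$pad$seq$pad", $location, 2 * $pad_length + 1);
}

my $sequence   = "ABCDEFGH";
my $pad_length = 7;

for my $location (0 .. length($sequence) - 1) {
    my $window = my_substr($sequence, $location, $pad_length);

    # The middle character of the window should be the residue at $location.
    my $middle = substr($window, $pad_length, 1);
    die "window not centered at $location"
        unless $middle eq substr($sequence, $location, 1);

    print "$location: $window\n";
}
```

Every window printed by the loop is 15 characters long, including those for the first and last residue, where one whole side is padding.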

Answer 1 (score: 1)

This is a very verbose answer, but it more or less gets you what you want:

use strict;
use warnings;

my $sequence = 'ABCDEFGH';
my $wings = 7;

my $location = index $sequence, 'D';
die "D not found" if $location == -1;

my $start = $location - $wings;
my $length = 1 + 2 * $wings;

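my $leftpad = 0;
if ($start < 0) {
    $leftpad = -1 * $start;
    $start = 0;
}
# Request fewer characters when left padding is needed, so the result stays $length long.
my $seq = substr($sequence, $start, $length - $leftpad);
$seq = ('X' x $leftpad) . $seq if $leftpad;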

my $rightpad = $length - length ($seq);
$seq .= 'X' x $rightpad if $rightpad > 0;

print $seq;
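
The clamping steps above can be wrapped in a subroutine. The following is a minimal sketch, not the answerer's code: the name padded_window and the optional $fill argument are hypothetical, $location is assumed to be a 0-based index inside the sequence, and $fill is assumed to be a single character. It asks substr for only $length - $leftpad characters, so the window stays 2 * $wings + 1 long even when the sequence continues well past the window on the right.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a window of 2 * $wings + 1 characters centered
# on the 0-based $location, padding with $fill where the sequence runs out.
sub padded_window {
    my ($sequence, $location, $wings, $fill) = @_;
    $fill = 'X' unless defined $fill;
    die "location out of range"
        if $location < 0 || $location >= length($sequence);

    my $length  = 2 * $wings + 1;
    my $start   = $location - $wings;
    my $leftpad = $start < 0 ? -$start : 0;
    $start = 0 if $start < 0;

    # Ask only for the characters that remain after the left padding.
    my $seq = ($fill x $leftpad)
            . substr($sequence, $start, $length - $leftpad);

    # Whatever is still missing has to come from the right.
    $seq .= $fill x ($length - length($seq));
    return $seq;
}

print padded_window('ABCDEFGH', 3, 7), "\n";                    # XXXXABCDEFGHXXX
print padded_window('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 3, 7), "\n";  # XXXXABCDEFGHIJK
```

Unlike the padded-copy approach, this never builds a second copy of the whole sequence, which may matter for very long inputs.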

Or, to avoid all that extra work, build a padded copy of $sequence and take the substring from that:

my $sequence = 'ABCDEFGH';
my $wings = 7;

my $location = index $sequence, 'D';
die "D not found" if $location == -1;

my $paddedseq = ('X' x $wings) . $sequence . ('X' x $wings);
my $seq = substr($paddedseq, $location, 1 + 2 * $wings);

print $seq;