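I have a sequence and a number giving the position of a residue (a character) in it. I want to take 7 residues from each side of that residue. This is the code that does it: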
my $seq = substr($sequence, $location-8, 14);
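This grabs 7 residues from each side of the residue. However, some sequences have fewer than 7 residues on one or both sides, and when that happens I get this error message: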
substr outside of string at test9.pl line 52 (#1) (W substr)(F) You tried to reference a substr() that pointed outside of a string. That is, the absolute value of the offset was larger than the length of the string.
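How can I fill the empty positions with another letter (for example, X)?
For example, given the sequence ABCDEFGH with $location pointing to D, I need 7 on each side, so the result should be: XXXXABCDEFGHXXX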
Answer 0 (score: 2)
Expanding on my comment: I would create a my_substr function that encapsulates the padding and the position shift.
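my $sequence = "ABCDEFGH";
my $location = 3;  # 0-based index of D
sub my_substr {
    my ($seq, $location, $pad_length) = @_;
    my $pad = "X" x $pad_length;
    # The left padding shifts every index right by $pad_length,
    # so the window around $location starts at $location in the padded string.
    return substr("$pad$seq$pad", $location, 2 * $pad_length + 1);
}
print my_substr($sequence, $location, 7) . "\n";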
Output:
XXXXABCDEFGHXXX
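As a quick check of the edge cases, the sketch below calls the same function at the first, fourth, and last residue. It assumes $location is a 0-based index, as in the example above; the loop and the chosen positions are only for illustration.

```perl
use strict;
use warnings;

# Same approach as above: pad both ends, then take a fixed-width window.
sub my_substr {
    my ($seq, $location, $pad_length) = @_;
    my $pad = "X" x $pad_length;
    return substr("$pad$seq$pad", $location, 2 * $pad_length + 1);
}

my $sequence = "ABCDEFGH";

# Assumed 0-based positions: first residue, D, and last residue.
for my $location (0, 3, 7) {
    print "$location: ", my_substr($sequence, $location, 7), "\n";
}
# 0: XXXXXXXABCDEFGH
# 3: XXXXABCDEFGHXXX
# 7: ABCDEFGHXXXXXXX
```

Every result is 15 characters long, with the residue of interest always in the middle (index 7).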
Answer 1 (score: 1)
This is a rather verbose answer, but it gets more or less what you want:
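use strict;
use warnings;
my $sequence = 'ABCDEFGH';
my $wings = 7;
my $location = index $sequence, 'D';
die "D not found" if $location == -1;
my $start = $location - $wings;
my $length = 1 + 2 * $wings;
my $leftpad = 0;
# If the window would start before the string, clip it and remember the shortfall.
if ($start < 0) {
    $leftpad = -1 * $start;
    $start = 0;
}
# Read fewer characters when the left side is padded, so that the
# result never exceeds $length for sequences longer than the window.
my $seq = substr($sequence, $start, $length - $leftpad);
$seq = ('X' x $leftpad) . $seq if $leftpad;
my $rightpad = $length - length($seq);
$seq .= 'X' x $rightpad if $rightpad > 0;
print "$seq\n";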
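To see how the clipping behaves on a sequence longer than the window, here is a minimal sketch that wraps the same steps in a function. The name clipped_window and the 26-letter test sequence are made up for illustration; note that it reads $length - $leftpad characters, so a window clipped on the left still comes out exactly $length characters long.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the step-by-step approach above.
# $location is assumed to be a 0-based index inside $sequence.
sub clipped_window {
    my ($sequence, $location, $wings) = @_;
    my $start   = $location - $wings;
    my $length  = 1 + 2 * $wings;
    my $leftpad = 0;
    if ($start < 0) {
        $leftpad = -$start;   # residues missing on the left
        $start   = 0;
    }
    # Read only what is still needed after the left padding.
    my $seq = ('X' x $leftpad) . substr($sequence, $start, $length - $leftpad);
    my $rightpad = $length - length($seq);
    $seq .= 'X' x $rightpad if $rightpad > 0;
    return $seq;
}

my $alphabet = join '', 'A' .. 'Z';
for my $letter (qw(D M X)) {
    my $location = index $alphabet, $letter;
    print "$letter: ", clipped_window($alphabet, $location, 7), "\n";
}
# D: XXXXABCDEFGHIJK
# M: FGHIJKLMNOPQRST
# X: QRSTUVWXYZXXXXX
```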
Or, to avoid all that extra work, you can create a new variable that holds $sequence with the padding already added:
my $sequence = 'ABCDEFGH';
my $wings = 7;
my $location = index $sequence, 'D';
die "D not found" if $location == -1;
my $paddedseq = ('X' x $wings) . $sequence . ('X' x $wings);
my $seq = substr($paddedseq, $location, 1 + 2 * $wings);
print $seq;
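Both answers treat $location as a 0-based index. The `$location-8` in the question might mean the position there is 1-based, although the question does not say so. If that is the case, a sketch like the following subtracts one before taking the window; window_around is a hypothetical name, and the range check is an added assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: $pos is ASSUMED to be 1-based (not stated in the question).
sub window_around {
    my ($sequence, $pos, $wings) = @_;
    die "position out of range\n"
        if $pos < 1 || $pos > length($sequence);
    my $padded = ('X' x $wings) . $sequence . ('X' x $wings);
    # 1-based $pos is 0-based index $pos - 1. The left padding moves the
    # residue to index $pos - 1 + $wings, so the window starts at $pos - 1.
    return substr($padded, $pos - 1, 1 + 2 * $wings);
}

print window_around('ABCDEFGH', 4, 7), "\n";   # D is the 4th residue
# XXXXABCDEFGHXXX
```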