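I am trying to print only a predefined sequence of atom names, but I am not getting the expected output. I want to print the input file in the expected-output format shown below. The chain ID can be anything from A to H.
Code: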
my $OutputDir = 'C:\test_result_file';
open my $dir, "Document1.txt" or die "Failed to open Document1.txt:$!";
chomp(my @files = <$dir>);
foreach my $file (@files) {
    my $win_len = 4;
    my @window = ();
    my $prev_chain = "";
    open my $input, $file or die "failed to open $file: $!\n";
    open my $output, '>', "$OutputDir/$file" or die "failed to open $OutputDir/$file.pdb: $!\n";
    while (<$input>) {
        my ($atom_name, $chain) = (split)[2, 4];
        next unless $atom_name =~ /\b(?:C4B|O4B|C1B|C2B|O4B|C1B|C2B|C3B|C1B|C2B|C3B|C4B|C2B|C3B|C4B|O4B|C3B|C4B|O4B|C1B)\b/;
        if ($chain eq $prev_chain) {
            if (@window == $win_len) {
                print_window($output, @window);
                shift @window;
            }
            push @window, $_;
        } else {
            print_window($output, @window) if @window;
            @window = ($_);
            $prev_chain = $chain;
        }
    }
    print_window($output, @window) if @window;
}
sub print_window {
    my $fh = shift;
    print $fh $_ foreach @_;
    print $fh "\n";
}
Input file:
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
Expected output:
Chain A:
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Chain B:
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C
HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C
HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O
HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C
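Description: I want to order the HETATM records by predefined atom names (e.g., C4B, O4B, C1B, C2B, etc.). So far I have the script above. Could anyone please help me solve this? My current script produces output in the same format, but not the expected result.
I do not want separate files for chain A, chain B, or any other chain ID. I want to order the atom names according to my (predefined) sequence.
My sequence is: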
C4B-O4B-C1B-C2B
O4B-C1B-C2B-C3B
C1B-C2B-C3B-C4B
C2B-C3B-C4B-O4B
C3B-C4B-O4B-C1B
For example, first row: C4B
HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C
Second row: O4B
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
Third Row: C1B
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Fourth Row: C2B
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
Fifth Row: O4B
HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O
Sixth Row: C1B
HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C
Seventh Row: C2B
HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C
Eighth Row: C3B
HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C
.
.
.
and so on
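The same format applies to chain B and the other chains.
This means I need each line multiple times. All of the atom names above should be taken from the input file, chain by chain. We need to copy the lines for all of the atom names above and then paste them in the order given above.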
Answer 0 (score: 0)
It seems to me that your error comes from this line:
my ($atom_name, $chain) = (split)[2, 4];
This puts the 3rd column in $atom_name and the 5th column in $chain.
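A quick way to check this is to split the first line of the input file and look at the fields. The sketch below is only illustrative: the helper name `fields_at` is made up, and the line is copied from the question as shown there (single-space separated).

```perl
use strict;
use warnings;

# First line of the input file, as shown in the question.
my $line = 'HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C';

# Hypothetical helper: split on whitespace and return the requested fields.
sub fields_at {
    my ($text, @indices) = @_;
    return (split ' ', $text)[@indices];
}

# Indices used in the question's script.
my ($name_24, $chain_24) = fields_at($line, 2, 4);
# The same indices shifted down by one.
my ($name_13, $chain_13) = fields_at($line, 1, 3);

print "[2, 4]: atom_name=$name_24 chain=$chain_24\n";
print "[1, 3]: atom_name=$name_13 chain=$chain_13\n";
```

With [2, 4] the script sees the residue name (NAD) and the residue number (363), so the atom-name regex never matches and nothing is written. Note that HETATM and the serial number run together in this line (HETATM10910). If other lines in the real file have a space between them, the fields would shift by one, so whitespace splitting may be fragile; that is an assumption about the real file, not something shown in the question.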
I think you want:
my ($atom_name, $chain) = (split)[1, 3];
For the first line you will then get:
$atom_name = C4B
and $chain = A
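Fixing the indices alone will probably not produce the expected output, because the sliding window in the question emits lines in file order, while the expected output follows the predefined order C4B, O4B, C1B, C2B, C3B. One possible approach, as a rough sketch: store each line by chain and atom name, then emit every four-atom window in the predefined order. The names `@ring`, `windows_for_chain` and `%lines_for` are illustrative and not from the original script, the sample lines are inlined instead of read from a file, and the five sequences in the question are assumed to be successive rotations of that five-atom ring.

```perl
use strict;
use warnings;

# Predefined ring order; each window is four consecutive atoms, wrapping around.
my @ring    = qw(C4B O4B C1B C2B C3B);
my $win_len = 4;

# Return one chain's lines arranged as rotated windows.
# $by_name maps atom name => full input line for a single chain.
sub windows_for_chain {
    my ($by_name) = @_;
    my @out;
    for my $start (0 .. $#ring) {
        my @names = map { $ring[ ($start + $_) % scalar(@ring) ] } 0 .. $win_len - 1;
        # Skip a window if the chain lacks any of its atoms.
        next if grep { !exists $by_name->{$_} } @names;
        push @out, map { $by_name->{$_} } @names;
    }
    return @out;
}

# Sample data: the input lines from the question.
my @lines = (
    'HETATM10910 C4B NAD A 363 60.856 -58.575 149.282 1.00 40.44 C',
    'HETATM10911 O4B NAD A 363 61.320 -59.488 148.275 1.00 43.48 O',
    'HETATM10912 C3B NAD A 363 60.243 -57.426 148.473 1.00 40.37 C',
    'HETATM10914 C2B NAD A 363 60.167 -57.970 147.054 1.00 40.90 C',
    'HETATM10916 C1B NAD A 363 61.394 -58.766 147.056 1.00 43.29 C',
    'HETATM10954 C4B NAD B 363 41.496 -54.407 140.932 1.00 39.26 C',
    'HETATM10955 O4B NAD B 363 41.936 -54.715 139.568 1.00 41.96 O',
    'HETATM10956 C3B NAD B 363 42.061 -55.476 141.894 1.00 37.13 C',
    'HETATM10958 C2B NAD B 363 42.883 -56.336 140.942 1.00 38.13 C',
    'HETATM10960 C1B NAD B 363 42.233 -56.127 139.593 1.00 42.92 C',
);

# Index lines by chain and atom name, remembering the order chains appear in.
# Assumption: one residue per chain; a second residue would overwrite the first.
my (%lines_for, @chain_order);
for my $line (@lines) {
    my ($atom_name, $chain) = (split ' ', $line)[1, 3];
    next unless defined $chain;
    next unless grep { $_ eq $atom_name } @ring;
    push @chain_order, $chain unless $lines_for{$chain};
    $lines_for{$chain}{$atom_name} = $line;
}

my @result = map { windows_for_chain($lines_for{$_}) } @chain_order;
print "$_\n" for @result;
```

For the sample lines this prints 20 lines per chain (five windows of four atoms), chain A first and then chain B, in the same order as the expected output in the question. To write to a file per input, as the original script does, the final print could go to the `$output` filehandle instead.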