My file has the following format:
ATOM 3736 CB THR A 486 -6.552 153.891 -7.922 1.00115.15 C
ATOM 3737 OG1 THR A 486 -6.756 154.842 -6.866 1.00114.94 O
ATOM 3738 CG2 THR A 486 -7.867 153.727 -8.636 1.00115.11 C
ATOM 3739 OXT THR A 486 -4.978 151.257 -9.140 1.00115.13 O
HETATM10351 C1 NAG B 203 33.671 87.279 39.456 0.50 90.22 C
HETATM10483 C1 NAG Z 702 28.025 104.269 -27.569 0.50 92.75 C
ATOM 3736 CB THR X 486 -6.552 86.240 7.922 1.00115.15 C
ATOM 3737 OG1 THR X 486 -6.756 85.289 6.866 1.00114.94 O
ATOM 3738 CG2 THR X 486 -7.867 86.404 8.636 1.00115.11 C
ATOM 3739 OXT THR X 486 -4.978 88.874 9.140 1.00115.13 O
HETATM10351 C1 NAG Y 203 33.671 152.852 -39.456 0.50 90.22 C
HETATM10639 C2 FUC C 402 -48.168 162.221 -22.404 0.50103.03 C
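For each block of lines starting with HETATM*, I want to change column 5 to match the lines of the preceding ATOM block. That means that in the first HETATM* block, B and Z both become A, and in the second HETATM* block, Y and C both become X.
A second question, which I don't really need answered and ask only out of curiosity: how can I split the file after each line starting with HETATM*, but only if the next line starts with ATOM?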
Answer 0: (score: 2)
Try this:
awk '{
    if ($1 == "ATOM") {
        col5 = $5          # remember the chain ID of the current ATOM block
    }
    else if ($1 ~ /^HETATM/) {
        $5 = col5          # overwrite field 5 with the remembered chain ID
    }
    print
}' < infile
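One caveat with the sample as shown: on lines such as `HETATM10351 C1 NAG B 203 ...` the serial number is fused to the record name, so whitespace splitting yields one field fewer and the chain ID lands in `$4`, not `$5`. The following is a minimal sketch that picks the field index per line; the function name `fix_chain_fields` is hypothetical, and it assumes the data really is whitespace-separated as displayed.

```sh
# Hypothetical helper: reads records on stdin, writes corrected records to stdout.
fix_chain_fields() {
  awk '
    # Remember the chain ID of the current ATOM block.
    $1 == "ATOM" { chain = $5 }

    /^HETATM/ {
      # If the serial number is fused into $1 (e.g. HETATM10351),
      # the chain ID is field 4; otherwise it is field 5.
      i = ($1 == "HETATM") ? 5 : 4
      $i = chain
    }

    # Assigning to a field rebuilds the line with single spaces.
    { print }
  '
}
```

Usage would be something like `fix_chain_fields < infile > outfile`. Because a field is assigned, modified lines come out with single spaces between fields.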
Answer 1: (score: 2)
awk '$1=="ATOM"{c=$5}/^HETATM/{ $5=c };1' file
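The field separator can also be given explicitly. Note that `-F" "` is awk's default, and that assigning to `$5` still rebuilds the line with single spaces, so the original spacing is not preserved: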
awk -F" " '/^ATOM/{c=$5}/^HETATM/{ $5=c };1' file
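If the real file follows the standard fixed-width PDB layout, where the chain identifier occupies column 22, the line can be patched by character position instead of by field, which leaves all spacing untouched. This is only a sketch under that assumption (the sample in the question is shown with collapsed whitespace, so the layout cannot be confirmed from it); `fix_chain_fixed_width` is a hypothetical name.

```sh
# Hypothetical helper: assumes fixed-width PDB records with the chain ID in column 22.
fix_chain_fixed_width() {
  awk '
    # Remember the chain ID (column 22) of the current ATOM block.
    /^ATOM/ { chain = substr($0, 22, 1) }

    # Replace only column 22 on HETATM lines; everything else is copied verbatim.
    /^HETATM/ && chain != "" {
      $0 = substr($0, 1, 21) chain substr($0, 23)
    }

    { print }
  '
}
```

Since no individual field is assigned, awk never rebuilds the record from its fields, and the column alignment survives.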
Answer 2: (score: 1)
Here is my solution. It solves the first problem (replacing the fifth field) while preserving the whitespace:
$1=="ATOM" {
    fifthField = $5
    # Determine the character position at which field #5 starts
    fifthField_index = 1
    for (i = 0; i < 4; i++) {
        # Skip until white space (stop at end of line to avoid looping forever)
        for (; fifthField_index <= length($0) && substr($0, fifthField_index, 1) != " "; fifthField_index++) { }
        # Skip white spaces
        for (; substr($0, fifthField_index, 1) == " "; fifthField_index++) { }
    }
    print; next
}
/^HETATM/ {
    # Splice the saved chain ID into the same character position
    before_fifthField = substr($0, 1, fifthField_index - 1)
    after_fifthField = substr($0, fifthField_index + 1, length($0))
    print before_fifthField fifthField after_fifthField
    next
}
1
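This is not the most elegant solution, but it works. It assumes that the fifth field is a single character and that it starts at the same character position on the HETATM lines as on the preceding ATOM line.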
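For the second question (splitting the file after a HETATM line, but only when the next line starts with ATOM), one possible sketch is to remember the previous line and switch to a new output file at each HETATM-to-ATOM transition. The function name `split_after_hetatm`, the prefix argument, and the `.pdb` suffix are all assumptions chosen for illustration.

```sh
# Hypothetical helper: reads records on stdin and writes them to
# <prefix>1.pdb, <prefix>2.pdb, ... starting a new file whenever
# an ATOM line directly follows a HETATM line.
split_after_hetatm() {
  awk -v prefix="$1" '
    BEGIN { n = 1; out = prefix n ".pdb" }

    # Previous line was HETATM and current line is ATOM: start the next chunk.
    prev ~ /^HETATM/ && /^ATOM/ {
      close(out)
      n++
      out = prefix n ".pdb"
    }

    # Write the current line to the active chunk and remember it.
    { print > out; prev = $0 }
  '
}
```

For example, `split_after_hetatm chunk < infile` would write `chunk1.pdb`, `chunk2.pdb`, and so on in the current directory. Consecutive HETATM lines stay in the same chunk, because the split only triggers when an ATOM line follows.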