Changing a column based on a condition from a preceding line

Asked: 2011-08-26 16:39:07

Tags: bash awk

My file has the following format:

ATOM   3736  CB  THR A 486      -6.552 153.891  -7.922  1.00115.15           C  
ATOM   3737  OG1 THR A 486      -6.756 154.842  -6.866  1.00114.94           O  
ATOM   3738  CG2 THR A 486      -7.867 153.727  -8.636  1.00115.11           C  
ATOM   3739  OXT THR A 486      -4.978 151.257  -9.140  1.00115.13           O  
HETATM10351  C1  NAG B 203      33.671  87.279  39.456  0.50 90.22           C  
HETATM10483  C1  NAG Z 702      28.025 104.269 -27.569  0.50 92.75           C    
ATOM   3736  CB  THR X 486      -6.552  86.240   7.922  1.00115.15           C  
ATOM   3737  OG1 THR X 486      -6.756  85.289   6.866  1.00114.94           O  
ATOM   3738  CG2 THR X 486      -7.867  86.404   8.636  1.00115.11           C  
ATOM   3739  OXT THR X 486      -4.978  88.874   9.140  1.00115.13           O  
HETATM10351  C1  NAG Y 203      33.671 152.852 -39.456  0.50 90.22           C  
HETATM10639  C2  FUC C 402     -48.168 162.221 -22.404  0.50103.03           C  

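For each block of lines starting with HETATM*, I want to change column 5 so that it matches the lines of the preceding ATOM block. That is, in the first HETATM* block both B and Z should become A, and in the second HETATM* block both Y and C should become X.

A second question, which I don't really need answered and ask only out of curiosity: how can I split the file after each line that starts with HETATM*, but only if the next line is ATOM?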

3 answers:

Answer 0 (score: 2)

Try this:

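awk '{
  if ($1 == "ATOM") {
    col5 = $5                      # remember the chain letter of the ATOM block
  }
  else if ($1 ~ /^HETATM/) {
    # In the sample the serial number is fused to the keyword (HETATM10351),
    # which moves the chain letter from field 5 to field 4.
    n = ($1 == "HETATM") ? 5 : 4
    $n = col5                      # assigning a field rebuilds the line with single spaces
  }
  print
}' < infile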

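Answer 1 (score: 2)

awk '$1=="ATOM"{c=$5} /^HETATM/{ $($1=="HETATM"?5:4)=c } 1' file

When the serial number is fused to the keyword, as in HETATM10351 in the sample, the chain letter is field 4 rather than field 5; the index expression covers both layouts.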

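Passing the field separator explicitly gives the same result. Note that it does not preserve the spacing: assigning to a field makes awk rebuild the line with single spaces between the fields.

awk -F" " '/^ATOM/{c=$5} /^HETATM/{ $($1=="HETATM"?5:4)=c } 1' file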
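
Because assigning to a field makes awk rebuild the line with single spaces, one way to keep the column alignment is to edit the line by character position instead. The following is a minimal sketch, not part of the original answer. It assumes the layout seen in the sample, where the chain letter sits in column 22 on both ATOM and HETATM lines. The name `fix_chain` is a hypothetical helper chosen for illustration.

```shell
# Sketch: copy the chain letter (assumed to be column 22, as in the sample)
# from the most recent ATOM line onto each following HETATM line.
# All other characters of the line are left untouched.
fix_chain() {
  awk '
    # Remember the chain letter of the current ATOM block.
    /^ATOM/ { chain = substr($0, 22, 1) }

    # Overwrite column 22 on HETATM lines once a chain letter is known.
    /^HETATM/ && chain != "" {
      $0 = substr($0, 1, 21) chain substr($0, 23)
    }

    { print }
  ' "$@"
}
```

It can be used as `fix_chain infile > outfile`. HETATM lines that appear before any ATOM line are printed unchanged, since there is no chain letter to copy yet.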

Answer 2 (score: 1)

Here is my solution. It solves the first problem (replacing the fifth field) while preserving the whitespace:

$1=="ATOM" {
    fifthField=$5

    # Block to determine which index position field #5 is
    fifthField_index = 1
    for (i = 0; i < 4; i++) {
        # Skip until white space
        for (; substr($0, fifthField_index, 1) != " "; fifthField_index++) { }
        # Skip white spaces
        for (; substr($0, fifthField_index, 1) == " "; fifthField_index++) { }
    }

    print;next
}

/^HETATM/ {
    before_fifthField = substr($0, 1, fifthField_index - 1)
    after_fifthField = substr($0, fifthField_index + 1, length($0))
    print before_fifthField fifthField after_fifthField
    next
}

1

This is not the most elegant solution, but it works. It assumes that the fifth field is a single character.
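
The second question (splitting the file after a HETATM line, but only when the next line is ATOM) can be approached by remembering whether the previous line was a HETATM line. The following is a rough sketch under stated assumptions, not something given in the thread: `split_blocks` is a hypothetical helper name, and the output file names (`<prefix>1.txt`, `<prefix>2.txt`, and so on) are an arbitrary choice.

```shell
# Sketch: start a new output file whenever an ATOM line directly follows
# a HETATM line. $1 is the input file, $2 is a prefix for the output files
# (both are hypothetical parameters chosen for this example).
split_blocks() {
  awk -v prefix="$2" '
    BEGIN { n = 1; out = prefix n ".txt" }

    # An ATOM line right after a HETATM line begins the next chunk.
    /^ATOM/ && prev_hetatm {
      close(out)
      n++
      out = prefix n ".txt"
    }

    {
      print > out
      prev_hetatm = ($0 ~ /^HETATM/)   # remember the type of this line
    }
  ' "$1"
}
```

For example, `split_blocks infile part_` would write `part_1.txt`, `part_2.txt`, and so on. With the sample from the question this should give two files, each holding one ATOM block followed by its HETATM lines.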