Searching for specific columns on specific lines in awk

Time: 2016-05-08 21:17:36

Tags: bash awk

I have a file named c_FROM_V_273_008245_50_neighbours_SYMREMO.out that looks like this:

NEIGHBORS OF THE NON-EQUIVALENT ATOMS

N = NUMBER OF NEIGHBORS AT DISTANCE R
ATOM  N     R/ANG      R/AU   NEIGHBORS (ATOM LABELS AND CELL INDICES)
1 CA   1     2.4055     4.5458    7 O    0 0 0
1 CA   1     2.4058     4.5463   10 O    0-1 0
1 CA   1     2.4356     4.6026   14 O    0 0 0
.
.
.

If I look up the R/ANG distance between 1 CA and 7 O, it is 2.4055.

I wrote this script, search_for_distance.awk:

 {if ($0 ~ "NEIGHBORS OF THE NON-EQUIVALENT ATOMS") {FLAG=1}};
 # If the current line of the file contains that string, we set FLAG=1

    {if (FLAG==1)
            {if ($0 ~ "^   1 CA"){LINE=$0;
            exit}
            }
    };
    # Here I am searching for "1 CA" on each line

 END{VOL=FILENAME;
 # The filename is: "c_FROM_V_273_008245_50_neighbours_SYMREMO.out"
 # My intention is to end up with a new file with 2 columns:
 # "volume" and "distance". 
 # Notice that the filename contains the volume: 273.008245

 gsub("^.*_V_","",VOL);
 gsub("_",".",VOL);
 gsub(".50.neighbours.SYMREMO.out"," ",VOL);
 # Some substitutions to make "c_FROM_V_273_008245_50_neighbours_SYMREMO.out" 
 # to be "273.008245"

 # Up to now the output of running: 
 # awk -f search_for_distance.awk c_FROM_V_273_008245_50_neighbours_SYMREMO.out
 # is the following:

 # 273.008245     1 CA   1     2.4055     4.5458    7 O    0 0 0

 # So, I need to take LINE and only extract column "4".
 # This is done by a "split" command:

 {split(LINE,array," ")}   

 print VOL,array[4]}

The output of running awk -f search_for_distance.awk c_FROM_V_273_008245_50_neighbours_SYMREMO.out is:

 273.008245  2.4055

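Note that the script prints the first occurrence of 1 CA, which happens to be 1 CA 7 O, and that is exactly what I want.

But now I need to run this first-occurrence search for many distances...

I want to find the first occurrence of the 1 CA 14 O distance. I only need to modify the first part of the code, where I search from the start of the line for 1 CA: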

 {if ($0 ~ "NEIGHBORS OF THE NON-EQUIVALENT ATOMS") {FLAG=1}};
 # If the current line contains that string, we set FLAG=1

    {if (FLAG==1)
            {if ($0 ~ "^   1 CA"){LINE=$0;
            exit}
            }
    };

How can I add a condition so that the search matches 1 CA 14 O?

Something like this:
    {if (FLAG==1)
            {if ($0 ~ "/1 CA   && /14 O"){LINE=$0;
            exit}
            }
    };

Thank you very much for your help.

1 Answer:

Answer 0: (score: 2)

  

"I want to look up the R/ANG distance for 1 CA 7 O, which in this case is 2.4055"

$ awk '$1==1 && $2=="CA" && $6==7 && $7=="O" {print $4}' file
2.4055

To find the R/ANG value for 1 CA 14 O:

$ awk '$1==1 && $2=="CA" && $6==14 && $7=="O" {print $4}' file
2.4356
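
Since the question asks to repeat this lookup for many atom pairs, the two commands above can be folded into one parameterized call. The following is a minimal sketch, not part of the original answer: the helper name find_distance and the sample file name neighbours_sample.out are hypothetical, and the sample data is copied from the question. Values are passed in with awk's standard -v option, and exit stops at the first matching line.

```shell
# Sample data copied from the question, saved under a hypothetical name.
cat > neighbours_sample.out <<'EOF'
NEIGHBORS OF THE NON-EQUIVALENT ATOMS

N = NUMBER OF NEIGHBORS AT DISTANCE R
ATOM  N     R/ANG      R/AU   NEIGHBORS (ATOM LABELS AND CELL INDICES)
1 CA   1     2.4055     4.5458    7 O    0 0 0
1 CA   1     2.4058     4.5463   10 O    0-1 0
1 CA   1     2.4356     4.6026   14 O    0 0 0
EOF

# find_distance (hypothetical helper):
#   $1 atom index, $2 atom label, $3 neighbour index, $4 neighbour label, $5 file
find_distance() {
    awk -v a="$1" -v al="$2" -v n="$3" -v nl="$4" '
        # Print the R/ANG column of the first matching line, then stop.
        $1 == a && $2 == al && $6 == n && $7 == nl { print $4; exit }
    ' "$5"
}

d7=$(find_distance 1 CA 7 O neighbours_sample.out)
d14=$(find_distance 1 CA 14 O neighbours_sample.out)
echo "$d7 $d14"
```

Comparing whole fields avoids having to count leading spaces, which the "^   1 CA" regex in the question depends on.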

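How it works

  • $1==1 && $2=="CA" && $6==7 && $7=="O"

    This selects the lines for which all four of the stated conditions are true.

  • print $4

    For the selected lines, this prints the fourth field.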
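
The question also wanted a two-column "volume distance" line, with the volume taken from the file name. The field test can be combined with awk's built-in FILENAME variable for that. This is a sketch under stated assumptions rather than the answerer's method: it assumes every file name follows the pattern ..._V_<integer>_<fraction>_... seen in the question, and it recreates the sample file locally so the example runs on its own.

```shell
# Recreate the question's file (name and contents taken from the question).
f=c_FROM_V_273_008245_50_neighbours_SYMREMO.out
cat > "$f" <<'EOF'
NEIGHBORS OF THE NON-EQUIVALENT ATOMS

N = NUMBER OF NEIGHBORS AT DISTANCE R
ATOM  N     R/ANG      R/AU   NEIGHBORS (ATOM LABELS AND CELL INDICES)
1 CA   1     2.4055     4.5458    7 O    0 0 0
1 CA   1     2.4058     4.5463   10 O    0-1 0
1 CA   1     2.4356     4.6026   14 O    0 0 0
EOF

result=$(awk '
    # Only start matching once the section header has been seen.
    /NEIGHBORS OF THE NON-EQUIVALENT ATOMS/ { in_section = 1 }

    in_section && $1 == 1 && $2 == "CA" && $6 == 14 && $7 == "O" {
        vol = FILENAME
        sub(/^.*_V_/, "", vol)         # drop everything up to and including "_V_"
        split(vol, part, "_")          # part[1] = "273", part[2] = "008245"
        print part[1] "." part[2], $4  # volume, then the R/ANG distance
        exit                           # first occurrence only
    }
' "$f")
echo "$result"
```

Building the volume from the first two underscore-separated pieces avoids the chain of gsub calls in the question, where the unescaped dots act as regex wildcards and the final replacement leaves a trailing space.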