How to use awk to select text from a file, starting at a line number and stopping at a string

Time: 2018-09-23 15:11:12

Tags: bash awk sed tail

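I have this file, which I want to read starting from a certain line number until a given string. I have already used

awk "NR>=$LINE && NR<=$((LINE + 121)) {print}" db_000022_model1.dlg

to read from a specific line up to a fixed line offset, but now I need it to stop at a certain string instead, so that I can use it on other files.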

DOCKED: ENDBRANCH   7  22
DOCKED: TORSDOF 3
DOCKED: TER
DOCKED: ENDMDL

I want it to stop once it reaches

DOCKED: ENDMDL

#!/bin/bash

# This script is for extracting the pdb files from a sorted list of scored
# ligands

mkdir top_poses

for d in $(head -20 summary_2.0.sort | cut -d, -f1 | cut -d/ -f1)
    do
    cd "$d"||continue
    # find the cluster with the highest population within the dlg
    RUN=$(grep '###*' "$d.dlg" | sort -k10 -r | head -1 | cut -d\| -f3 | sed 's/ //g')
    LINE=$(grep -ni "BEGINNING GENETIC ALGORITHM DOCKING $RUN of 100" "$d.dlg" | cut -d: -f1)
    echo "$LINE"
    # extract the best pose and correct the format
    awk -v line="$((LINE + 14))" "NR>=line; /DOCKED: ENDMDL/{exit}" "$d.dlg" | sed 's/^........//' > "$d.pdbqt"

    # convert the pdbqt file into pdb
    #obabel -ipdbqt $d.pdbqt -opdb -O../top_poses/$d.pdb
    cd ..
    done 

When I try

awk -v line="$((LINE + 14))" "NR>=line; /DOCKED: ENDMDL/{exit}" "$d.dlg" | sed 's/^........//' > "$d.pdbqt"

directly in the shell terminal, it works. But in the script it outputs an empty file.

1 answer:

Answer 0 (score: 1)

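It depends on how you need to handle a DOCKED: ENDMDL that occurs before your target line. To stop at the first DOCKED: ENDMDL anywhere in the file: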

awk -v line="$LINE" 'NR>=line; /DOCKED: ENDMDL/{exit}' db_000022_model1.dlg

Or, to test for DOCKED: ENDMDL only once the target line has been reached:

awk -v line="$LINE" 'NR>=line{print; if (/DOCKED: ENDMDL/) exit}' db_000022_model1.dlg
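
The difference between the two commands can be checked on a small made-up input. This is only a sketch: the sample file contents and the value LINE=4 are hypothetical and are not taken from a real .dlg file. It assumes the file contains an earlier DOCKED: ENDMDL before the target line, which is probably the situation that produces the empty output described in the question.

```shell
# Hypothetical sample: two docking runs, each closed by "DOCKED: ENDMDL".
sample=$(mktemp)
printf '%s\n' \
  'DOCKED: MODEL 1' \
  'DOCKED: ATOM a' \
  'DOCKED: ENDMDL' \
  'DOCKED: MODEL 2' \
  'DOCKED: ATOM b' \
  'DOCKED: ENDMDL' > "$sample"

LINE=4   # assumed start line: the beginning of the second run

# Variant 1: the exit rule is tested on every line, so awk quits at line 3,
# before NR reaches 4, and nothing is printed.
first=$(awk -v line="$LINE" 'NR>=line; /DOCKED: ENDMDL/{exit}' "$sample")

# Variant 2: the end marker is only tested once NR>=line,
# so lines 4 to 6 are printed and awk exits after line 6.
second=$(awk -v line="$LINE" 'NR>=line{print; if (/DOCKED: ENDMDL/) exit}' "$sample")

printf 'variant 1: [%s]\n' "$first"
printf 'variant 2:\n%s\n' "$second"
rm -f "$sample"
```

With this input the first variant prints nothing and the second prints the second run only, so the second form is the one to use when earlier runs in the same file also end with DOCKED: ENDMDL.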