Bash / Awk / Sed: capture the third-last and fourth-last line information in each file

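Date: 2014-05-26 05:43:15

Tags: bash awk sed

We are trying to use awk/bash/sed to get the third-last and fourth-last "DV/DL" values from several files (input1.dat, input2.dat, input3.dat), and to print them cumulatively: first for "input1.dat", then for "input1.dat input2.dat", then for "input1.dat input2.dat input3.dat", and so on.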

For "input1.dat" the output would be

"the fourth last DV/DL in input1.dat"  "the third last DV/DL in input1.dat"

For "input1.dat input2.dat" the output would be

"the fourth last DV/DL in input1.dat"  "the third last DV/DL in input1.dat"
"the fourth last DV/DL in input2.dat"  "the third last DV/DL in input2.dat"

A sample file (input1.dat) is shown below. For input1.dat we want to print the "DV/DL" values "-26.2720 2.6879".

NSTEP =    50000   TIME(PS) =     450.000  TEMP(K) =   299.02  PRESS =   213.4
Etot   =    -80270.1079  EKtot   =     33984.2399  EPtot      =   -114254.3478
BOND   =     22963.8665  ANGLE   =      2030.3803  DIHED      =      4232.0101
1-4 NB =       953.1576  1-4 EEL =     12610.8610  VDWAALS    =     20829.7086
EELEC  =   -177874.3319  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =       -26.2168
EKCMT  =     10196.0704  VIRIAL  =      8501.7871  VOLUME     =    367647.0872
                                                Density    =         1.0555
Ewald error estimate:   0.3945E-04
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  A V E R A G E S   O V E R   50000 S T E P S


NSTEP =    50000   TIME(PS) =     450.000  TEMP(K) =   299.79  PRESS =     2.0
Etot   =    -80333.0001  EKtot   =     34071.3290  EPtot      =   -114404.3291
BOND   =     22903.8667  ANGLE   =      2081.5119  DIHED      =      4220.2140
1-4 NB =       932.2502  1-4 EEL =     12548.7169  VDWAALS    =     20862.9158
EELEC  =   -177953.8046  EHBOND  =         0.0000  RESTRAINT  =         0.0000
**DV/DL  =       -26.2720**
EKCMT  =     10189.4552  VIRIAL  =     10173.6800  VOLUME     =    367385.1338
                                                Density    =         1.0562
Ewald error estimate:   0.4704E-04
------------------------------------------------------------------------------


R M S  F L U C T U A T I O N S


NSTEP =    50000   TIME(PS) =     450.000  TEMP(K) =     1.28  PRESS =   145.5
Etot   =       231.2453  EKtot   =       145.6638  EPtot      =       184.9663
BOND   =       151.1551  ANGLE   =        35.0649  DIHED      =        19.6523
1-4 NB =        12.6972  1-4 EEL =        33.8995  VDWAALS    =       174.9784
EELEC  =       305.7204  EHBOND  =         0.0000  RESTRAINT  =         0.0000
**DV/DL  =         2.6879**
EKCMT  =        78.6868  VIRIAL  =      1152.6836  VOLUME     =       406.3321
                                                Density    =         0.0012
Ewald error estimate:   0.3520E-04
------------------------------------------------------------------------------


**DV/DL**, AVERAGES OVER   50000 STEPS


NSTEP =    50000   TIME(PS) =     450.000  TEMP(K) =     0.00  PRESS =     0.0
Etot   =         0.0000  EKtot   =         0.0000  EPtot      =       -26.2720
BOND   =         0.0000  ANGLE   =         0.0000  DIHED      =         0.0000
1-4 NB =         0.0000  1-4 EEL =        -8.4978  VDWAALS    =         0.0000
EELEC  =       -17.7743  EHBOND  =         0.0000  RESTRAINT  =         0.0000
DV/DL  =       -26.2720
Ewald error estimate:   0.0000E+00


    ___________________

Here is our first attempt.

for X in 1 2 3
do
  var1='awk -f capture_dvdl.awk'
  var2="$var2 input${X}.dat"
  var3=">> output${X}.dat"

  echo "$var1 $var2 $var3" > average_preparation.sh
  sh average_preparation.sh
done

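capture_dvdl.awk is as follows

#!/bin/awk -f

# First value: DV/DL line expected at line 2603 of each file
{
  if ($1 ~ /^DV\/DL/ && FNR == 2603) {
    printf("%14.4f", $3)
  }
}

# Second value: DV/DL line expected at line 2618 of each file
{
  if ($1 ~ /^DV\/DL/ && FNR == 2618) {
    printf("%14.4f\n", $3)
  }
}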

On successive iterations this writes average_preparation.sh as

awk -f capture_dvdl.awk input1.dat >> output1.dat

awk -f capture_dvdl.awk input1.dat input2.dat >> output2.dat

However, the third-last and fourth-last "DV/DL" are not necessarily on lines 2603 and 2618, so this trick does not work.
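One way to see this is to list the line number of every match in each file. A minimal sketch, where dvdl_lines is a hypothetical helper name and the pattern is the same /^DV\/DL/ used above:

```shell
# dvdl_lines FILE...: print "file:line: text" for every line that starts with DV/DL.
# FNR is awk's per-file line counter, so numbers restart for each input file.
dvdl_lines() {
  awk '/^DV\/DL/ { print FILENAME ":" FNR ": " $0 }' "$@"
}
```

Calling dvdl_lines input1.dat input2.dat input3.dat shows whether the matches fall on the same line numbers in every file.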

The second trick is this

 awk '$1 ~ /^DV\/DL/ {printf("%14.4f\n",$3)}' input1.dat input2.dat input3.dat | tail -n 4 | head -n 2 > output.dat 

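However, this only prints the third-last and fourth-last DV/DL of input3.dat.

Could any expert offer some suggestions? Thanks!

3 answers:

Answer 0 (score: 1)

Based on the exact wording of the question ("third and fourth last") and the sample code, which shows the target text starting in column 1, this might work: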

for f in input1.dat input2.dat input3.dat; do
  echo $(awk '/^DV\/DL/{printf "%14.4f\n",$3}' "$f" | tail -n4 | head -n2)
done

If the file names can easily be generated with a pattern, as above, you can use:

for f in input?.dat; do
  echo $(awk '/^DV\/DL/{printf "%14.4f\n",$3}' "$f" | tail -n4 | head -n2)
done

If you want the values printed in reverse order, add | tac after head -n2.
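The question also asks for cumulative output files (output1.dat for input1.dat, output2.dat for input1.dat and input2.dat, and so on). A minimal sketch of that, assuming the files are named input1.dat, input2.dat, ... as in the question; dvdl_pair and cumulative_dvdl are hypothetical helper names:

```shell
# dvdl_pair FILE: print the fourth-last and third-last lines starting with DV/DL
# from FILE, joined on one line (same awk/tail/head pipeline as above).
dvdl_pair() {
  awk '/^DV\/DL/ { printf "%14.4f\n", $3 }' "$1" | tail -n 4 | head -n 2 | paste -s -d ' ' -
}

# cumulative_dvdl N: for X = 1..N, write outputX.dat with one line per input file 1..X.
cumulative_dvdl() {
  n=$1
  x=1
  while [ "$x" -le "$n" ]; do
    if [ "$x" -gt 1 ]; then
      cp "output$((x - 1)).dat" "output${x}.dat"   # start from the previous cumulative result
    else
      : > "output${x}.dat"                          # first file: start empty
    fi
    dvdl_pair "input${x}.dat" >> "output${x}.dat"   # append this file's pair
    x=$((x + 1))
  done
}
```

With three input files, cumulative_dvdl 3 would write output1.dat, output2.dat and output3.dat, overwriting any existing files with those names.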

Answer 1 (score: 1)

This sounds like a very simple problem described in a very complicated way. Let's start with this:

$ awk '/DV\/DL.*=/{val[++cnt]=$NF} END{for (i=1;i<=cnt;i++) print i, val[i]}' file
1 -26.2168
2 -26.2720
3 2.6879
4 -26.2720

$ awk '/DV\/DL.*=/{val[++cnt]=$NF} END{print "third:", val[3]}' file
third: 2.6879

$ awk '/DV\/DL.*=/{val[++cnt]=$NF} END{print "fourth-last:", val[cnt-3]}' file
fourth-last: -26.2168

$ awk '/DV\/DL.*=/{val[++cnt]=$NF} END{print "last:", val[cnt]}' file
last: -26.2720

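Now, what exactly do you need to do, and why?

The above was generated from the input file you posted, with the ** removed from the DV/DL lines, since I assume you added them in an attempt to make the text bold (if so, please remove them, as they make things more confusing).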
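The same array idea can be applied per file when several files are passed at once. The following is only a sketch under assumptions: it keeps the question's /^DV\/DL/ pattern and $3 field, so the "DV/DL, AVERAGES OVER ..." heading line also counts as a match (its value prints as 0.0000), and dvdl_from_end is a hypothetical helper name:

```shell
# dvdl_from_end FILE...: for each file, print the fourth-last and third-last
# DV/DL matches on one line. A new file is detected when FNR resets to 1.
dvdl_from_end() {
  awk '
    function report() {
      # val[cnt-3] is the fourth-last match, val[cnt-2] the third-last
      if (cnt >= 4) printf "%14.4f %14.4f\n", val[cnt-3], val[cnt-2]
      cnt = 0
    }
    FNR == 1 && NR > 1 { report() }     # a new file starts: report the previous one
    /^DV\/DL/ { val[++cnt] = $3 }       # remember every match of the current file
    END { report() }                    # report the last file
  ' "$@"
}
```

Files with fewer than four matches are skipped silently. If the stricter pattern /DV\/DL.*=/ is used instead, the heading line no longer matches and the positions counted from the end shift by one.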

Answer 2 (score: 0)

Fourth then third, as requested:

sed -n '\|^DV/DL| H
$ {x;s|\nDV/DL *= *|;|g;s/\([^;]*;\)\{2\}\([^;]*\);\([^;]*\);.*/\3 \2/p;}' YourFile
2.6879 -26.2720

Third then fourth, as in your sample:

sed -n '\|^DV/DL| H
$ {x;s|\nDV/DL *= *|;|g;s/\([^;]*;\)\{2\}\([^;]*\);\([^;]*\);.*/\2 \3/p;}' YourFile
-26.2720 2.6879