I have a working script, but how can I make it more elegant?

Time: 2015-01-26 12:15:37

Tags: awk

Some background: I have two files (A and B) that contain the data I need to extract.

From file A, I only need the last two lines, which look like this:

RMM:  17    -0.221674395053E+01    0.59892E-04    0.00000E+00    31   0.259E-03
    1 F= -.22167440E+01 E0= -.22167440E+01  d E =-.398708E-10  mag=     2.0000

I need to extract the following numbers:

-1st Line, 2nd field (17)
-1st Line 4th field (0.59892E-04)
-2nd Line, 1st field (1)
-2nd Line, 3rd field (-.22167440E+01)
-2nd Line, 5th field (-.22167440E+01)
-2nd Line, 8th field (-.398708E-10)
-2nd Line, 10th field (2.0000)

From file B, I only need the last 11 lines, which look like this:

                  Total CPU time used (sec):        0.364
                        User time (sec):        0.355
                      System time (sec):        0.009
                     Elapsed time (sec):        1.423

               Maximum memory used (kb):        9896.
               Average memory used (kb):           0.

                      Minor page faults:         2761
                      Major page faults:            4
             Voluntary context switches:           24

I need to extract the following numbers:

 -1st line, 6th field (0.364)
 -2nd line, 4th field (0.355)
 -3rd line, 4th field (0.009)
 -4th line, 4th field (1.423)
 -6th line, 5th field (9896.)
 -7th line, 5th field (0.)

My output should look like this:

mainfolder1[tab/space]subfolder1[tab/space][all the extracted info separated by tab]
mainfolder2[tab/space]subfolder2[tab/space][all the extracted info separated by tab]
mainfolder3[tab/space]subfolder3[tab/space][all the extracted info separated by tab]
...
mainfoldern[tab/space]subfoldern[tab/space][all the extracted info separated by tab]

Here is my current script:

for m in ./*/; do
    main=$(basename "$m")
    for s in "$m"*/; do
        sub=$(basename "$s")
        vdata=$(tail -n2 ./$main/$sub/A | awk -F'[ =]+' NR==1'{a=$2;b=$4;next}{print s,a,$2,$4,$6,$9, $11}')
        ctime=$(tail -n11 ./$main/$sub/B |head -n1|awk '{print $6}')
        utime=$(tail -n10 ./$main/$sub/B |head -n1|awk '{print $4}')
        stime=$(tail -n9 ./$main/$sub/B |head -n1|awk '{print $4}')
        etime=$(tail -n8 ./$main/$sub/B |head -n1|awk '{print $4}')
        maxmem=$(tail -n6 ./$main/$sub/B |head -n1|awk '{print $5}')
        avemem=$(tail -n5 ./$main/$sub/B |head -n1|awk '{print $5}')
        c=$(echo $sub| cut -c 2-)
        echo "$m $c $vdata $ctime $utime $stime $etime $maxmem $avemem"
    done
done > output

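Now, the vdata line is actually "recycled" from an earlier forum question, and I don't fully understand it. I'd like my code for file B to be as elegant as the awk code for file A. How can I do that? Thanks! :)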

3 answers:

Answer 0 (score: 0)

For file B, try something like:

tail -n11 B | awk -F':' '{ print $2 }'

If you need to keep the values and echo them later, you can do the following:

# Word splitting drops the blank lines, so the array holds only the values.
array=($(tail -n11 B | awk -F':' '{ print $2 }'))
for value in "${array[@]}"
do
    echo "$value"
done

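Answer 1 (score: 0)

You should look into find and xargs, because any time you write a loop in shell just to manipulate text, you have the wrong approach. But to keep it simple and retain your original structure, it sounds like you could use something like: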

for m in ./*/; do
  main=$(basename "$m")
  for s in "$m"*/; do
    sub=$(basename "$s")
    fileA="${main}/${sub}/A"
    fileB="${main}/${sub}/B"
    awk -v sizeA=$(wc -l < "$fileA") -v sizeB=$(wc -l < "$fileB") '
        NR==FNR {
            if ( FNR == (sizeA-1) ) { split($0,p) }
            if ( FNR == sizeA )     { split($0,a) }
            next
        }
        { b[sizeB + 1 - FNR] = $NF }
        END {
            split(FILENAME,f,"/")
            print f[1], f[2], p[2], p[4], a[1], a[3], a[5], a[8], a[10], b[11], b[10], b[9], b[8], b[6], b[5]
        }
    ' "$fileA" "$fileB"
  done
done > output

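Note that the awk command above reads each "B" file just once instead of 6 times (wc makes one extra pass to count its lines).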
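As a rough sketch of the same single-pass idea without the wc calls, the last lines of both files can be piped into one awk process. The function name summarize_run and its directory argument are made up for illustration, and the sketch assumes A has at least 2 lines and B at least 11, laid out as in the question:

```shell
# Hypothetical helper: $1 is a directory that contains the files A and B.
summarize_run() {
    { tail -n 2 "$1/A"; tail -n 11 "$1/B"; } | awk -v OFS='\t' '
        # Input line 1: second-to-last line of A (fields 2 and 4).
        NR == 1 { out = $2 OFS $4; next }
        # Input line 2: last line of A. Turning "=" into a space detaches
        # the sign from "=-.398708E-10" so the value becomes field 8.
        NR == 2 { gsub(/=/, " "); out = out OFS $1 OFS $3 OFS $5 OFS $8 OFS $10; next }
        # Input lines 3-9: first seven of the last 11 lines of B.
        # Blank lines are skipped; the value is always the last field.
        NR <= 9 && NF { out = out OFS $NF }
        END { print out }
    '
}
```

Inside the nested loop it could be called as: printf '%s\t%s\t' "$main" "$sub"; summarize_run "$main/$sub"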

Answer 2 (score: 0)

awk 'NR==1{print $6} NR==2{print $4} NR==3{print $4} ...'

You can simplify that with:

NR==2 || NR==3 || NR==4

But that seems hard to maintain. Or you could use an array:

awk 'BEGIN{a[1]=6;a[2]=4...} NR in a{ print $a[NR]}'
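Filling in that array idea for the six fields listed in the question might look like the sketch below. The function name extract_b and the array name want are placeholders, not part of the original answer:

```shell
# Hypothetical helper: $1 is the path to a file laid out like B.
extract_b() {
    tail -n 11 "$1" | awk '
        # Map each wanted line number (within the last 11 lines)
        # to the field number to print from that line.
        BEGIN { want[1] = 6; want[2] = 4; want[3] = 4; want[4] = 4; want[6] = 5; want[7] = 5 }
        # Print the chosen field, tab-separated, all on one output line.
        NR in want { printf "%s%s", sep, $(want[NR]); sep = "\t" }
        END { print "" }
    '
}
```

Adding or changing a field then only means editing the BEGIN block.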

But I think what you really want is just:

awk '{print $NF}' ORS=\\t

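(You don't want the 6th field of line 1. You want the last field.)

Rather than trying to collect the output into variables so you can echo it, add ORS=\\t to get tab-separated output and just let it print to the script's stdout.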
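Put together, a sketch of that suggestion for file B could look like this. The helper name print_row is hypothetical, head -n 7 is an added assumption that trims the page-fault and context-switch lines the question does not ask for, and -v ORS is used here in place of the trailing ORS=\\t operand (both set the same variable):

```shell
# Hypothetical helper: $1 = main folder, $2 = subfolder,
# both relative to the current directory.
print_row() {
    # Folder names first, each followed by a tab.
    printf '%s\t%s\t' "$1" "$2"
    # Last field of every non-blank line, each followed by a tab (ORS).
    tail -n 11 "$1/$2/B" | head -n 7 | awk -v ORS='\t' 'NF { print $NF }'
    # Finish the row with a newline.
    echo
}
```

Each row written this way ends with a trailing tab before the newline, which most tab-separated consumers tolerate. Nothing is stored in variables: the row goes straight to stdout, so the surrounding loop can still end with "done > output".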