Formatting data as CSV with awk

Time: 2017-04-11 04:37:03

Tags: bash awk scripting

I am new to this kind of work, so I would appreciate expert advice on how to accomplish this task, or on the best way to approach it.

Below is a sample of the extracted raw data.

[root@dcconnect ~]# cat 11.csv
  uk1
  zone1-groupa   :   others=413/600(68.8%)   rhel=8/360(2.2%)   windows=74/300(24.7%)
  au1
  zone1-groupa   :   oracle-se2-rhel=0/60(0.0%)   others=166/240(69.2%)   rhel=4/360(1.1%)   windows=27/180(15.0%)
  de1
  zone1-groupa   :   oracle-se2-rhel=0/60(0.0%)   others=204/240(85.0%)   rhel=33/360(9.2%)   windows=106/180(58.9%)
  jp2
  zone1-groupa   :   others=641/780(82.2%)   rhel=223/420(53.1%)   windows=517/660(78.3%)
  zone1-groupb   :   oracle-se2=44/60(73.3%)   others=557/900(61.9%)   rhel=312/420(74.3%)   windows=163/600(27.2%)
  hk1
  zone1-groupa   :   oracle-se2-rhel=2/60(3.3%)   others=215/480(44.8%)   rhel=12/300(4.0%)   windows=172/360(47.8%)
  us1
  zone1-groupa   :   oracle-se2-rhel=1/60(1.7%)   others=325/480(67.7%)   rhel=5/300(1.7%)   windows=36/360(10.0%)
  zone1-groupb   :   others=76/480(15.8%)   rhel=1/480(0.2%)   windows=8/480(1.7%)
  sg1
  zone1-groupa   :   oracle-se2-rhel=19/60(31.7%)   others=390/480(81.3%)   rhel=84/360(23.3%)   windows=165/360(45.8%)
  zone1-groupb   :   others=81/480(16.9%)   rhel=33/480(6.9%)   windows=11/480(2.3%)
  jp1
  zone1-groupa   :   oracle-ee=12/60(20.0%)   others=4600/4680(98.3%)   rhel=914/1080(84.6%)   windows=2028/2100(96.6%)
  zone1-groupb   :   oracle-ee=9/60(15.0%)   oracle-se2=137/180(76.1%)   others=2409/2520(95.6%)   rhel=236/1440(16.4%)   windows=491/960(51.1%)

I need help producing the following format, or a ':'-delimited one, so that the data can be split into columns in Excel.

de1 zone1-groupa    oracle-se2-rhel 0   60  0.0%   
                    others  204 240 85.0%   
                    rhel    33  360 9.2%    
                    windows 106 180 58.9%
jp2 zone1-groupa    others  641 780 82.2%   
                    rhel    223 420 53.1%   
                    windows 517 660 78.3%
    zone1-groupb    oracle-se2  44  60  73.3%   
                    others  557 900 61.9%    
                    rhel    312 420 74.3%   
                    windows 163 600 27.2%

The data set is very large and has to be collected three times a day. The raw data is generated on a Linux system.

1 answer:

Answer 0 (score: 1)

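The following produces TSV (tab-separated) output that matches the desired output (it will not line up perfectly when printed to the console, but Excel should be able to read it with the fields as intended):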

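awk '
  # A line with a single field is a region name (e.g. "uk1"): print it without a newline.
  NF==1 { printf "%s", $1; next }
  # Any other line is a zone line: $1 is the zone, $2 is ":", and $3..NF are the categories.
  {
    printf "\t%s", $1
    sep="\t"
    for (i=3; i<=NF; ++i) {
      # Turn "others=413/600(68.8%)" into tab-separated columns.
      gsub("[=/()]", "\t", $i)
      printf "%s%s", sep, $i
      # Later categories go on their own line, indented past the region and zone columns.
      sep="\n\t\t"
    }
    printf "\n"
  }
' 11.csv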
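
If a flat file is easier to filter or pivot in Excel than the indented layout above, the same parsing idea can emit one row per category, repeating the region and zone on every row. The following is only a sketch under some assumptions: the function name `to_csv` is made up for illustration, the delimiter defaults to a comma (pass ':' as the second argument for colon-delimited output), and it assumes every category field has the `name=a/b(c%)` shape seen in the sample.

```shell
# Hypothetical helper (not part of the original answer).
# Usage: to_csv FILE [DELIMITER]   (DELIMITER defaults to ",")
to_csv() {
  awk -v OFS="${2:-,}" '
    # A line with a single field is a region name (e.g. "uk1"); remember it.
    NF == 1 { region = $1; next }
    # Zone lines look like: zone : name=a/b(c%) name=a/b(c%) ...
    NF >= 3 {
      zone = $1
      for (i = 3; i <= NF; ++i) {
        # Split "others=413/600(68.8%)" on "=", "/", "(" and ")".
        n = split($i, part, "[=/()]")
        # Skip fields that do not have the assumed shape.
        if (n >= 4)
          print region, zone, part[1], part[2], part[3], part[4]
      }
    }
  ' "$1"
}
```

For example, `to_csv 11.csv > 11_flat.csv` writes the comma-separated version, and `to_csv 11.csv ':'` prints the colon-delimited one. No header row is written, because the raw data does not say what the two numbers in each field mean.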