Transforming a CSV file with an awk script

Time: 2019-10-08 01:12:13

Tags: bash awk

My CSV file looks like this:

C1, C2,   C3,Cv1,Cv2,Cv3,Cv4 ...  (there can be more value columns here)
x1, x2 ,x3.1, 1.1, 1.2, 1.3, 1.4
x1, x2, x3.2, 2.1, 2.2, 2.3, 2.4
x1, x2, x3.3, 3.1, 3.2, 3.3, 3.4

I want to convert this CSV file as follows:

C1,C2,   C3,CTEXT,XVALUE
x1, x2, x3.1, Cv1 , 1.1
x1, x2, x3.1, Cv2 , 1.2
x1, x2, x3.1, Cv3 , 1.3
x1, x2, x3.1, Cv4 , 1.4
x1, x2, x3.2, Cv1 , 2.1
x1, x2, x3.2, Cv2 , 2.2
x1, x2, x3.2, Cv3 , 2.3
x1, x2, x3.2, Cv4 , 2.4
x1, x2, x3.3, Cv1 , 3.1
x1, x2, x3.3, Cv2 , 3.2
x1, x2, x3.3, Cv3 , 3.3
x1, x2, x3.3, Cv4 , 3.4

Here is my code:

#!/bin/bash
awk -F, -v OFS=, '{ if (NR==1)
{ print $1,$2,$3, "CTEXT","XVALUE"
  i=4; while (i < NF) {
   a[i]=$i; i=i+1
  }
  am=NF; next
}
i=4 ; while (i < am) {
  if (i > NF) {print "record "NR" insufficient value" >/dev/stderr
  break}
  print $1,$2,$3,a[i],$i
  i=i+1
  }
if (am <NF) print "record "NR" too many values for text" >/dev/stderr
}' input.csv

When I run the script, it shows an error:

awk: syntax error near line 2
awk: bailing out near line 2


Edit by Ed Morton: I just ran the script through a beautifier (gawk -o- '...') so it's easier to read/understand:

{
    if (NR == 1) {
        print $1, $2, $3, "CTEXT", "XVALUE"
        i = 4
        while (i < NF) {
            a[i] = $i
            i = i + 1
        }
        am = NF
        next
    }
    i = 4
    while (i < am) {
        if (i > NF) {
            print("record " NR " insufficient value") > (/dev/) stderr
            break
        }
        print $1, $2, $3, a[i], $i
        i = i + 1
    }
    if (am < NF) {
        print("record " NR " too many values for text") > (/dev/) stderr
    }
}

2 Answers:

Answer 0 (score: 2)

Even after switching from Solaris awk to gawk or nawk, some problems remain. Please try the following:

awk -F, -v OFS=, '
NR==1 {
    print $1,$2,$3, "CTEXT","XVALUE"
    for (i = 4; i <= NF; i++) a[i]=$i
    am=NF; next
}
{
    if (am < NF) {
        print "record "NR" too many values for text" > "/dev/stderr"
        next
    }
    for (i = 4; i <= am; i++) {
        if (i > NF) {
            print "record "NR" insufficient value" > "/dev/stderr"
            break
        }
        print $1,$2,$3,a[i],$i
    }
}' input.csv
  • You need to increment i up to and including NF or am (use <=, not <).
  • Enclose /dev/stderr in quotes.
  • It is better to use a for loop than a while loop.

Hope this helps.
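
As a minimal sketch of the first two points, the snippet below runs both loop bounds on a header line and then checks where a quoted "/dev/stderr" message ends up. The sample line and the shell variable names lt, le, and msg are made up for illustration and are not part of the answer above.

```shell
# Header with three key columns followed by four value columns.
line='C1,C2,C3,Cv1,Cv2,Cv3,Cv4'

# "<" stops one short: the last field (Cv4) is never visited.
lt=$(printf '%s\n' "$line" |
     awk -F, '{ s = ""; for (i = 4; i < NF; i++) s = s (s == "" ? "" : " ") $i; print s }')

# "<=" visits every field up to and including NF.
le=$(printf '%s\n' "$line" |
     awk -F, '{ s = ""; for (i = 4; i <= NF; i++) s = s (s == "" ? "" : " ") $i; print s }')

echo "with <  : $lt"    # with <  : Cv1 Cv2 Cv3
echo "with <= : $le"    # with <= : Cv1 Cv2 Cv3 Cv4

# Quoted "/dev/stderr" is treated as a file name, so the message goes to
# standard error; with stderr discarded, nothing is captured from stdout.
msg=$(printf '%s\n' "$line" |
      awk '{ print "record " NR " insufficient value" > "/dev/stderr" }' 2>/dev/null)
echo "stdout captured: [$msg]"
```

Without the quotes, > /dev/stderr is not a file name to awk. As the beautified listing in the question shows, gawk reads it as the regex /dev/ followed by a variable named stderr.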

Answer 1 (score: 0)

Something like this:

$ awk -F, 'BEGIN {OFS=FS} 
           NR==1 {n=split($0,h); 
                  print $1,$2,$3,"CTEXT","XVALUE"; 
                  next} 
           n!=NF {print n<NF?"too many":"not enough"; 
                  exit} 
                 {for(i=4;i<=NF;i++) print $1,$2,$3,h[i],$i}' file

C1,C2,C3,CTEXT,XVALUE
x1,x2,x3.1,Cv1,1.1
x1,x2,x3.1,Cv2,1.2
x1,x2,x3.1,Cv3,1.3
x1,x2,x3.1,Cv4,1.4
x1,x2,x3.2,Cv1,2.1
x1,x2,x3.2,Cv2,2.2
x1,x2,x3.2,Cv3,2.3
x1,x2,x3.2,Cv4,2.4
x1,x2,x3.3,Cv1,3.1
x1,x2,x3.3,Cv2,3.2
x1,x2,x3.3,Cv3,3.3
x1,x2,x3.3,Cv4,3.4
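
The output above has no spaces around the commas, whereas the sample input in the question does. If that spacing should be stripped, one possible sketch is to make the field separator a regular expression that swallows blanks around each comma. This assumes that fields never contain embedded commas or quotes. The function name unpivot is hypothetical, and the column-count check from the answers is left out for brevity.

```shell
# Hypothetical helper: reads CSV on stdin, writes the long-format CSV to stdout.
unpivot() {
    awk -F'[ \t]*,[ \t]*' -v OFS=, '
    # Trim blanks at both ends of the record; modifying $0 re-splits the fields.
    { gsub(/^[ \t]+|[ \t]+$/, "") }
    NR == 1 {
        n = split($0, h)                    # remember header names by position
        print $1, $2, $3, "CTEXT", "XVALUE"
        next
    }
    # One output row per value column, labelled with its header name.
    { for (i = 4; i <= NF; i++) print $1, $2, $3, h[i], $i }
    '
}

# Sample input from the question, irregular spacing included.
out=$(unpivot <<'EOF'
C1, C2,   C3,Cv1,Cv2,Cv3,Cv4
x1, x2 ,x3.1, 1.1, 1.2, 1.3, 1.4
x1, x2, x3.2, 2.1, 2.2, 2.3, 2.4
x1, x2, x3.3, 3.1, 3.2, 3.3, 3.4
EOF
)
printf '%s\n' "$out"
```

Because the separator regex only removes blanks next to a comma, the gsub call is what takes care of leading blanks on the first field and trailing blanks on the last one.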