Awk - Extract and split data from a specific column of a comma-delimited file and group by it

Date: 2016-05-19 14:18:33

Tags: awk

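I know there are multiple posts about extraction, but what I am doing is pulling specific columns out of a comma-delimited CSV file, grouping on them, and summing two fields. I want to add a DATE column to the grouping, but that column is in MM/DD/YYYY HH:MM:SS format and I need only the date part for the group.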

Sample input:

Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column8,Column9,Column10
1/1/2016 9:05:01,O1234,APPLE,10,1.01,AAAA,BBBB,CCCC,DDDD,EEEE
1/1/2016 10:05:01,O1234,APPLE,5,0.99,AAAA,BBBB,CCCC,DDDD,EEEE

My code:

awk -F',' -v OFS=',' '
   (NR!=1) {
       a[$2","$3","$9","$10]+=$4;
       b[$2","$3","$9","$10]+=$5;
       c[$2","$3","$9","$10]+=($4*$5)
   }
   END {
       for(i in a){print i,a[i],b[i],c[i]}
   }
' data.txt >aa.txt

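I need to edit this statement so that $1 is grouped by the date alone (1/1/2016) rather than by the whole string. Adding $1 to the key as-is, as below, groups on the full timestamp: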

awk -F',' -v OFS=',' '
    (NR!=1) {
        a[$1","$2","$3","$9","$10]+=$4;
        b[$1","$2","$3","$9","$10]+=$5;
        c[$1","$2","$3","$9","$10]+=($4*$5)
    }
    END {
        for(i in a){print i,a[i],b[i],c[i]}
    }
' data.txt >aa.txt

Expected output:

1/1/2016,O1234,APPLE,DDDD,EEEE,15,2.00,15.05

1 Answer:

Answer 0: (score: 1)

$ cat tst.awk
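BEGIN { FS=OFS="," }                    # read and write comma-separated fields
NR>1 {                                  # skip the header line
    sub(/ .*/,"",$1)                    # drop the time: delete from the first space in $1 onward
    k = $1 FS $2 FS $3 FS $9 FS $10     # group key: date plus Column2, Column3, Column9, Column10
    a[k] += $4                          # running sum of Column4
    b[k] += $5                          # running sum of Column5
    c[k] += ($4*$5)                     # running sum of Column4 * Column5
}
END {
    for (k in a) {                      # iteration order is unspecified
        print k, a[k], b[k], c[k]
    }
}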

$ awk -f tst.awk file
1/1/2016,O1234,APPLE,DDDD,EEEE,15,2,15.05
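
A variation on the script above, offered as a sketch rather than a replacement: it takes the date with split() instead of sub(), so $1 and $0 are left unmodified, and it uses printf to force two decimal places, which matches the 2.00 shown in the question's expected output. The array names qty, price, and total are made up here for readability, and the input is just the question's sample rows fed inline.

```shell
# Feed the sample rows from the question straight into awk (no input file needed).
result=$(printf '%s\n' \
  'Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column8,Column9,Column10' \
  '1/1/2016 9:05:01,O1234,APPLE,10,1.01,AAAA,BBBB,CCCC,DDDD,EEEE' \
  '1/1/2016 10:05:01,O1234,APPLE,5,0.99,AAAA,BBBB,CCCC,DDDD,EEEE' |
awk 'BEGIN { FS = OFS = "," }
NR > 1 {                                  # skip the header line
    split($1, d, " ")                     # d[1] = date, d[2] = time; $1 itself is untouched
    k = d[1] FS $2 FS $3 FS $9 FS $10     # group key built from the date only
    qty[k]   += $4                        # hypothetical names for the three running sums
    price[k] += $5
    total[k] += $4 * $5
}
END {
    for (k in qty)                        # iteration order is unspecified
        printf "%s,%d,%.2f,%.2f\n", k, qty[k], price[k], total[k]
}')
echo "$result"
```

With the two sample rows this prints 1/1/2016,O1234,APPLE,DDDD,EEEE,15,2.00,15.05. When the data contains more than one group, the lines come out in whatever order for (k in ...) happens to produce, so pipe the output through sort if the order matters.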