Formatting and replacing timestamp columns with awk

Date: 2017-05-18 04:26:53

Tags: bash shell unix awk

I have several columns in the following format:

D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"

I want to format them as follows:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"

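I tried splitting the columns with awk -F "[,/:]" and then padding each piece according to its length, but that becomes tedious when there are many columns.

Is there a date-time or timestamp formatting option in awk that would let me process the data column by column, quickly?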

4 Answers:

Answer 0 (score: 4)

$ cat tst.awk
function fmt(t,    f) {
    split(t,f,/["\/ :]/)
    return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"",f[2],f[3],f[4],f[5],f[6],f[7],f[8])
}
BEGIN { FS=OFS="," }
{ $2=fmt($2); $4=fmt($4); print }

$ awk -f tst.awk file
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
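Since the question mentions having many columns, here is a minimal sketch of one way to avoid listing $2, $4, ... by hand: it reuses the same fmt() idea but applies it to every field that looks like a quoted timestamp. The field-matching regex and the two sample lines are assumptions for illustration, not part of the answer above, and the approach assumes no field contains an embedded comma.

```shell
# Hypothetical sample input: two lines taken from the question.
input='D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"'

out=$(printf '%s\n' "$input" | awk '
# Same idea as fmt() above: split the quoted timestamp and zero-pad each part.
function fmt(t,    f) {
    split(t, f, /["\/ :]/)
    return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"", f[2], f[3], f[4], f[5], f[6], f[7], f[8])
}
BEGIN { FS = OFS = "," }
{
    # Reformat only the fields that look like a quoted m/d/y h:m:s AM|PM value.
    for (i = 1; i <= NF; i++)
        if ($i ~ /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+:[0-9]+ [AP]M"$/)
            $i = fmt($i)
    print
}')
printf '%s\n' "$out"
```

Fields that do not match the pattern (D and ee here) pass through unchanged, so the script does not need to know which column numbers hold timestamps.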

Answer 1 (score: 3)

I suggest using awk and its printf to format the output:

awk -F '["/ :]' '{printf "%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16}' file

Output:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
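To see why the printf above needs sixteen arguments, it may help to look at how the separator ["/ :] cuts up one line. The sketch below prints the field count and a few selected fields for a sample line from the question; the variable names are only illustrative.

```shell
line='D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"'

# Print the field count, then a few fields, to see where each piece lands.
fields=$(printf '%s\n' "$line" | awk -F '["/ :]' '{ print NF; print $1; print $2; print $8; print $9 }')
printf '%s\n' "$fields"
```

The commas are not separators here, so they stay attached to $1 (D,) and $9 (,ee,), which is why the format string does not spell them out. The closing quote at the end of the line produces an empty 17th field that the printf simply ignores.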

Answer 2 (score: 0)

Using GNU awk (for the seps argument of split). Code:

function doit(str,    a,seps,n,j,b) {           # locals: a, seps, n, j, b (b is the output buffer)
    gsub(/\"/,"",str);                          # remove quotes
    n=split(str,a,"[/ :]",seps);                # split on special chars
    for(j=1;j<=n;j++) {                         # loop all elements in a
        if(a[j]~/^[0-9]+$/)                     # process all number elements
            a[j]=sprintf("%02d", a[j]) seps[j]; # zeropad
        b=b a[j]                                # gather buffer
    }
    return "\"" b "\""                          # return quoted
}
BEGIN { FS=OFS="," }
{
    for(i=2;i<=NF;i+=2)                         # loop the right ones
        $i=doit($i)                             # call the contractor
}
1

Run it:

$ awk -f program.awk file

Output:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
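The four-argument split() with seps is a GNU awk extension. If gawk is not available, a similar effect can be sketched with match() and substr(), which exist in POSIX awk. This is a different technique from the answer above, shown only as an assumption-laden alternative; the function name pad and the sample line are hypothetical.

```shell
line='D,"4/2/2017 02:3:26 AM",ee,"09/2/2017 6:05:54 AM"'

padded=$(printf '%s\n' "$line" | awk '
# Zero-pad every run of digits in s to at least two digits, keeping all other text.
function pad(s,    out) {
    out = ""
    while (match(s, /[0-9]+/)) {
        out = out substr(s, 1, RSTART - 1) sprintf("%02d", substr(s, RSTART, RLENGTH))
        s = substr(s, RSTART + RLENGTH)
    }
    return out s
}
BEGIN { FS = OFS = "," }
{
    for (i = 2; i <= NF; i += 2)    # same even-numbered fields as the answer above
        $i = pad($i)
    print
}')
printf '%s\n' "$padded"
```

Because the text between the numbers is copied through untouched, the quotes, slashes, colons and the AM/PM suffix need no special handling.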

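Answer 3 (score: 0)

You can also use sed to prefix every single digit that stands between word boundaries with a 0. Note that this changes every single-digit number in the data, even one outside the date columns, so use it only if you want all single-digit numbers zero-padded: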

sed 's|\b\([[:digit:]]\)\b|0\1|g'

If you want the change to be permanent (written back to the file), use sed with the -i option.
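As a rough sketch of the in-place variant, the example below runs the same substitution with -i.bak on a scratch file, so the original content is kept in a backup. The scratch file and variable names are assumptions for illustration, and \b is a GNU sed extension that other sed implementations may not support.

```shell
# Hypothetical scratch file so no real data is touched.
tmp=$(mktemp)
printf '%s\n' 'D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"' > "$tmp"

# -i.bak edits the file in place and keeps the original as "$tmp.bak".
sed -i.bak 's|\b\([[:digit:]]\)\b|0\1|g' "$tmp"

edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")
printf '%s\n%s\n' "$edited" "$backup"
rm -f "$tmp" "$tmp.bak"
```

Keeping a backup suffix is a cautious default, since this substitution touches every single-digit number in the file.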

  

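How it works.

The regex \b\([[:digit:]]\)\b matches a single digit that stands between two word boundaries and captures it with the \(...\) group. In the replacement part of the sed command, the hard-coded 0 followed by the first captured group, \1, zero-pads that single digit for you.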
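The behaviour described above, including the caveat about non-date columns, can be checked on a single line. In this sketch the trailing ,7 column is a hypothetical addition to the question's data, included only to show that it gets padded as well; \b is assumed to be available (GNU sed).

```shell
# Sample line from the question plus a hypothetical extra numeric column (7).
line='D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 02:3:26 AM",7'
result=$(printf '%s\n' "$line" | sed 's|\b\([[:digit:]]\)\b|0\1|g')
printf '%s\n' "$result"
```

Multi-digit numbers such as 2017, 45 and 02 are left alone because no single digit in them has a word boundary on both sides, while the standalone 7 becomes 07 even though it is not part of a timestamp.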

  

Regex demo

To see how this regex works, refer to the regex demo.

  
