I have multiple columns in the following format:
D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"
I want to reformat them as follows:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
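I tried using awk -F "[,/:]" to split the columns and then handle each piece according to its length, but that becomes tedious when there are many columns.
Is there a date/time or timestamp formatting option in awk that would let me process the data quickly, column by column?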
Answer 0 (score: 4)
$ cat tst.awk
function fmt(t,    f) {
    split(t,f,/["\/ :]/)
    return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"",f[2],f[3],f[4],f[5],f[6],f[7],f[8])
}
BEGIN { FS=OFS="," }
{ $2=fmt($2); $4=fmt($4); print }
$ awk -f tst.awk file
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
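When the timestamps are not always in fields 2 and 4, the same fmt() function can be applied to every field that looks like a quoted timestamp. The following is only a sketch under assumptions: the wrapper name fmt_all and the matching regex are illustrative and not part of the answer above, and it assumes no field contains an embedded comma.

```shell
# fmt_all is a hypothetical wrapper name; it reads the files given as
# arguments, or standard input when none are given.
fmt_all() {
  awk '
    # Split the quoted timestamp on quote, slash, space and colon, then
    # rebuild it with zero-padded numeric parts. f[1] is the empty string
    # before the opening quote, so the date starts at f[2].
    function fmt(t,    f) {
      split(t, f, /["\/ :]/)
      return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"", f[2], f[3], f[4], f[5], f[6], f[7], f[8])
    }
    BEGIN { FS = OFS = "," }
    {
      # Assumed pattern: "d/m/yyyy h:m:s AM|PM" wrapped in double quotes.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+:[0-9]+ [AP]M"$/)
          $i = fmt($i)
      print
    }
  ' "$@"
}

printf '%s\n' 'X,"4/2/2017 2:45:56 PM",7,ee,"5/01/2017 8:29:46 PM"' | fmt_all
```

Fields that do not match the pattern, such as the bare 7 in this input, are passed through unchanged.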
Answer 1 (score: 3)
I suggest using awk and its printf to format the output:
awk -F '["/ :]' '{printf "%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16}' file
Output:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
Answer 2 (score: 0)
Using GNU awk (split with its seps argument). Code:
function doit(str,    a, seps, n, j, b) {   # extra parameters act as local variables
    gsub(/"/, "", str)                      # remove quotes
    n = split(str, a, "[/ :]", seps)        # split on special chars, keeping the separators (GNU awk)
    for (j = 1; j <= n; j++) {              # loop over all elements in a
        if (a[j] ~ /^[0-9]+$/)              # zero-pad purely numeric elements
            a[j] = sprintf("%02d", a[j])
        b = b a[j] seps[j]                  # append the element and the separator that followed it
    }
    return "\"" b "\""                      # return the result re-quoted
}
BEGIN { FS = OFS = "," }
{
    for (i = 2; i <= NF; i += 2)            # every even-numbered field holds a timestamp
        $i = doit($i)
}
1
Run it:
$ awk -f program.awk file
Output:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
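Answer 3 (score: 0)
You can also use sed to replace every single digit that sits between word boundaries with the same digit prefixed by 0. Note that this changes any single-digit number in the data, even outside the date columns, so use it only if you want every single digit zero-padded:
sed 's|\b\([[:digit:]]\)\b|0\1|g'
To make the change permanent (edit the file in place), use the -i option of sed.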
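The caveat about non-date columns can be seen with a small test line. This is a sketch only: the helper name pad_digits and the input line are made up for illustration, and \b is a GNU sed extension, so GNU sed is assumed.

```shell
# pad_digits is a hypothetical helper name wrapping the sed command above.
# Assumes GNU sed, because \b (word boundary) is a GNU extension.
pad_digits() {
  sed 's|\b\([[:digit:]]\)\b|0\1|g'
}

# The lone 5 in the second field is padded too, although it is not a date.
# The 9 in x9 is left alone because no word boundary precedes it.
printf '%s\n' 'A,5,"4/2/2017 2:45:56 PM",x9' | pad_digits
```

With GNU sed this prints A,05,"04/02/2017 02:45:56 PM",x9, which shows why the awk answers are safer when other columns can contain single digits.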
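How it works:
The regex \b\([[:digit:]]\)\b matches a single digit between word boundaries and captures it with the escaped parentheses \( \). In the replacement part of the sed command, the literal 0 followed by \1 (the first captured group) pads that digit with a leading 0.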
Regex demo
To see how this regex works, refer to the regex demo.
Working example:
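The original example did not survive here, so the following is a reconstruction under assumptions rather than the answer's own transcript. It runs the sed command on two rows copied from the question, again assuming GNU sed; the variable name result is illustrative.

```shell
# Two sample rows from the question are fed to sed through a here-document.
# The variable name result is illustrative only.
result=$(sed 's|\b\([[:digit:]]\)\b|0\1|g' <<'EOF'
D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"
EOF
)
printf '%s\n' "$result"
```

Both rows come out with every day, month, hour, minute and second padded to two digits, matching the desired output in the question.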