Formatting date strings with gawk?

Time: 2017-06-07 10:13:23

Tags: bash awk gawk

I run into a problem when running this code:

gawk 'BEGIN{FS=";";RS="\r\n"}
        {
            for (i = 1; i <= NF; i++) {
                if(match($i, /([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})\.([0-9]{6})/, m)){
                    $i =  m[1]"-"m[2]"-"m[3]" " m[4]":"m[5]":"m[6]
                    printf $0 "\n"
                }

            }
        }' contact20.txt > cleaned.txt

Input:

3;0952;2001-03-22-11.56.13.514119;2;2014-09-21-10.25.58.918626;J;2015-12-27-14.17.45.593190;N;0;0001-01-01-00.00.00.000000;N;2014-09-21-10.25.58.918626;2012-11-03-21.52.55.270989;N;0001-01-01-00.00.00.000000

I get:

3 0952 2001-03-22 11:56:13 2 2014-09-21-10.25.58.918626 J 2015-12-27-14.17.45.593190 N 0 0001-01-01-00.00.00.000000 N 2014-09-21-10.25.58.918626 2012-11-03-21.52.55.270989 N 0001-01-01-00.00.00.000000

But the result should look like this:

3;0952;2001-03-22 11:56:13;2;2014-09-21 10:25:58;J;2015-12-27 14:17:45;N;0;0001-01-01 00:00:00;N;2014-09-21 10:25:58;2012-11-03 21:52:55;N;0001-01-01 00:00:00

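I can't figure out why the ; characters are removed from the string and why the later date strings, such as 0001-01-01-00.00.00.000000, are ignored. Does match only match the first one?

What do I need to change to make this work properly?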

2 answers:

Answer 0: (score: 1)

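Your current approach prints the line once for every matching field in the loop, so the first line printed has only the first date converted. The semicolons disappear because assigning to $i makes awk rebuild $0 with OFS, which defaults to a single space.
To get the desired result as a single line with the converted "date" values, set OFS as well and print once after the loop: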

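gawk 'BEGIN{ FS=OFS=";" }   # OFS=";" keeps the separators when $0 is rebuilt
     {  for (i = 1; i <= NF; i++) {
            # the 3-argument match() is a gawk extension; m[1]..m[6] hold the captured groups
            if(match($i, /([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})\.([0-9]{6})/, m)){
                $i =  m[1]"-"m[2]"-"m[3]" " m[4]":"m[5]":"m[6]
            }
        }
     }1' contact20.txt > cleaned.txt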
cat cleaned.txt
3;0952;2001-03-22 11:56:13;2;2014-09-21 10:25:58;J;2015-12-27 14:17:45;N;0;0001-01-01 00:00:00;N;2014-09-21 10:25:58;2012-11-03 21:52:55;N;0001-01-01 00:00:00
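
The two effects seen in the question (vanishing semicolons and a line printed per match) can be reproduced in isolation. The following is only a minimal sketch: the record `a;b;c` and the variable names are made up for illustration, and `toupper()` stands in for the date conversion so that it runs with any POSIX awk.

```shell
# Hypothetical three-field record, used only for illustration.
line='a;b;c'

# FS is ";" but OFS is left at its default (one space): assigning to $2
# makes awk rebuild $0 with spaces, so the semicolons disappear.
default_ofs=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ";" } { $2 = "X"; print }')

# Printing inside the loop emits the record once per modified field.
print_in_loop=$(printf '%s\n' "$line" |
  awk 'BEGIN { FS = OFS = ";" } { for (i = 1; i <= NF; i++) { $i = toupper($i); print } }')

# FS=OFS=";" plus a trailing 1 (always-true pattern, default action print)
# prints the fully updated record exactly once.
print_once=$(printf '%s\n' "$line" |
  awk 'BEGIN { FS = OFS = ";" } { for (i = 1; i <= NF; i++) $i = toupper($i) } 1')

printf '%s\n---\n%s\n---\n%s\n' "$default_ofs" "$print_in_loop" "$print_once"
```

The first result is `a X c`, the second is three progressively updated lines, and the third is the single line `A;B;C`.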

Answer 1: (score: 1)

You don't need a loop; all you need is:

$ gawk '{print gensub(/([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})\.([0-9]{6})/,"\\1-\\2-\\3 \\4:\\5:\\6","g")}' file
3;0952;2001-03-22 11:56:13;2;2014-09-21 10:25:58;J;2015-12-27 14:17:45;N;0;0001-01-01 00:00:00;N;2014-09-21 10:25:58;2012-11-03 21:52:55;N;0001-01-01 00:00:00

Of course, this is also easy to do with sed:

$ sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})\.([0-9]{6})/\1-\2-\3 \4:\5:\6/g' file
3;0952;2001-03-22 11:56:13;2;2014-09-21 10:25:58;J;2015-12-27 14:17:45;N;0;0001-01-01 00:00:00;N;2014-09-21 10:25:58;2012-11-03 21:52:55;N;0001-01-01 00:00:00

The above uses GNU awk for gensub() and GNU or OSX sed for -E.
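
To try the substitution without the full file, something like the sketch below may help. It assumes a sed that accepts -E; the function name `fix_timestamps` is hypothetical, and the sample is a shortened version of the input from the question.

```shell
# fix_timestamps is a hypothetical helper name; it reads records on stdin
# and rewrites every YYYY-MM-DD-HH.MM.SS.ffffff stamp as YYYY-MM-DD HH:MM:SS.
fix_timestamps() {
  sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})-([0-9]{2})\.([0-9]{2})\.([0-9]{2})\.([0-9]{6})/\1-\2-\3 \4:\5:\6/g'
}

# Shortened version of the sample input from the question.
sample='3;0952;2001-03-22-11.56.13.514119;2;0001-01-01-00.00.00.000000;N'
converted=$(printf '%s\n' "$sample" | fix_timestamps)
printf '%s\n' "$converted"
```

Because the substitution works on the whole line and never assigns to a field, the ; separators are left untouched.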