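I have the following awk command (output shown below) that converts a DAT file to CSV using a specific FS, but I would like to split each field containing a date-time value into two separate fields and add :00 seconds to the time format.
awk command: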
awk 'BEGIN{FS="\024"; OFS = ","; ORS = "\n"} {gsub(/\376/, "\"", $0); print $1, $2, $3, $4, $5}' input.dat > output.csv
Input:
þNUMþþDATE CREATEDþþDATE SENTþþDATE MODIFIEDþþDATE RECEIVEDþ
þNUM00000001þþþþ9/11/2017 12:00 AMþþ6/16/2018 12:00 AMþþþ
þNUM00000002þþþþ5/2/2016 12:00 AMþþ6/16/2018 12:00 AMþþþ
Output:
"NUM","DATE CREATED","DATE SENT","DATE MODIFIED","DATE RECEIVED"
"NUM00000001","","9/11/2017 12:00 AM","6/16/2018 12:00 AM",""
"NUM00000002","","5/2/2016 12:00 AM","6/16/2018 12:00 AM",""
Desired output:
"NUM","DATE CREATED","CREATED TIME","DATE SENT","SENT TIME","DATE MODIFIED","MOD TIME","DATE RECEIVED","RECEIVED TIME"
"NUM00000001","","","9/11/2017","12:00:00 AM","6/16/2018","12:00:00 AM","",""
"NUM00000002","","","5/2/2016","12:00:00 AM","6/16/2018","12:00:00 AM","",""
Is it possible to add code for each field to perform the split? Note that the date/time can be NULL on some rows.
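Answer 0 (score: 0)
Based on the sample data, each datetime value needs to be split into a date and a time at the first space. You can do this with awk functions. For example: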
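awk '
# Return the date part: keep the opening quote, re-add the closing one.
function get_d(v,    sep) {
    sep = index(v, " ")
    if (!sep) return v              # empty field (""): pass it through unchanged
    return substr(v, 1, sep - 1) "\""
}
# Return the time part with :00 seconds inserted after the minutes.
function get_t(v,    sep, tt) {
    sep = index(v, " ")
    if (!sep) return "\"\""         # no time part: emit an empty quoted field
    tt = substr(v, sep + 1)         # e.g. 12:00 AM"
    sub(/:[0-9][0-9]/, "&:00", tt)  # 12:00 AM" becomes 12:00:00 AM"
    return "\"" tt
}
BEGIN { FS = "\024"; OFS = ","; ORS = "\n" }
{ gsub(/\376/, "\"", $0) }
# Header row: the column names contain spaces, so print the new header literally.
NR == 1 {
    print "\"NUM\",\"DATE CREATED\",\"CREATED TIME\",\"DATE SENT\",\"SENT TIME\",\"DATE MODIFIED\",\"MOD TIME\",\"DATE RECEIVED\",\"RECEIVED TIME\""
    next
}
# Field numbers follow the five-column sample ($1..$5); adjust them for a wider file.
{ print $1, get_d($2), get_t($2), get_d($3), get_t($3), get_d($4), get_t($4), get_d($5), get_t($5) }
' input.dat > output.csv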
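If the file has more date-time columns than the sample, listing every field by hand gets tedious. Below is a minimal sketch of a loop-based variant, under a few assumptions: every field after the first is a date-time field, the header names start with "DATE ", and the file uses single-byte þ (octal 376) quotes, which is why awk is run with LC_ALL=C. The function name split_dat is hypothetical. Note that the derived header for the third date column comes out as "MODIFIED TIME" rather than the "MOD TIME" shown in the desired output.

```shell
# Hypothetical helper: reads the DAT data on stdin, writes CSV on stdout.
split_dat() {
  LC_ALL=C awk '
    BEGIN { FS = "\024" }
    {
      gsub(/\376/, "")                  # drop the thorn quotes; fields are re-quoted below
      line = "\"" $1 "\""
      for (i = 2; i <= NF; i++) {
        d = $i; t = ""
        if (NR == 1) {
          # Header row: "DATE CREATED" yields a second column "CREATED TIME".
          t = d; sub(/^DATE /, "", t); t = t " TIME"
        } else if ((sep = index(d, " ")) > 0) {
          t = substr(d, sep + 1)        # time part, e.g. 12:00 AM
          sub(/:[0-9][0-9]/, "&:00", t) # insert :00 seconds after the minutes
          d = substr(d, 1, sep - 1)     # date part, e.g. 9/11/2017
        }
        # Empty fields fall through with d and t both empty.
        line = line ",\"" d "\",\"" t "\""
      }
      print line
    }'
}
```

Usage would be: split_dat < input.dat > output.csv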