I have a file, file.txt, which contains the following data:
24/9/2018 15:35:19.380 B63201C<br>
24/9/2018 15:35:22.350 ES0101C(hour_start)<br>
24/9/2018 15:36:13.231 Execute service next : 0003<br>
24/9/2018 15:38:13.664 Result of the execution 0003 Result: 0003<br>
24/9/2018 15:39:10.664 Executing the transaction PE20<br>
24/9/2018 15:35:26.773 ES0101C(hour_end)<br>
24/9/2018 15:36:12.164 B63201C<br>
- 1 bloque -<br>
24/9/2018 17:16:17.428 B63201C<br>
24/9/2018 17:16:29.031 ES0101C(hour_start)<br>
24/9/2018 17:16:13.231 Execute service next : 0003<br>
24/9/2018 17:18:13.664 Result of the execution 0003 Result: 0003<br>
24/9/2018 17:19:10.664 Executing the transaction BE15<br>
24/9/2018 17:25:26.773 ES0101C(hour_end)<br>
24/9/2018 17:26:12.164 B63201C<br>
- 2 bloque -<br>
I need to extract the data in CSV format with the following fields:
Date,Hour start,Hour end,B63201C-ES0101C,Transaction
In other words, the captured data would be:
> 24/9/2018,15:35:22.350,15:35:26.773,B63201C-ES0101C,PE20
> 24/9/2018,17:16:29.031,17:25:26.773,B63201C-ES0101C,BE15
Is there a way to do this in Bash or AWK?
Answer 0 (score: 0)
#!/usr/bin/awk -f
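# Remember the transaction code (last field) of the current block.
/Executing the transaction / {
    sub("<br>", "")
    transaction = $NF
}
# On hour_start: record the date, the start time, and the ID from the previous line.
/\(hour_start\)/ {
    date = $1
    hour_start = $2
    id1 = prev
}
# On hour_end: take the second ID from the text before "(" and print the CSV row.
/\(hour_end\)/ {
    hour_end = $2
    split($3, a, "(")
    id2 = a[1]
    printf "%s,%s,%s,%s-%s,%s\n", date, hour_start, hour_end, id1, id2, transaction
}
# For every line: strip the trailing <br> and keep the third field for the next line.
{
    sub("<br>", "")
    prev = $3
}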
The example code is in an executable file named script, and the example input is in a file named input:
$ ./script input
24/9/2018,15:35:22.350,15:35:26.773,B63201C-ES0101C,PE20
24/9/2018,17:16:29.031,17:25:26.773,B63201C-ES0101C,BE15
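Since the question also asks about Bash, the same logic can be sketched in plain shell. This is only a minimal sketch, assuming every log line has the layout shown in the question (date, time, then free text, with an optional trailing <br>). The function name extract_csv is hypothetical and is not part of the answer above.

```shell
# Sketch: shell equivalent of the awk script, reading log lines on stdin.
# extract_csv is a hypothetical name; the body runs in a subshell "( ... )"
# so its variables do not leak into the caller.
extract_csv() (
    prev="" id1="" date="" start="" txn=""
    # day = date, time = timestamp, rest = everything after the timestamp
    while read -r day time rest || [ -n "$day" ]; do
        rest=${rest%'<br>'}                  # drop the trailing <br> marker
        case $rest in
            *"(hour_start)")
                # The ID before the block comes from the previous line.
                date=$day; start=$time; id1=$prev ;;
            "Executing the transaction "*)
                txn=${rest##* } ;;           # last word is the transaction code
            *"(hour_end)")
                # ${rest%%'('*} keeps the ID that precedes "(hour_end)".
                printf '%s,%s,%s,%s-%s,%s\n' \
                    "$date" "$start" "$time" "$id1" "${rest%%'('*}" "$txn" ;;
        esac
        prev=${rest%% *}                     # first word after the timestamp
    done
)
```

It would be called as extract_csv < input. For large logs the awk version is likely to be noticeably faster, because a shell read loop processes the file one line at a time in the interpreter.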