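This is a snippet of a log file that goes on with a large number of updates on jobs in a database. I'm trying to figure out how long each job takes, and to spot outliers.
I'd like to come up with a script that collects the first and last progress update (or sqffstatus) for each job.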
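Matching every line for a given job with a regex is easy, but that of course returns all of them, and in many cases we get an update every 10 seconds for a job that may run for hours. This is further complicated by the fact that many jobs are interleaved. So far, the best I can come up with is to hand-write a script that loops over every possible job number, extracts all of its records, and then picks the first and last update for each job.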
In the real logs, we get updates from many different IP addresses, but the /sqff/[num]/progress syntax stays the same.
There must be a better way.
10.251.210.21 - - [14/Nov/2018:05:17:19 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/sqffstatus HTTP/1.1" 200 514
10.251.210.21 - - [14/Nov/2018:05:17:23 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:24 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/progress HTTP/1.1" 200 562
10.251.210.21 - - [14/Nov/2018:05:17:24 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/sqffstatus HTTP/1.1" 200 514
10.251.210.21 - - [14/Nov/2018:05:17:28 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:29 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/progress HTTP/1.1" 200 562
10.251.210.21 - - [14/Nov/2018:05:17:29 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/sqffstatus HTTP/1.1" 200 514
10.251.210.21 - - [14/Nov/2018:05:17:33 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:34 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/progress HTTP/1.1" 200 562
10.251.210.21 - - [14/Nov/2018:05:17:34 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/sqffstatus HTTP/1.1" 200 514
10.251.210.21 - - [14/Nov/2018:05:17:38 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:43 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:48 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:53 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:17:58 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:18:03 +0000] "POST //10.251.210.21:8080/fileflowqueue/ffq/sqff/acquire_next HTTP/1.1" 204 0
10.251.210.21 - - [14/Nov/2018:05:18:04 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/progress HTTP/1.1" 200 562
10.251.210.21 - - [14/Nov/2018:05:18:04 +0000] "PUT //10.251.210.21:8080/fileflowqueue/ffq/sqff/22/sqffstatus HTTP/1.1" 200 514
Any ideas on how to do this with awk / grep?
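For illustration, here is a rough single-pass sketch of what I'm after, written in Python rather than awk. It is only a sketch under a few assumptions: the lines are in chronological order (so the last line seen for a job is its latest update), the month abbreviations parse under an English/C locale, and the function names `job_spans` and `durations` are made up for this example. It keeps one (first, last) timestamp pair per job number, so interleaved jobs and different client IPs should not matter.

```python
import re
from datetime import datetime

# Capture the bracketed timestamp and the job number from PUT requests
# ending in /sqff/<num>/progress or /sqff/<num>/sqffstatus.
# The client IP and host part of the URL are deliberately not matched.
LINE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "PUT \S*/sqff/(?P<job>\d+)/(?:progress|sqffstatus) '
)

# Timestamp layout used in the log, e.g. 14/Nov/2018:05:17:19 +0000.
# Assumption: %b is parsed with English month names.
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def job_spans(lines):
    """Return {job_number: (first_timestamp, last_timestamp)} in one pass.

    Assumes the lines arrive in chronological order.
    """
    spans = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # acquire_next polls and anything else are skipped
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
        job = match.group("job")
        # Keep the first timestamp ever seen, overwrite the last one.
        first, _ = spans.get(job, (ts, ts))
        spans[job] = (first, ts)
    return spans


def durations(spans):
    """Return (job_number, seconds) pairs, longest-running job first."""
    return sorted(
        ((job, (last - first).total_seconds())
         for job, (first, last) in spans.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

Something like `durations(job_spans(open("access.log")))` (the file name is a placeholder) would then list the jobs by elapsed time, which should make the outliers stand out at the top.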