Creating a script to analyze a log file

Date: 2020-06-12 16:16:57

Tags: python bash parsing logging socket.io

I am trying to analyze a log file created by socket.io:

2020-06-12 14:40 +02:00: * 2020-06-12T12:40:44.728Z +   connect viewer: xxxxxx room: e4c60 viewers actuel [ e4c60: 370, '44c0d': 1 ] socket.id: /viewers#qnm6nJtDSSVA2N-oAAO0
...
2020-06-12 15:51 +02:00: * 2020-06-12T13:51:39.889Z - disconnect viewer: xxxxxx room: e4c60 viewers actuel [ e4c60: 26, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#qnm6nJtDSSVA2N-oAAO0
...
2020-06-12 15:51 +02:00: * 2020-06-12T13:51:46.978Z +   connect viewer: vvvvvvv room: e4c60 viewers actuel [ e4c60: 27, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#w03eaaUVq6mL2SzPAAS1
...
2020-06-12 15:58 +02:00: * 2020-06-12T13:58:01.377Z - disconnect viewer: vvvvvvv room: e4c60 viewers actuel [ e4c60: 23, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#w03eaaUVq6mL2SzPAAS1

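What I want is the time each socket stayed connected and, in the end, the average.

So I need to capture the date and the "socket.id" value on connect, then find the disconnect with the same socket.id and its date, and finally compute the difference between the two dates in seconds. The room name also matters, because in the end I need the results per room.

I need to do this for every entry in the log file and then average all the differences, in seconds.

Any idea how to do this easily in any language (bash, python ...) is welcome. Thanks.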

1 Answer:

Answer 0 (score: 0)

This Python code should work.

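from datetime import datetime

# socket_id -> time of the connect event that has not been closed yet
connect_times = {}
diff_list = []

with open('test_data.txt', 'r') as infile:
    for line in infile:
        fields = line.split()
        # Skip blank lines and anything that is not a socket event
        if len(fields) < 5 or '#' not in line:
            continue
        date = datetime.strptime(fields[4], '%Y-%m-%dT%H:%M:%S.%fZ')
        socket_id = line.split('#')[1].strip()
        # Test for 'disconnect' first: 'connect' is a substring of it
        if 'disconnect' in line:
            start = connect_times.pop(socket_id, None)
            if start is None:
                continue  # disconnect without a matching connect
            diff = (date - start).total_seconds()
            diff_list.append(diff)
            print(f'Socket ID: {socket_id} closed after {diff} seconds.')
        else:
            connect_times[socket_id] = date

# Avoid dividing by zero when no connect/disconnect pair was found
if diff_list:
    print(f'Average connected time: {sum(diff_list)/len(diff_list)} seconds.')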

The output for the data you provided looks like this:

Socket ID: qnm6nJtDSSVA2N-oAAO0 closed after 4255.161 seconds.
Socket ID: w03eaaUVq6mL2SzPAAS1 closed after 374.399 seconds.
Average connected time: 2314.78 seconds.

I'm sure there are plenty of ways to do this better, but if the data is consistent with the sample you posted, this should work for you.
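
The question also asks for results per room, which a single overall average does not give. Below is a minimal sketch of one way to get that, assuming every event line follows the layout of the sample in the question. The regular expression and the names `LINE_RE`, `average_seconds_per_room` and `sample_lines` are my own, chosen for illustration; they are not part of socket.io.

```python
import re
from collections import defaultdict
from datetime import datetime

# Pattern guessed from the sample lines in the question (an assumption):
# "* <UTC timestamp> <+|-> <connect|disconnect> viewer: ... room: <room> ... socket.id: /viewers#<id>"
LINE_RE = re.compile(
    r'\* (?P<ts>\S+Z) [+-]\s+(?P<event>connect|disconnect) viewer: .*? '
    r'room: (?P<room>\S+) .*socket\.id: \S*#(?P<sid>\S+)'
)


def average_seconds_per_room(lines):
    """Return {room: average connected time in seconds} for paired events."""
    pending = {}                   # socket id -> (room, connect time)
    durations = defaultdict(list)  # room -> durations in seconds
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a socket event line, skip it
        when = datetime.strptime(match['ts'], '%Y-%m-%dT%H:%M:%S.%fZ')
        sid = match['sid']
        if match['event'] == 'connect':
            pending[sid] = (match['room'], when)
        elif sid in pending:
            # Disconnect: pair it with the stored connect of the same socket
            room, started = pending.pop(sid)
            durations[room].append((when - started).total_seconds())
    return {room: sum(v) / len(v) for room, v in durations.items()}


# The four sample lines from the question
sample_lines = [
    "2020-06-12 14:40 +02:00: * 2020-06-12T12:40:44.728Z +   connect viewer: xxxxxx room: e4c60 viewers actuel [ e4c60: 370, '44c0d': 1 ] socket.id: /viewers#qnm6nJtDSSVA2N-oAAO0",
    "2020-06-12 15:51 +02:00: * 2020-06-12T13:51:39.889Z - disconnect viewer: xxxxxx room: e4c60 viewers actuel [ e4c60: 26, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#qnm6nJtDSSVA2N-oAAO0",
    "2020-06-12 15:51 +02:00: * 2020-06-12T13:51:46.978Z +   connect viewer: vvvvvvv room: e4c60 viewers actuel [ e4c60: 27, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#w03eaaUVq6mL2SzPAAS1",
    "2020-06-12 15:58 +02:00: * 2020-06-12T13:58:01.377Z - disconnect viewer: vvvvvvv room: e4c60 viewers actuel [ e4c60: 23, e3fa1: 3, '44c0d': 1 ] socket.id: /viewers#w03eaaUVq6mL2SzPAAS1",
]

for room, avg in average_seconds_per_room(sample_lines).items():
    print(f'Room {room}: average connected time {avg:.3f} seconds.')
```

Both sample sessions are in room e4c60, so the only entry is for that room and its average is about 2314.78 seconds, the same as the overall average above. To run it on a real log, pass the open file object instead of `sample_lines`. The room is taken from the connect line, on the assumption that a socket stays in the same room for the whole session.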