Efficient parsing with pygrok

Time: 2018-06-19 17:38:11

Tags: python concurrent.futures

Is there an efficient way to parse a given log line against multiple grok patterns using pygrok?

Here is the code I am testing.

from concurrent.futures import ProcessPoolExecutor, as_completed
from pygrok import Grok

patterns = [
    '^%{COMMONAPACHELOG}$',
    '^%{COMBINEDAPACHELOG}$',
    '^%{HTTPD_ERRORLOG}$',
]

text = '37.162.60.195 - - [06/Jun/2018:17:31:29 -0400] "PUT /app/main/posts HTTP/1.0" 200 5055 "http://harris-johnson.com/main/register/" "Mozilla/5.0 (Windows NT 5.0) AppleWebKit/5311 (KHTML, like Gecko) Chrome/13.0.850.0 Safari/5311"'
# text= '::1 - - [26/Dec/2016:16:16:29 +0200] "GET /favicon.ico HTTP/1.1" 404 209'
# text = '[Mon Dec 26 16:22:08 2016] [error] [client 192.168.33.1] File does not exist: /var/www/favicon.ico'

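def grok_parser(pattern, text):
    # Returns the match result, which is falsy when the pattern does not match.
    return Grok(pattern).match(text)

def parse_type(ia):
    # Try every pattern in its own process and return the first successful match.
    with ProcessPoolExecutor(max_workers=len(patterns)) as executor:
        futures = [executor.submit(grok_parser, pattern, ia) for pattern in patterns]
        for future in as_completed(futures):
            result = future.result()
            if result:  # skip patterns that did not match instead of returning None early
                return result
    return None

# The guard is required on platforms where worker processes re-import this module.
if __name__ == '__main__':
    print(parse_type(text))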
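
For comparison, one alternative is to build each matcher once and try them in order within a single process, instead of constructing a matcher and a process pool for every line. The following is only a minimal sketch of that idea, not a confirmed pygrok recipe. To stay self-contained it uses the standard `re` module with simplified stand-in regexes; the names `first_match`, `regex_matcher` and `MATCHERS`, and the regexes themselves, are hypothetical and are not the real grok definitions. With pygrok, the assumption is that each matcher could instead be `Grok(pattern).match`, created once at startup.

```python
import re
from typing import Callable, Iterable, Optional, Tuple

# A matcher takes a log line and returns a dict of fields, or None on no match.
Matcher = Callable[[str], Optional[dict]]


def first_match(matchers: Iterable[Tuple[str, Matcher]],
                line: str) -> Optional[Tuple[str, dict]]:
    """Try each pre-built matcher in order; return (name, fields) for the first hit."""
    for name, match in matchers:
        fields = match(line)
        if fields:
            return name, fields
    return None


def regex_matcher(pattern: str) -> Matcher:
    """Compile the pattern once and return a function that matches a line against it."""
    compiled = re.compile(pattern)

    def match(line: str) -> Optional[dict]:
        m = compiled.match(line)
        return m.groupdict() if m else None

    return match


# Simplified, hypothetical stand-ins for the three grok patterns in the question.
# Assumption: with pygrok, each entry could be ("name", Grok(pattern).match).
_ACCESS = (r'^(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
           r'"(?P<request>[^"]*)" (?P<response>\d{3}) (?P<bytes>\S+)')
MATCHERS = [
    # The more specific pattern goes first so a combined line is not cut short.
    ("combined", regex_matcher(_ACCESS + r' "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$')),
    ("common", regex_matcher(_ACCESS + r'$')),
    ("error", regex_matcher(r'^\[(?P<timestamp>[^\]]+)\] \[(?P<loglevel>\w+)\] '
                            r'\[client (?P<clientip>[^\]]+)\] (?P<message>.*)$')),
]

if __name__ == '__main__':
    sample = '::1 - - [26/Dec/2016:16:16:29 +0200] "GET /favicon.ico HTTP/1.1" 404 209'
    print(first_match(MATCHERS, sample))
```

Whether this is faster than the process pool depends on how expensive matcher construction is compared with matching itself, so it is worth measuring on real log data before settling on either approach.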

0 answers:

No answers