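Parsing from a log file in Python

Time: 2017-07-12 17:57:06

Tags: json regex python-3.x parsing logging

I have a log file containing an arbitrary number of lines and JSON strings. I need to extract only one piece of JSON data from the log file, and only the one that comes right after '_____GP D_____'. I don't want any other lines or JSON data from the file.

This is what my input file looks like: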

INFO:modules.gp.helpers.parameter_getter:_____GP D_____
{'from_time': '2017-07-12 19:57', 'to_time': '2017-07-12 20:57', 'consig_number': 'dup1', 'text': 'r155', 'mobile': None, 'email': None}
ERROR:modules.common.actionexception:ActionError: [{'other': 'your request already crossed threshold time'}]
{'from_time': '2016-07-12 16:57', 'to_time': '2016-07-12 22:57', 'consig_number': 'dup2', 'text': 'r15', 'mobile': None, 'email': None}

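How do I find the JSON string after '_____GP D_____'?

2 answers:

Answer 0: (score: 0)

You can read your file line by line until you encounter _____GP D_____ at the end of a line, and then pick just the next line: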

found_json = None
with open("input.log", "r") as f:  # open your log file
    for line in f:  # read it line by line
        if line.rstrip().endswith("_____GP D_____"):  # if a line ends with our marker...
            found_json = next(f, "").rstrip()  # grab the next line ("" if the marker was the last line)
            break  # stop reading the file, nothing more of interest

Then you can do whatever you want with found_json, including parsing, printing, and so on.
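One caveat when parsing: the sample lines in the question use single quotes and `None`, so they are Python dict literals rather than strict JSON, and `json.loads` would reject them. A minimal sketch, assuming the captured line looks like the sample in the question, is to parse it with `ast.literal_eval` and re-serialize it with `json.dumps` if real JSON is needed. The sample string is copied from the question and the variable name `record` is illustrative:

```python
import ast
import json

# Sample captured line, copied from the question's log excerpt.
found_json = (
    "{'from_time': '2017-07-12 19:57', 'to_time': '2017-07-12 20:57', "
    "'consig_number': 'dup1', 'text': 'r155', 'mobile': None, 'email': None}"
)

# json.loads(found_json) would raise json.JSONDecodeError here, because the
# line uses single quotes and None instead of double quotes and null.
record = ast.literal_eval(found_json)  # evaluates Python literals only

print(record["consig_number"])
print(json.dumps(record))  # valid JSON: double quotes, None becomes null
```

`ast.literal_eval` only accepts literals (strings, numbers, dicts, lists, `None`, and so on), so it is a safer choice than `eval` for text read from a log.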

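Update: if you want to keep 'following' your log file (similar to the tail -f command), you can open it in read mode, keep the file handle open, and read it line by line with a reasonable delay between reads (this is largely what tail -f does as well). You can then use the same procedure to detect when the desired line occurs and capture the next line to process it, send it to some other process, or do whatever else you plan to do with it. Something like: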

import time

capture = False  # a flag to use to signal the capture of the next line
found_lines = []  # a list to store our found lines, just as an example
with open("input.log", "r") as f:  # open the file for reading...
    while True:  # loop indefinitely
        line = f.readline()  # grab a line from the file
        if line != '':  # if there is some content on the current line...
            if capture:  # capture the current line
                found_lines.append(line.rstrip())  # store the found line
                # instead, you can do whatever you want with the captured line
                # i.e. to print it: print("Found: {}".format(line.rstrip()))
                capture = False  # reset the capture flag
            elif line.rstrip().endswith("_____GP D_____"):  # if it ends in '_____GP D_____'...
                capture = True  # signal that the next line should be captured
        else:  # an empty buffer encountered, most probably EOF...
            time.sleep(1)  # ... let's wait for a second before attempting to read again...

Answer 1: (score: 0)

import json
from ast import literal_eval
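The two imports suggest parsing the captured line with `ast.literal_eval` and emitting it as JSON. The following is a sketch under that assumption, not this answer's original code. The function name `extract_after_marker` and the `MARKER` constant are hypothetical, and the sample lines are taken from the question:

```python
import json
from ast import literal_eval

MARKER = "_____GP D_____"  # marker text as it appears in the question's log


def extract_after_marker(lines, marker=MARKER):
    """Return the dict on the first line that follows a line ending with marker.

    Returns None if the marker is never seen or nothing follows it.
    literal_eval raises ValueError or SyntaxError if that line is not a literal.
    """
    lines = iter(lines)
    for line in lines:
        if line.rstrip().endswith(marker):
            following = next(lines, None)  # the line right after the marker
            if following is None:
                return None
            return literal_eval(following.strip())
    return None


sample_log = [
    "INFO:modules.gp.helpers.parameter_getter:_____GP D_____\n",
    "{'from_time': '2017-07-12 19:57', 'to_time': '2017-07-12 20:57', "
    "'consig_number': 'dup1', 'text': 'r155', 'mobile': None, 'email': None}\n",
    "ERROR:modules.common.actionexception:ActionError: "
    "[{'other': 'your request already crossed threshold time'}]\n",
    "{'from_time': '2016-07-12 16:57', 'to_time': '2016-07-12 22:57', "
    "'consig_number': 'dup2', 'text': 'r15', 'mobile': None, 'email': None}\n",
]

record = extract_after_marker(sample_log)
print(json.dumps(record, indent=2))  # only the record right after the marker
```

With a real file, the open file object can be passed directly, for example `with open("input.log") as f: record = extract_after_marker(f)`.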
