I'm running a Linux service that writes JSON-formatted logs under /var/log. The log files grow almost continuously. The service I'm using has no database connector or wrapper to ship the logs straight into a database, so I have to parse and send them with a service of my own.
What is a good way to continuously parse the file and upload the new lines to a db?
Edit: I don't want to use anything related to the ELK stack.
Thanks!
Answer 0 (score: 2)
To read the file the way the Unix tail command does, I wrote a small script:
logtodb.py
import json
import os
import time


def tail(stream_file):
    """Read a file like the Unix command `tail -f`.
    Based on https://stackoverflow.com/questions/44895527/reading-infinite-stream-tail"""
    stream_file.seek(0, os.SEEK_END)  # Go to the end of the file
    while not stream_file.closed:
        line = stream_file.readline()
        if not line:
            time.sleep(0.1)  # No new line yet: wait instead of busy-looping
            continue
        yield line


def log_to_db(log_path, db):
    """Read the log (one JSON object per line) and insert each record into the db."""
    with open(log_path, "r") as log_file:
        for line in tail(log_file):
            try:
                log_data = json.loads(line)
            except ValueError:
                # Bad JSON, maybe a corrupted or partial line...
                continue  # Read the next line
            # Do what you want with the data, e.g.:
            # db.execute("INSERT INTO ...", log_data["level"], ...)
            print(log_data["message"])
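The db argument is left abstract above, since you didn't say which database you use. As a minimal sketch of what the insert step could look like, assuming SQLite through Python's standard-library sqlite3 module (the table layout and the open_db/insert_log helpers here are hypothetical):

import sqlite3


def open_db(path="logs.db"):
    """Open (or create) a SQLite database with a simple log table.
    Hypothetical schema: adjust the columns to your real log fields."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS logs (level TEXT, message TEXT)")
    return conn


def insert_log(conn, log_data):
    """Insert one parsed JSON record and commit it."""
    conn.execute("INSERT INTO logs (level, message) VALUES (?, ?)",
                 (log_data.get("level"), log_data.get("message")))
    conn.commit()

With that, the print(...) line in log_to_db would become insert_log(db, log_data), and the caller would pass db=open_db().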
Test file:
test_logtodb.py
import json
import random
import threading
import time

import logtodb


def generate_test_json_log(log_path):
    """Append a random JSON log line to the file every 500 ms."""
    with open(log_path, "w") as log_file:
        while True:
            log_data = {
                "level": "ERROR" if random.random() > 0.5 else "WARNING",
                "message": "The program exit with the code '{0}'".format(
                    int(random.random() * 200)),
            }
            log_file.write("{0}\n".format(
                json.dumps(log_data, ensure_ascii=False)))
            log_file.flush()
            time.sleep(0.5)  # Sleep 500 ms


if __name__ == "__main__":
    log_path = "my-log.json"
    generator = threading.Thread(
        target=generate_test_json_log, args=(log_path,))
    generator.start()
    time.sleep(0.1)  # Give the generator time to create the file
    logtodb.log_to_db(log_path, db=None)
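To try it out, run python test_logtodb.py: the generator thread appends a fake JSON line every 500 ms, while log_to_db tails the same file and prints each parsed message. Stop it with Ctrl+C, since the generator thread runs forever.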
I assume the log file looks like this:
{"level": "ERROR", "message": "The program exit with the code '181'"}
{"level": "WARNING", "message": "The program exit with the code '51'"}
{"level": "ERROR", "message": "The program exit with the code '69'"}
If that format isn't right, I can help you update my script.