I plan to run Flask under gunicorn on Kubernetes. To get logging right, I want to output all logs as JSON.
At the moment I am testing with minikube and https://github.com/inovex/kubernetes-logging so that fluentd collects the logs.
I managed to get the error logs (tracebacks) formatted correctly thanks to the following: JSON formatted logging with Flask and gunicorn
I am still struggling with the access log format. I specified the following gunicorn access log format:
access_log_format = '{"remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}'
The resulting log is JSON formatted, but the message part (built from access_log_format) now contains escaped double quotes and is not parsed into its individual fields by fluentd / ELK:
{"tags": [], "timestamp": "2017-12-07T11:50:20.362559Z", "level": "INFO", "host": "ubuntu", "path": "/usr/local/lib/python2.7/dist-packages/gunicorn/glogging.py", "message": "{\"remote_ip\":\"127.0.0.1\",\"request_id\":\"-\",\"response_code\":\"200\",\"request_method\":\"GET\",\"request_path\":\"/v1/records\",\"request_querystring\":\"\",\"request_timetaken\":\"19040\",\"response_length\":\"20\"}", "logger": "gunicorn.access"}
Regards, JPW
Answer 0 (score: 1)
The simplest solution is to change the outer single quotes to double quotes and the inner double quotes to single quotes, as shown below.
--access-logformat "{'remote_ip':'%(h)s','request_id':'%({X-Request-Id}i)s','response_code':'%(s)s','request_method':'%(m)s','request_path':'%(U)s','request_querystring':'%(q)s','request_timetaken':'%(D)s','response_length':'%(B)s'}"
以下是示例日志
{'remote_ip':'127.0.0.1','request_id':'-','response_code':'404','request_method':'GET','request_path':'/test','request_querystring':'','request_timetaken':'6642','response_length':'233'}
{'remote_ip':'127.0.0.1','request_id':'-','response_code':'200','request_method':'GET','request_path':'/','request_querystring':'','request_timetaken':'881','response_length':'20'}
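The same idea can also be expressed in a Gunicorn config file rather than on the command line; a minimal sketch, assuming a file such as gunicorn.conf.py (note that the resulting lines use single quotes, so a strict JSON parser downstream may need to be configured accordingly):
# gunicorn.conf.py -- config-file equivalent of the --access-logformat flag above
accesslog = "-"  # send the access log to stdout
access_log_format = (
    "{'remote_ip':'%(h)s','request_id':'%({X-Request-Id}i)s',"
    "'response_code':'%(s)s','request_method':'%(m)s',"
    "'request_path':'%(U)s','request_querystring':'%(q)s',"
    "'request_timetaken':'%(D)s','response_length':'%(B)s'}"
)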
Answer 1 (score: 0)
You can escape the double quotes with \" directly in the value of --access-logformat to keep the log as valid JSON.
So if you run Gunicorn in a Docker container, your Dockerfile might end with something like this:
CMD ["gunicorn", \
"-b", "0.0.0.0:5000", \
"--access-logfile", "-",\
"--access-logformat", "{\"remote_ip\":\"%(h)s\",\"request_id\":\"%({X-Request-Id}i)s\",\"response_code\":\"%(s)s\",\"request_method\":\"%(m)s\",\"request_path\":\"%(U)s\",\"request_querystring\":\"%(q)s\",\"request_timetaken\":\"%(D)s\",\"response_length\":\"%(B)s\"}", \
"app:create_app()"]
You can find the rest of the Gunicorn logging options here.
Answer 2 (score: 0)
It has been 2 years and I assume the fluent Python logger has changed; the problem I am running into now is slightly different, and every Google search points back to this discussion.
When I use the example from the question in the gunicorn config file,
access_log_format = '{"remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}'
I get the desired behavior of it being read as JSON and merged with the fluentd JSON data, but the gunicorn fields are not populated:
{"tags": [], "level": "INFO", "host": "ubuntu", "logger": "gunicorn.access", "remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}
This seems to be because Gunicorn passes access_log_format to the logger as the message and passes all of the parameters (safe_atoms) as extra arguments:
safe_atoms = self.atoms_wrapper_class(
    self.atoms(resp, req, environ, request_time)
)
try:
    # safe_atoms = {"s": "200", "m": "GET", ...}
    self.access_log.info(self.cfg.access_log_format, safe_atoms)
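This is the ordinary %-style merge performed by the stdlib logging module; a minimal sketch outside Gunicorn, with a made-up format string and values:
import logging

logging.basicConfig(format="%(message)s", level=logging.INFO)

# Gunicorn effectively makes the equivalent of this call: the access log
# format is the message, and the atoms dict is the single %-style argument.
logging.getLogger("demo").info(
    '{"response_code":"%(s)s","request_method":"%(m)s"}',
    {"s": "200", "m": "GET"},
)
# A stdlib formatter merges the dict in via msg % args and emits:
# {"response_code":"200","request_method":"GET"}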
However, if FluentRecordFormatter sees that string as valid JSON, it reads it with json.loads but ignores all of the arguments that were passed:
def _format_msg_json(self, record, msg):
    try:
        json_msg = json.loads(str(msg))  # <------- doesn't merge params
        if isinstance(json_msg, dict):
            return json_msg
        else:
            return self._format_msg_default(record, msg)
    except ValueError:
        return self._format_msg_default(record, msg)
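That is why the %(h)s-style placeholders show up unfilled in the log line above; a minimal illustration with a made-up format string:
import json

# json.loads parses the raw format string and never sees the args dict,
# so the %()s placeholders survive unfilled into the structured output.
msg = '{"response_code":"%(s)s","request_method":"%(m)s"}'
print(json.loads(msg))
# {'response_code': '%(s)s', 'request_method': '%(m)s'}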
Compare this with the default Python formatter, which calls record.message = record.getMessage(), which in turn merges the arguments into the message:
def getMessage(self):
    """
    Return the message for this LogRecord.

    Return the message for this LogRecord after merging any user-supplied
    arguments with the message.
    """
    msg = str(self.msg)
    if self.args:
        msg = msg % self.args  # <------ args get merged in
    return msg
I have logged an issue with the fluent-logger-python project.
As a workaround, I used a logging filter to perform the merge before the record is passed to FluentRecordFormatter:
import logging
from fluent import handler

logger = logging.getLogger('fluent.test')

class ContextFilter(logging.Filter):
    def filter(self, record):
        # Merge the %-style args into the message before the formatter runs
        record.msg = record.msg % record.args
        return True

fluent_handler = handler.FluentHandler('app.follow', host='localhost', port=24224)
formatter = handler.FluentRecordFormatter()
fluent_handler.setFormatter(formatter)
merge_filter = ContextFilter()
fluent_handler.addFilter(merge_filter)
logger.addHandler(fluent_handler)
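A usage sketch with made-up values, mirroring the Gunicorn call shown earlier (and assuming a fluentd instance listening on localhost:24224):
logger.setLevel(logging.INFO)

# The filter merges the atoms dict into the message before
# FluentRecordFormatter tries to json.loads it.
logger.info('{"response_code":"%(s)s","request_path":"%(U)s"}',
            {"s": "200", "U": "/v1/records"})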
After applying the logging-filter workaround, I started getting errors like
ValueError: unsupported format character ';' (0x3b) at index 166
It turns out that FluentRecordFormatter does call the base getMessage implementation, which merges the arguments into the message:
def format(self, record):
    # Compute attributes handled by parent class.
    super(FluentRecordFormatter, self).format(record)  # <------ record.message = record.msg % record.args
    # Add ours
    record.hostname = self.hostname
    # Apply format
    data = self._formatter(record)
    self._structuring(data, record)
    return data
The problem is that _format_msg_json(self, record, msg) uses the record.msg attribute, which holds the unmerged data, while record.message holds the merged data. That creates a conflict: my logging filter merges/formats the data, but the log formatter then tries to do the same thing again and occasionally runs into invalid format syntax.
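A minimal reproduction of that double merge with made-up values (the exact failing character depends on the data, e.g. percent-encoded query strings):
msg = '{"request_querystring":"%(q)s"}'
args = {"q": "a=%20;b=1"}   # merged value happens to contain '%20;'

once = msg % args    # the filter merges: {"request_querystring":"a=%20;b=1"}
twice = once % args  # the formatter merges again and trips over '%20;':
                     # ValueError: unsupported format character ';' (0x3b) at index ...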
I have given up entirely on emitting JSON from gunicorn / Python logging. Instead, I use a Fluentd parser to parse the log message, for example:
<filter *.gunicorn.access>
  @type parser
  key_name message
  reserve_time true
  reserve_data true
  remove_key_name_field true
  hash_value_field access_log
  <parse>
    @type regexp
    expression /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"$/
    time_format %d/%b/%Y:%H:%M:%S %z
  </parse>
</filter>
You can read about what these options do here: https://docs.fluentd.org/filter/parser