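I am using the code below to parse a server log in Python with pyparsing, and it raises an exception. The grammar seems correct, since it works for a single log line, so why am I seeing this exception? Any guidance would be appreciated!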
#!/bin/python
# import required modules
# (include the ones used later after defining grammar)
import string
from pyparsing import (alphas, nums, Combine, Word, Group,
                       delimitedList, Suppress, removeQuotes, alphanums)
test_data = """
Oct 31 06:26:51 os-test-rb dhclient[844]: DHCPACK of 192.168.14.6
from 192.168.14.2
"""
# define a function with the grammar
logLine = None
def getLog():
    global logLine
    if logLine is None:
        serverDateTime = Combine(Word(alphas) + Word(nums) +
                                 Word(nums) + ":" + Word(nums) + ":" + Word(nums))
        userName = Word(alphas+'-')
        clientName = Combine(Word(alphas) +"[" + Word(nums) + "]" + ":")
        message = Word(alphanums) + Word(alphas) + delimitedList( Word(nums), ".", combine=True ) + Word(alphas) + delimitedList( Word(nums), ".", combine=True )
        logLine = ( serverDateTime.setResultsName("timestamp") +
                    userName.setResultsName("username") +
                    clientName.setResultsName("client") +
                    message.setResultsName("Message from Server"))
    return logLine
# print out the log
for line in test_data:
    if not line: continue
    data = getLog().parseString(line)
    print(data.dump())
    print(data.asXML("LOG"))
The exception raised is:
Traceback (most recent call last):
  File "server_log_parser1.py", line 63, in <module>
    data = getLog().parseString(line)
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected W:(ABCD...) (at char 1), (line:2, col:1)
Answer 0 (score: 0)
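My first guess is that you are parsing lines that contain only whitespace, or perhaps only a trailing newline. Your if not line: continue filter will not catch these, so you end up passing an essentially empty string to pyparsing, which then complains that there is no leading date-time (more precisely, no leading month string made up of alphabetic characters). Change this line to: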
if not line.strip(): continue
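As a minimal sketch of that filter in isolation (no pyparsing needed), the snippet below keeps only the lines that contain something other than whitespace. The helper name nonblank_lines is hypothetical, and the sample log entry is joined onto a single line, which is an assumption about how the real log looks. It also assumes the log sits in one multi-line string: iterating over a str directly yields single characters rather than lines, so the sketch splits it with splitlines() first.

```python
test_data = """
Oct 31 06:26:51 os-test-rb dhclient[844]: DHCPACK of 192.168.14.6 from 192.168.14.2
   
"""

def nonblank_lines(text):
    """Yield each line of text that holds more than whitespace (hypothetical helper)."""
    # splitlines() yields whole lines; looping over the string itself
    # would yield one character at a time.
    for line in text.splitlines():
        # "\n", "   " and "" all strip down to the empty string, which is falsy.
        if not line.strip():
            continue
        yield line

lines = list(nonblank_lines(test_data))
print(lines)
```

Only the DHCPACK entry survives the filter; the leading empty line and the whitespace-only line are skipped before anything reaches the parser.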
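In addition, I encourage you to drop asXML() in favor of dump(). asXML() makes some guesses that I have never been comfortable with, and I generally dislike that interface, so it has been deprecated and will be removed in the next minor release. dump() does a better job of listing unnamed tokens, named tokens, and nested lists. Also, expr.runTests() is a better way to diagnose where the parser goes astray.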
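To illustrate runTests() and dump(), here is a small sketch. The grammar is a simplified rewrite for illustration and not the asker's original: every field is matched separately so that whitespace between fields is skipped, and the result names (month, day, time, host, process, pid, message) are assumptions. It uses the camelCase method names from the question; newer pyparsing releases also offer snake_case spellings such as run_tests.

```python
from pyparsing import (Combine, Group, ParseException, Suppress, Word,
                       alphanums, alphas, nums)

# Illustrative grammar (an assumption, not the original from the question).
month = Word(alphas)("month")
day = Word(nums)("day")
clock = Combine(Word(nums) + ":" + Word(nums) + ":" + Word(nums))("time")
host = Word(alphanums + "-")("host")
process = (Word(alphas)("process") + Suppress("[") +
           Word(nums)("pid") + Suppress("]:"))
# Dotted quad: four runs of digits separated by "." with no spaces.
ip_addr = Combine(Word(nums) + ("." + Word(nums)) * 3)
message = Group(Word(alphanums) + Word(alphas) + ip_addr +
                Word(alphas) + ip_addr)("message")
log_line = month + day + clock + host + process + message

# runTests() parses each non-blank, non-comment line, prints the tokens or
# the location of the failure, and returns (success, results).
success, results = log_line.runTests("""
    # well-formed entry
    Oct 31 06:26:51 os-test-rb dhclient[844]: DHCPACK of 192.168.14.6 from 192.168.14.2
""")

# Each entry in results is a (test_string, ParseResults_or_exception) pair.
parsed = results[0][1]
print(parsed.dump())

# A whitespace-only string is rejected when passed straight to parseString().
blank_rejected = False
try:
    log_line.parseString("\n")
except ParseException as err:
    blank_rejected = True
    print("blank input rejected:", err)
```

runTests() skips blank lines in its input by itself, so it is convenient for checking the grammar against sample entries, while the final try/except shows the kind of failure reported in the question.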