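As part of my Data Structures course, my teacher gave me an extra exercise that is fairly difficult and challenging. I tried to work out which data structure to use for this problem but came up with nothing, and I also want to write the code myself to improve my Python skills.
About the exercise: 1. I have a text file with logs like the following: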
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
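There are two kinds of log: M is Master and S is Slave. I need a data structure that can split each line and capture its fields into specific columns. That is, the columns for M-1 would be: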
M, 1, Datetime, Error Level, DeviceId, UserId, Message
but the columns for S-1 would be:
S, 1, Datetime, Error Level, DeviceId, Action, Message
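Note: as you can see, S, 1 has Action but no UserId.
Finally, I need to be able to enter on the command line the columns I want written to standard output, together with a condition (e.g. error level > 50).
My first thought was a dictionary, but that way I would not be able to support an unlimited number of versions (if that is possible, please explain how).
Thanks!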
Answer 0 (score: 1)
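I would probably use the namedtuple factory from the collections module to hold each parsed item, because it lets you access each field both by index number and by name. In addition, new namedtuple classes can easily be created dynamically by passing a list of column names.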
from collections import namedtuple
Master = namedtuple('Master', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'UserName', 'Message'])
Slave = namedtuple('Slave', ['Type', 'N', 'Datetime', 'ErrorLevel', 'DeviceId', 'Action', 'Message'])
n_cols = 7
logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in"""
master_list = []
slave_list = []
for r in logfileasstring.splitlines(False):
    if not r:
        continue  # skip blank lines
    # Split at most n_cols - 1 times so commas inside the message are kept.
    values = [value.strip() for value in r.split(',', n_cols - 1)]
    if r[0] == 'M':
        master_list.append(Master(*values))
    else:
        slave_list.append(Slave(*values))
print(master_list[0][6]) # by index
print(master_list[0].Message) # by column name if name known in advance
column_name = 'Message'
print(getattr(master_list[0], column_name)) # by column name if name not known in advance
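The question also asks how to cope with an open-ended number of log versions. One possible way, sketched below, is to keep a dictionary keyed by (type, version) and build the namedtuple classes from it dynamically. The names `SCHEMAS`, `RECORD_TYPES`, and `parse_line` are hypothetical and introduced only for this illustration, and the sketch assumes every line has at least as many fields as its schema.

```python
from collections import namedtuple

# Hypothetical registry: (log type, version) -> list of column names.
# Supporting a new version only means adding one more entry here.
SCHEMAS = {
    ("M", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "UserName", "Message"],
    ("S", "1"): ["Type", "N", "Datetime", "ErrorLevel", "DeviceId", "Action", "Message"],
}

# Build one namedtuple class per schema entry.
RECORD_TYPES = {
    key: namedtuple(f"{key[0]}{key[1]}", columns)
    for key, columns in SCHEMAS.items()
}


def parse_line(line):
    """Parse one log line into the namedtuple registered for its type and version."""
    # The first two fields identify which schema applies.
    log_type, version, _rest = [part.strip() for part in line.split(",", 2)]
    record_type = RECORD_TYPES[(log_type, version)]
    n_cols = len(record_type._fields)
    # Limit the split so commas inside the trailing message are preserved.
    values = [value.strip() for value in line.split(",", n_cols - 1)]
    return record_type(*values)


lines = [
    "M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”",
    "S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in",
]
records = [parse_line(line) for line in lines]

# Example condition: error level greater than 50 (fields are parsed as strings).
severe = [r for r in records if int(r.ErrorLevel) > 50]
for r in severe:
    print(r.Datetime, r.Message)
```

With this layout a later version would probably need only one extra `SCHEMAS` entry, and the parsing loop itself would stay unchanged.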
Answer 1 (score: 0)
This may help:
logfileasstring = """
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system, and loading
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Open Connection”
M, 1, 14/08/2019 11:40, 100, xxxx, username, “Close Connection, and reboot”
S, 1, 14/08/2019 11:40, 6, xxxx, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, xxxx, User logged in, User username logged in
M, 1, 14/08/2019 11:39, 4, xxxx, username, “Initialization of the system”
S, 1, 14/08/2019 11:40, 6, New User, We created the user in the systems
S, 1, 14/08/2019 11:41, 3, User logged in, User username logged in"""
listoflist = [[v.strip() for v in r.split(",", maxsplit=6)]
              for r in logfileasstring.splitlines(keepends=False)
              if r]
grouped = {("M", "1"): [], ("S", "1"): []}
for row in listoflist:
    datasets_for = grouped[row[0], row[1]]
    datasets_for.append(row[2:])
# must be set by script
fields = [0, 1, 2]
for k in grouped:
    print(k, "::")
    for row in grouped[k]:
        print(" -", [row[f] for f in fields])
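The `fields` list above is hard-coded, while the question wants the columns and a condition to come from the command line. A minimal sketch of one way to do that with `argparse` follows. The option names `--columns` and `--where`, the `COLUMNS` mapping, and the `select` helper are all assumptions made for this example, and only integer comparisons are handled.

```python
import argparse
import operator

# Assumed column names per (type, version) for the rows stored in `grouped`
# (the first two fields form the key, so they are not repeated here).
COLUMNS = {
    ("M", "1"): ["Datetime", "ErrorLevel", "DeviceId", "UserId", "Message"],
    ("S", "1"): ["Datetime", "ErrorLevel", "DeviceId", "Action", "Message"],
}

OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def build_parser():
    parser = argparse.ArgumentParser(description="Print selected log columns.")
    parser.add_argument("--columns", nargs="+", required=True)
    parser.add_argument("--where", nargs=3, metavar=("COLUMN", "OP", "VALUE"))
    return parser


def select(grouped, columns, where=None):
    """Return (key, selected values) pairs for rows that satisfy the condition."""
    results = []
    for key, rows in grouped.items():
        names = COLUMNS[key]
        wanted = list(columns) + ([where[0]] if where else [])
        # Skip log types that lack one of the requested columns.
        if any(name not in names for name in wanted):
            continue
        for row in rows:
            if where:
                column, op, value = where
                # Only integer comparisons are supported in this sketch.
                if not OPERATORS[op](int(row[names.index(column)]), int(value)):
                    continue
            results.append((key, [row[names.index(c)] for c in columns]))
    return results


# Small stand-in for the `grouped` dictionary built above.
demo_grouped = {
    ("M", "1"): [
        ["14/08/2019 11:40", "100", "xxxx", "username", "“Open Connection”"],
        ["14/08/2019 11:39", "4", "xxxx", "username", "“Initialization of the system”"],
    ],
    ("S", "1"): [
        ["14/08/2019 11:41", "3", "xxxx", "User logged in", "User username logged in"],
    ],
}

# An explicit argument list is passed here; a real script would call parse_args().
args = build_parser().parse_args(
    ["--columns", "Datetime", "Message", "--where", "ErrorLevel", ">", "50"]
)
selected = select(demo_grouped, args.columns, args.where)
for key, values in selected:
    print(key, values)
```

When the script is run from a shell, the operator would need quoting (for example `--where ErrorLevel ">" 50`), otherwise the shell treats `>` as output redirection.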