I have a file containing multiple lines, and I want to extract the first three words of each line.
str = []
str = [
Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: 192.168.28.12 - 200 - "{\x0A \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime-ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x22fd2dfcdc-ade2-11e6-8404-0242ac110003\x22,\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22\x0A },\x0A \x22stats\x22: [\x0A {\x0A \x22timestamp\x22: \x222017-02-16T23:29:57+0000\x22,\x0A \x22software_id\x22: \x22A-ACTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A }\x0A ]\x0A}"
Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: 192.168.28.12 - 200 - "{\x0A \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime-ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x22fd2dfcdc-ade2-11e6-8404-0242ac110003\x22,\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22\x0A },\x0A \x22stats\x22: [\x0A {\x0A \x22timestamp\x22: \x222017-02-16T23:29:57+0000\x22,\x0A \x22software_id\x22: \x22A-ACTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A }\x0A ]\x0A}"
Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: 192.168.28.12 - 200 - "{\x0A \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime-ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x22fd2dfcdc-ade2-11e6-8404-0242ac110003\x22,\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22\x0A },\x0A \x22stats\x22: [\x0A {\x0A \x22timestamp\x22: \x222017-02-16T23:29:57+0000\x22,\x0A \x22software_id\x22: \x22A-ACTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A }\x0A ]\x0A}"
Feb 17 07:10:07 afg-prod-web1 journal: afg-prod-web1 statistics: 192.168.28.12 - 200 - "{\x0A \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime-ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x22fd2dfcdc-ade2-11e6-8404-0242ac110003\x22,\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22\x0A },\x0A \x22stats\x22: [\x0A {\x0A \x22timestamp\x22: \x222017-02-16T23:29:57+0000\x22,\x0A \x22software_id\x22: \x22A-ACTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A }\x0A ]\x0A}"]
I want to extract the date, i.e. Feb 17 07:10:07 from each line, and put it into an array.
I tried using a for loop, but it fails with this error:
IndexError: list index out of range
The code I tried:
for i in splitdata:
    abc = splitdata[logcount]
    aa = abc.split()
    if(aa[0] == "Feb"):
        aaa = "".join([aa[0],' ',aa[1],' ',aa[2]])
        logtime.append(aaa)
        logcount += 2
    else:
        pass
print logtime
Answer 0 (score: 0)
If your log is saved in a file named log.log, you can get the dates as follows:
with open('log.log') as f:
    log_time = []
    for line in f:
        log_time.append(line[:15])
print(log_time)
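The fixed-width slice above relies on every line starting with a 15-character syslog-style timestamp. Below is a minimal sketch of the same idea that also skips lines that do not have that shape. The function name extract_timestamps and the regular expression are illustrative assumptions, not part of the original answer, and the sample lines are shortened versions of the ones in the question.

```python
import re

# Assumed shape of the leading timestamp: "Mon DD HH:MM:SS" (15 characters).
# This pattern is an illustration, not something defined by the original answer.
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}")


def extract_timestamps(lines):
    """Return the leading timestamp of every line that starts with one."""
    timestamps = []
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:  # blank or malformed lines are skipped
            timestamps.append(match.group(0))
    return timestamps


# Shortened sample lines; the trailing "..." stands in for the rest of the entry.
sample_lines = [
    "Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: ...",
    "",
    "not a log line",
    "Feb 17 07:10:07 afg-prod-web1 journal: afg-prod-web1 statistics: ...",
]
print(extract_timestamps(sample_lines))
```

With a real file, the open handle can be passed in directly, for example extract_timestamps(f) inside the with block, because a file object iterates over its lines.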
Answer 1 (score: 0)
You only need to check len() of the split string to avoid this kind of error. There is plenty of room to improve the code.
In [1]: sample_text = """Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: 192.168.28.12 - 200 - "{\x0A
...: \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime
...: -ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \
...: x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x22fd2dfcdc-ade2-11e6-8404-0242a
...: c110003\x22,\x0A \x22device_id\x22: \x222f504f5ed3c64934\x22\x0A },\x0A \x22stats\x22: [\x0A
...: {\x0A \x22timestamp\x22: \x222017-02-16T23:29:57+0000\x22,\x0A \x22software_id\x22: \x22A-A
...: CTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A
...: \x22device_id\x22: \x222f504f5ed3c64934\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A
...: }\x0A ]\x0A}"""
In [2]: def get_time_from_log(log_text):
...:     log_text_split = log_text.split(" ")
...:     if len(log_text_split) < 3:
...:         pass
...:     elif log_text_split[0] == "Feb":
...:         return " ".join(log_text_split[0:3])
...:
In [3]: get_time_from_log(sample_text)
Out[3]: 'Feb 17 07:10:07'
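The length check can be applied to every line of the input in turn. The sketch below iterates over the lines directly instead of keeping a separate index. A manually advanced counter such as logcount += 2 in the question can run past the end of the list, which is one plausible source of the IndexError, assuming logcount starts at 0 (its initial value is not shown in the question). The names first_three_words and collect_times are hypothetical, and the check for "Feb" is dropped here on the assumption that any line with at least three words is wanted.

```python
def first_three_words(line):
    """Return the first three whitespace-separated words, or None if fewer exist."""
    words = line.split()
    if len(words) < 3:  # guard that avoids IndexError on blank or short lines
        return None
    return " ".join(words[:3])


def collect_times(lines):
    """Apply first_three_words to each line, skipping lines that yield None."""
    results = []
    for line in lines:  # iterate directly; there is no manual index to overrun
        value = first_three_words(line)
        if value is not None:
            results.append(value)
    return results


# Shortened sample lines; the trailing "..." stands in for the rest of the entry.
lines = [
    "Feb 17 07:10:07 afg-prod-web2 journal: afg-prod-web2 statistics: ...",
    "",
    "Feb 17 07:10:07 afg-prod-web1 journal: afg-prod-web1 statistics: ...",
]
print(collect_times(lines))
```

Note that split() with no argument collapses runs of whitespace, unlike split(" "), so a space-padded day such as "Feb  7" would come back as "Feb 7".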