Question

I have several log files with the following structure:

Sep  9 12:42:15 apollo sshd[25203]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=189.26.255.11 

Sep  9 12:42:15 apollo sshd[25203]: pam_succeed_if(sshd:auth): error retrieving information about user ftpuser

Sep  9 12:42:17 apollo sshd[25203]: Failed password for invalid user ftpuser from 189.26.255.11 port 44061 ssh2

Sep  9 12:42:17 apollo sshd[25204]: Received disconnect from 189.26.255.11: 11: Bye Bye

Sep  9 19:12:46 apollo sshd[30349]: Did not receive identification string from 199.19.112.130

Sep 10 03:29:48 apollo unix_chkpwd[4549]: password check failed for user (root)

Sep 10 03:29:48 apollo sshd[4546]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=221.12.29.170  user=root

Sep 10 03:29:51 apollo sshd[4546]: Failed password for root from 221.12.29.170 port 56907 ssh2

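There are more dates and times, but this is just a sample. I would like to know how to calculate the total time the file covers. I have tried a few things over about five hours without success.

I tried this first. It came close, but it doesn't work the way I want: it keeps repeating dates: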

with open(filename, 'r') as file1:
        lines = file1.readlines()
        for line in lines:
            linelist = line.split()
            date2 = int(linelist[1])
            time2 = linelist[2]
            print linelist[0], linelist[1], linelist[2]
            if date1 == 0:
                date1 = date2
                dates.append(linelist[0] + ' ' + str(linelist[1]))
            if date1 < date2:
                date1 = date2
                ttimes.append(datetime.strptime(str(ltime1), FMT) - datetime.strptime(str(time1), FMT))
                time1 = '23:59:59'
                ltime1 = '00:00:00'
                dates.append(linelist[0] + ' ' + str(linelist[1]))
            if time2 < time1:
                time1 = time2
            if time2 > ltime1:
                ltime1 = time2

Answer 1

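If the entries are in chronological order, you only need to look at the first and last entries:

# `lines` is assumed to hold the whole file as a single string
entries = lines.strip().split("\n")  # strip() avoids an empty trailing entry

first_date = entries[0].split("apollo")[0].strip()
last_date = entries[-1].split("apollo")[0].strip()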
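
To turn those two timestamp strings into an actual duration, they can be parsed and subtracted. The following is only a minimal sketch, assuming the entries are in chronological order, fall within a single year, and each start with a syslog-style `Mon DD HH:MM:SS` timestamp. The helper names `parse_stamp` and `log_span` are made up for illustration, and the year 2000 is an arbitrary placeholder (chosen because it is a leap year, so Feb 29 still parses). Note that `%b` depends on the locale; English month abbreviations are assumed.

```python
from datetime import datetime

def parse_stamp(line, year=2000):
    # The first three whitespace-separated fields are month, day and time.
    month, day, clock = line.split()[:3]
    # The log carries no year, so an assumed one is prepended.
    return datetime.strptime("{} {} {} {}".format(year, month, day, clock),
                             "%Y %b %d %H:%M:%S")

def log_span(text):
    # Ignore blank lines, then compare only the first and last entries.
    entries = [l for l in text.splitlines() if l.strip()]
    return parse_stamp(entries[-1]) - parse_stamp(entries[0])
```

Calling `log_span(open(filename).read())` would return a `timedelta` for the whole file; this works in both Python 2.7 and Python 3.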

Answer 2

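We don't have the year, so I used the current year. Read all the lines, convert the month name to its index, and parse each date.

Then sort the list (so it works even if the log entries are out of order), take the first and last items, and subtract. Enjoy.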


Code:

from datetime import datetime

months = ["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]
current_year = datetime.now().year

dates = list()
with open(filename, 'r') as file1:
    for line in file1:
        linelist = line.split()
        if linelist:  # filter out possible empty lines
            linelist[0] = str(months.index(linelist[0]))  # convert 3-letter months to index
            # compose "month day time year" and parse it
            z = datetime.strptime(" ".join(linelist[0:3]) + " " + str(current_year), "%m %d %H:%M:%S %Y")
            dates.append(z)  # store in list

dates.sort()  # sort the list
first_date = dates[0]
last_date = dates[-1]

# print report & compute time span
print("start {}, end {}, time span {}".format(first_date,last_date,last_date-first_date))

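Note that because the year is missing, this does not work correctly when the log spans December 31 to January 1. I suppose we could make a guess: if both January and December appear in the log, assume the January entries belong to the following year. This is not supported yet.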
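
The December-to-January guess could be sketched roughly as follows. This is one possible heuristic and not part of the code above: unlike the sorted version, it assumes the entries are already in chronological order, and it bumps the year whenever the month number goes backwards. It cannot detect a gap that skips forward to a later month of the next year. The name `parse_with_rollover` and the `start_year` parameter are hypothetical.

```python
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def parse_with_rollover(lines, start_year):
    """Parse syslog-style lines, advancing the year when the month decreases."""
    year = start_year
    prev_month = None
    dates = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or too-short lines
        month = MONTHS.index(fields[0]) + 1
        if prev_month is not None and month < prev_month:
            year += 1  # e.g. Dec followed by Jan: assume a new year started
        prev_month = month
        hour, minute, second = (int(p) for p in fields[2].split(":"))
        dates.append(datetime(year, month, int(fields[1]), hour, minute, second))
    return dates
```

The time span is then `dates[-1] - dates[0]`, as before.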

How do I calculate the total time covered by a log file in Python 2.7?
