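I have this sample log file, and I need to count the entries from the last month, the last three months and the last year. Here are a few lines from the log file: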
10/14/2015 10:04:25 AM Following file:<open file 'dirs/tmp/bundle_21241.dat.json', mode 'r' at 0x8b73498> has invalid json which is ignored
11/15/2015 10:42:53 PM Following file:<open file 'dirs/tmp/bundle_21241.dat.json', mode 'r' at 0xa314498> has invalid json which is ignored
10/21/2015 10:16:42 AM Following hmac:94e301ff67773de56194165451535ba223cd27588221363290fbfcb96d9d0539 with is already in database so dropping
11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:11:47+0000+ 12.61 0.430 1686.00
10/21/2015 10:16:42 AM Following hmac:c35330404902c0b1bb5c6d0718407ea12b25a464433bd1e69152ccc0e0b89c9f with is already in database so dropping
10/17/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:30:21+0000+ 12.61 0.010 1686.00
10/11/2015 10:16:42 AM Following hmac:8df71a9f6b6f0a0adb48c052767045f37ec34fce9c002a1c0c5ebc38ed500bf8 with is already in database so dropping
10/15/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:45:40+0000+ 12.61 0.018 1686.00
12/21/2015 10:16:42 AM Following hmac:fda9f5756461a8bc2922c55e75a31cf4915e6b0d016ecb786666624a0f04a02f with is already in database so dropping
12/10/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 20:01:01+0000+ 12.60 0.048 1686.00
07/21/2015 10:16:42 AM Following hmac:84d9cdb2145b7c3e0fa2d099070b7bd291c652f30ca25c69240e33ebbd2b8677 with is already in database so dropping
Here is my code:
from datetime import date
from datetime import time
from datetime import datetime
from datetime import timedelta
import os

def fileCount(fileName):
    with open(fileName) as FileObj:
        Count = 0
        today_date = date.today()
        One_Year = str(today_date - timedelta(days=365))
        One_Month = str(today_date - timedelta(days=30))
        Three_Months = str(today_date - timedelta(days=90))
        while True:
            line = FileObj.readline()
            record_date = ('-'.join(line[:10].split('/'))).split(" ")
            if not line:
                break
            if "Following hmac" in line:
                try:
                    convert_date = datetime.strptime(record_date[0], '%m-%d-%Y')
                    #print "Difference is ", todayDate - convert_date.date()
                    #print convert_date.date()
                    date_diff = str(today_date - convert_date.date())
                    #print dateDiff[:8]
                    if date_diff[:8] < One_Month:
                        Count += 1
                        #print "Last 30 Days Failed HMAC is ", Count
                    else:
                        continue
                    #print convert_date.date()
                except ValueError:
                    print 'This line has a problem:', record_date
    print "The Total Number of Failed HMAC is ", Count

# Call The function
def main():
    filePath = 'file.txt'
    fileCount(filePath)

if __name__ == "__main__":
    main()
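I'm new to programming and don't know much about date arithmetic. Right now I get an answer, but it doesn't seem to be the correct value. The goal is to go through each line and count the lines that fall within the last 30, 60 and 365 days. My code currently only contains the test for the last 30 days, but it gives me the wrong value.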
Answer 0 (score: 1)
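You need to convert everything to datetime objects before you can compare the items. Handling all the different ranges also gets easier if you define them in a list and use Python's Counter() to count them accordingly. That also makes it easy to add more ranges later.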
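To see why the conversion matters, here is a small sketch of what the comparison in the question actually does. The helper name `compare_both_ways` and the fixed reference date are assumptions made for illustration, so that the result does not depend on the day you run it. It is written with the `print()` function, so it should also run on Python 3.

```python
from datetime import date, datetime, timedelta

def compare_both_ways(record_text, today):
    """Return (string_result, date_result) for one MM/DD/YYYY record date."""
    record = datetime.strptime(record_text, "%m/%d/%Y").date()
    cutoff = today - timedelta(days=30)
    # The question's approach: both sides are strings, e.g.
    # "163 days" < "2015-12-01", so "<" compares character by character.
    string_result = str(today - record)[:8] < str(cutoff)
    # Comparing date objects asks the intended question:
    # is the record on or after the cutoff date?
    date_result = record >= cutoff
    return string_result, date_result

today = date(2015, 12, 31)  # assumed reference date, for reproducibility
for text in ("07/21/2015", "12/26/2015"):
    print("{}: string comparison -> {}, date comparison -> {}".format(
        text, *compare_both_ways(text, today)))
# 07/21/2015: string comparison -> True, date comparison -> False
# 12/26/2015: string comparison -> False, date comparison -> True
```

With strings, the outcome depends only on whether the first digit of the day count sorts before the first digit of the year, so a 163-day-old entry is counted and a 5-day-old one is not. Comparing the parsed dates directly avoids this, which is what the code below does.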
from datetime import datetime, timedelta
from collections import Counter

def fileCount(fileName):
    log_entry_counts = Counter()
    today = datetime.today()
    date_ranges = [
        ('three months', today - timedelta(days=90)),
        ('month', today - timedelta(days=30)),
        ('year', today - timedelta(days=365))]

    with open(fileName) as f_input:
        for line in f_input:
            if "Following hmac" in line:
                log_date = datetime.strptime(line[:10], '%m/%d/%Y')
                for text, dr in date_ranges:
                    if log_date >= dr:
                        log_entry_counts[text] += 1

    total = 0
    for text, count in log_entry_counts.items():
        print "Failed HMAC in the last {}: {}".format(text, count)
        total += count
    print "Total failed HMAC:", total

fileCount('input.txt')
This gives you output along these lines (the exact counts depend on the date you run it):
Failed HMAC in the last three months: 1
Failed HMAC in the last month: 1
Failed HMAC in the last year: 2
Total failed HMAC: 4
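Because the counts change with the current date, it can help to check the logic against a fixed reference date. The following is a minimal sketch of the same counting idea, not a replacement for the code above. The function name `count_hmac_entries`, the `today` parameter, the `DATE_RANGES` list and the in-memory sample lines (with the hashes replaced by a `<hash>` placeholder) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# (label, window length in days); the labels are illustrative.
DATE_RANGES = [("month", 30), ("three months", 90), ("year", 365)]

def count_hmac_entries(lines, today):
    """Count 'Following hmac' lines that fall inside each window before `today`."""
    counts = Counter()
    for line in lines:
        if "Following hmac" not in line:
            continue
        try:
            log_date = datetime.strptime(line[:10], "%m/%d/%Y")
        except ValueError:
            continue  # skip lines that do not start with an MM/DD/YYYY date
        for label, days in DATE_RANGES:
            if log_date >= today - timedelta(days=days):
                counts[label] += 1
    return counts

# Dates taken from the sample log; hashes shortened to a placeholder.
sample = [
    "10/21/2015 10:16:42 AM Following hmac:<hash> with is already in database so dropping",
    "11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF",
    "10/21/2015 10:16:42 AM Following hmac:<hash> with is already in database so dropping",
    "10/11/2015 10:16:42 AM Following hmac:<hash> with is already in database so dropping",
    "12/21/2015 10:16:42 AM Following hmac:<hash> with is already in database so dropping",
    "07/21/2015 10:16:42 AM Following hmac:<hash> with is already in database so dropping",
]

counts = count_hmac_entries(sample, datetime(2015, 12, 31))  # assumed "today"
for label, _ in DATE_RANGES:
    print("Failed HMAC in the last {}: {}".format(label, counts[label]))
# Failed HMAC in the last month: 1
# Failed HMAC in the last three months: 4
# Failed HMAC in the last year: 5
```

Note that the windows overlap: an entry from the last month is also counted under three months and under the year, so adding the three counts together counts some entries more than once.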