I am using Python to open a large log file, such as:
Thu Oct 4 23:14:40 2012 [pid 16901] CONNECT: Client "66.249.74.228"
Thu Oct 4 23:14:40 2012 [pid 16900] [ftp] OK LOGIN: Client "66.249.74.228", anon password "googlebot@google.com"
Thu Oct 4 23:17:42 2012 [pid 16902] [ftp] FAIL DOWNLOAD: Client "66.249.74.228", "/pub/10.5524/100001_101000/100039/Assembly-2011/Pa9a_assembly_config4.scafSeq.gz", 14811136 bytes, 79.99Kbyte/sec
Fri Oct 5 00:04:13 2012 [pid 25809] CONNECT: Client "66.249.74.228"
Fri Oct 5 00:04:14 2012 [pid 25808] [ftp] OK LOGIN: Client "66.249.74.228", anon password "googlebot@google.com"
Fri Oct 5 00:07:16 2012 [pid 25810] [ftp] FAIL DOWNLOAD: Client "66.249.74.228", "/pub/10.5524/100001_101000/100027/Raw_data/PHOlcpDABDWABPE/090715_I80_FC427DJAAXX_L8_PHOlcpDABDWABPE_1.fq.gz", 14811136 bytes, 79.99Kbyte/sec
Fri Oct 5 00:13:19 2012 [pid 27354] CONNECT: Client "1.202.186.53"
Fri Oct 5 00:13:19 2012 [pid 27353] [ftp] OK LOGIN: Client "1.202.186.53", anon password "mozilla@example.com"
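I want to read lines from the end of the file, the way the tail command does, to get the records from the last 7 days.
This is my code. How should I change it?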
import time

f = open("/opt/CLiMB/Storage1/log/vsftp.log")

def OnlyRecent(line):
    # True when the timestamp before the first "[" is within the last 7 days
    if time.strptime(line.split("[")[0].strip(), "%a %b %d %H:%M:%S %Y") > time.gmtime(time.time() - (60*60*24*7)):
        return True
    return False

filename = time.strftime('%Y%m%d') + '.log'
f1 = open(filename, 'w')
for line in f:
    if OnlyRecent(line):
        print(line)
        f1.write(line)
f.close()
f1.close()
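Answer 0 (score: 3)
Use file.seek() to jump to an offset relative to the end of the file. For example, to print roughly the last 1 KB (1000 bytes) of the file without reading the beginning, do the following: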
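import os

with open("/opt/CLiMB/Storage1/log/vsftp.log", "rb") as f:  # binary mode: Python 3 rejects end-relative seeks in text mode
    f.seek(-1000, os.SEEK_END)  # raises an error if the file is shorter than 1000 bytes
    print(f.read().decode("utf-8", errors="replace"))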
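A fixed byte offset usually lands in the middle of a line, and a negative seek fails on files shorter than the offset. Here is a minimal sketch of one way to handle both cases. The function name `read_tail_lines` and the `max_bytes` parameter are made up for illustration, and it assumes the log is UTF-8 (or ASCII) text:

```python
import os

def read_tail_lines(path, max_bytes=1000):
    """Return the complete lines found in the last max_bytes bytes of path."""
    with open(path, "rb") as f:               # binary mode, so byte offsets are exact
        size = os.fstat(f.fileno()).st_size
        start = max(size - max_bytes, 0)      # clamp so short files do not raise
        # Read from one byte earlier: the first piece is then either the rest
        # of a partial line or a lone newline, and can always be discarded.
        f.seek(max(start - 1, 0))
        data = f.read()
    lines = data.splitlines(True)             # keep the line endings
    if start > 0:
        lines = lines[1:]
    return [line.decode("utf-8", errors="replace") for line in lines]
```

Something like `read_tail_lines("/opt/CLiMB/Storage1/log/vsftp.log", 64 * 1024)` would then give only whole lines, which can be passed to a timestamp filter. The right window size for 7 days of records depends on how busy the server is, so it has to be guessed or grown until the first returned line is old enough.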
Answer 1 (score: 0)
I haven't tested this; I only restructured your code:
- dropwhile instead of for ... if
- with contexts to open and close the files
from time import time, gmtime, strptime, strftime
from itertools import dropwhile

deadline = gmtime(time() - (60*60*24*7))
formatting = "%a %b %d %H:%M:%S %Y"

def not_recent(line):
    # True while the line's timestamp is at or before the 7-day deadline
    return strptime(line.split("[")[0].strip(), formatting) <= deadline

with open("/opt/CLiMB/Storage1/log/vsftp.log") as f:
    filename = strftime('%Y%m%d') + '.log'
    with open(filename, 'w') as f1:
        # dropwhile skips the leading old lines, then passes on everything after them
        for line in dropwhile(not_recent, f):
            print(line)
            f1.write(line)
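The same idea can be written with datetime objects, which makes the 7-day arithmetic explicit and lets the cutoff be passed in for testing. This is only a sketch: `parse_timestamp` and `recent_lines` are illustrative names, and it assumes every line starts with a timestamp in the format shown in the question, that the log is in chronological order, and that the English day and month abbreviations match the current locale.

```python
from datetime import datetime, timedelta
from itertools import dropwhile

LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"   # e.g. "Thu Oct 4 23:14:40 2012"

def parse_timestamp(line):
    """Parse the timestamp that precedes the first '[' of a log line."""
    return datetime.strptime(line.split("[")[0].strip(), LOG_TIME_FORMAT)

def recent_lines(lines, now, days=7):
    """Return an iterator that skips leading lines at or before now - days."""
    cutoff = now - timedelta(days=days)
    # dropwhile stops testing after the first recent line, so later lines
    # are passed through without being parsed.
    return dropwhile(lambda line: parse_timestamp(line) <= cutoff, lines)
```

With an open file `f`, `recent_lines(f, datetime.now())` would be iterated the same way as the loop above. Note that this still reads the file from the beginning; it only avoids parsing timestamps after the first recent line.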
Answer 2 (score: 0)
Another implementation, given that you are dealing with huge log files:
import os

def tail(fname, n):
    """Return a binary file object positioned at the start of the last n lines."""
    fin = open(fname, 'rb')                 # Binary mode so byte offsets are exact
    size = os.fstat(fin.fileno()).st_size   # Get the size from the stat
    fin.seek(size)                          # Seek to the end of the file
    count = 0
    while count < n:                        # Loop until n newlines have been counted
        pos = fin.tell() - 2                # Step backward one byte
        if pos < 0:                         # Reached the beginning of the file
            fin.seek(0)                     # Fewer than n lines: return the whole file
            break
        fin.seek(pos)
        if fin.read(1) == b'\n':            # Check whether this byte is a newline
            count += 1                      # Maintain the count
    return fin

Usage

for e in tail("Test.log", 10):
    print(e.decode().rstrip("\n"))
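Since the goal in the question is "the last 7 days" rather than a fixed number of lines, one possible variation is to walk backward from the end in blocks and stop at the first line that is too old. The sketch below is untested against a real vsftpd log. `lines_reversed` and `recent_lines_from_end` are hypothetical names, and it assumes the log is chronological, UTF-8 compatible, uses "\n" line endings, and has a timestamp at the start of every non-empty line.

```python
import os
from datetime import datetime, timedelta

LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

def lines_reversed(path, block_size=4096):
    """Yield the non-empty lines of path as bytes, last line first."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""                       # line fragment whose start lies in an earlier block
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            parts = (f.read(step) + tail).split(b"\n")
            tail = parts[0]              # may be incomplete, so keep it for the next block
            for part in reversed(parts[1:]):
                if part:                 # skip blank lines and the piece after the final newline
                    yield part
        if tail:
            yield tail                   # first line of the file

def recent_lines_from_end(path, now, days=7):
    """Return, in file order, the trailing lines newer than now - days."""
    cutoff = now - timedelta(days=days)
    recent = []
    for raw in lines_reversed(path):
        line = raw.decode("utf-8", errors="replace")
        stamp = datetime.strptime(line.split("[")[0].strip(), LOG_TIME_FORMAT)
        if stamp <= cutoff:              # everything earlier in the file is older still
            break
        recent.append(line)
    recent.reverse()
    return recent
```

The returned lines have no trailing newline, so add "\n" back when writing them to the output file. Only the part of the file that covers the last 7 days is read, which is the main advantage over scanning from the start.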