I am working with log files. I need to read a file line by line for a fixed amount of time, say 10 seconds. Is there a way to do this in Python? Can anyone help?
Answer 0 (score: 1)
Run tail or tac with Popen and iterate over the output until you reach the line where you want to stop. Here is a sample snippet.
import csv
import datetime as dt
import sys
from subprocess import PIPE, Popen

filename = '/var/log/nginx/access.log'

# Command to read the file from the end
cmd = ['tail', '-r', filename] if sys.platform == 'darwin' else ['tac', filename]
# To read it from the beginning instead, use the following
# cmd = ['cat', filename]

# universal_newlines=True makes stdout yield text lines, which csv.reader needs on Python 3
proc = Popen(cmd, close_fds=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
output = proc.stdout

# Field names of your log format, in column order
FORMAT = [
    # 'foo',
    # 'bar',
]

def extract_log_data(line):
    '''Extract the data from a parsed log line and normalize it.'''
    return dict(zip(FORMAT, line))

csv.register_dialect('nginx', delimiter=' ', quoting=csv.QUOTE_MINIMAL)
lines = csv.reader(output, dialect='nginx')

started_at = dt.datetime.utcnow()
for line in lines:
    data = extract_log_data(line)
    print(data)
    # Stop once 10 seconds have elapsed
    if (dt.datetime.utcnow() - started_at) >= dt.timedelta(seconds=10):
        break

output.close()
proc.terminate()
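If spawning tail or tac is not required, the same time limit can be applied in pure Python. The following is only a minimal sketch: the helper name read_lines_for is hypothetical, and it assumes the file can be read front to back. Like the loop above, it checks the clock only between lines, so a single blocking read can still overrun the limit.

```python
import time

def read_lines_for(path, max_seconds):
    """Yield lines from path until the file ends or max_seconds have elapsed."""
    # time.monotonic() is not affected by system clock adjustments
    deadline = time.monotonic() + max_seconds
    with open(path, "r") as f:
        for line in f:
            # The deadline is checked before each line is handed out
            if time.monotonic() >= deadline:
                break
            yield line.rstrip("\n")
```

Usage would look like `for line in read_lines_for(filename, 10): ...`, with the per-line processing done inside the loop body.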
Answer 1 (score: 1)
from multiprocessing import Process
import time

def read_file(path):
    try:
        # open the file for reading
        f = open(path, "r")
        try:
            for line in f:
                # do something
                pass
        # always close the file when leaving the try block
        finally:
            f.close()
    except IOError:
        print("Failed to open/read from file '%s'" % path)

def read_file_limited_time(path, max_seconds):
    # init Process
    p = Process(target=read_file, args=(path,))
    # start process
    p.start()
    # for max seconds
    for i in range(max_seconds):
        # sleep for 1 second (you may change the sleep time to suit your needs)
        time.sleep(1)
        # if process is not alive, we can break the loop
        if not p.is_alive():
            break
    # if process is still alive after max_seconds, kill it!
    if p.is_alive():
        p.terminate()
    # wait for the child process to exit so it is cleaned up
    p.join()

def main():
    path = "f1.txt"
    read_file_limited_time(path, 10)

if __name__ == "__main__":
    main()
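The only reason we "wake up" every second and check whether the process we started is still alive is to avoid sleeping on after the process has finished. If the process ends after 1 second, sleeping for the full 10 seconds would waste 9 seconds.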
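The same goal can also be reached without a polling loop, because Process.join accepts a timeout and returns as soon as the child exits. This is a sketch of that alternative, not the code from the answer above; the helper name run_with_timeout and its return value are assumptions made for illustration.

```python
from multiprocessing import Process

def run_with_timeout(target, args, max_seconds):
    """Run target(*args) in a child process for at most max_seconds.

    Returns True if the child finished on its own, False if it was terminated.
    """
    p = Process(target=target, args=args)
    p.start()
    # join() returns when the child exits or when the timeout expires,
    # whichever comes first, so no time is spent sleeping needlessly
    p.join(max_seconds)
    if p.is_alive():
        p.terminate()
        p.join()  # wait for the terminated child to be cleaned up
        return False
    return True
```

With the read_file function from the answer, the call would be `run_with_timeout(read_file, ("f1.txt",), 10)`, made from under an `if __name__ == "__main__":` guard as in the original.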