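My Python is pretty rusty, and I'd like to know whether there is a better or more efficient way to write this script.
The script takes a .txt log and replaces ' ' with ',' (and ':' with ',' on lines that start with '#') to produce a .csv, which makes the log easier to read.
Any suggestions or comments would be greatly appreciated. Thanks.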
import sys
import os
import datetime
t = datetime.datetime.now() ## current local time (datetime.now() does not return UTC)
ts = t.strftime("%Y_%m_%d_ %H_%M_%S") # Date format
if len(sys.argv) != 2: # if CLi does not equal to 2 commands print
    print ("usage:progammename.py logname.ext")
    sys.exit(1)
logSys = sys.argv[1]
newLogSys = sys.argv[1] + "_" + ts +".csv"
log = open(logSys,"r")
nL = file(newLogSys ,"w")
# Read from file log and write to nLog file
for lineI in log.readlines():
    rec=lineI.rstrip()
    if rec.startswith("#"):
        lineI=rec.replace(':',',').strip()
        nL.write(lineI + "\n")
    else:
        lineO=rec.replace(' ',',').strip() #
        nL.write(lineO + "\n")
## closes both open files; End script
nL.close()
log.close()
=====Sample log========
#Date: 2008-04-18 15:41:16
#Fields: date time time-taken c-ip cs-username cs-auth-group x-exception-id sc-filter-result cs-categories cs(Referer) sc-status s-action cs-method rs(Content-Type) cs-uri-scheme cs-host cs-uri-port cs-uri-path cs-uri-query cs-uri-extension cs(User-Agent) s-ip sc-bytes cs-bytes x-virus-id
2012-02-02 16:19:01 14 xxx.xxx.xxx.xxx user domain\group dns_unresolved_hostname DENIED "Games" - 404 TCP_ERR_MISS POST - http updaterservice.wildtangent.com 80 /appupdate/appcheckin.wss - wss "Mozilla/4.0 (compatible; MSIE 8.0; Win32)" xxx.xxx.xxx.xxx 824 697 -
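Answer 0 (score: 3)
- There is no need to call readlines to iterate. A plain for lineI in log iterates over all lines without reading the whole file into memory.
- rstrip removes the newline, but you then add it right back.
- The purpose of strip is unclear, especially since you have already replaced all spaces with commas.
Answer 1 (score: 2)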
I would shorten your code to:
import sys
import os
from time import strftime
if len(sys.argv) != 2: # if CLi does not equal to 2 commands print
    print ("usage:progammename.py logname.ext")
    sys.exit(1)
logSys = sys.argv[1]
newLogSys = "%s_%s.csv" % (logSys,strftime("%Y_%m_%d_ %H_%M_%S"))
with open(logSys,'rb') as log, open(newLogSys,'wb') as nL:
    nL.writelines(lineI.replace(':' if lineI[0]=='#' else ' ', ',')
                  for lineI in log)
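I still don't understand what you mean about another line (that is, a '\n') being added, except on the lines that start with '#'.
I ran the following code on your example and did not observe what you describe. Sorry, but I can't propose a fix for a problem I can't reproduce.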
from time import strftime
import re
ss = ('--||  ||:|||:||--||| \r\n'
      '#10 23:30 abcdef : \r\n'
      '802 12:25 xyz  :  \r\n'
      '\r\n'
      '#:35 11:18+14:39 sunny vale : sunny sea\r\n'
      '  651454451 drh:hdb 54:1\r\n'
      '    \r\n'
      ': 541514 oi:npvert654165:8\r\n'
      '#5415:v541564zervt\r\n'
      '# :: \r\n'
      '#::: :::\r\n'
      ' E\r\n')
regx = re.compile('(\r?\n(?!$))|(\r?\n$)')
def smartdispl(com,smth,regx = regx):
    print '\n%s\n%s\n%s' %\
          ('{0:{fill}{align}70}'.format(' %s ' % com,fill='=',align='^'),
           '\n'.join(repr(el) for el in smth.splitlines(1)),
           '{0:{fill}{align}70}'.format('',fill='=',align='^'))
logSys = 'poiu.txt'
with open(logSys,'wb') as f:
    f.write(ss)
with open(logSys,'rb') as f:
    smartdispl('content of the file '+logSys,f.read())
newLogSys = "%s_%s.csv" % (logSys,strftime("%Y_%m_%d_ %H_%M_%S"))
with open(logSys,'rb') as log, open(newLogSys,'wb') as nL:
    nL.writelines(lineI.replace(':' if lineI[0]=='#' else ' ', ',')
                  for lineI in log)
with open(newLogSys,'rb') as f:
    smartdispl('content of the file '+newLogSys,f.read())
Result
==================== content of the file poiu.txt ====================
'--||  ||:|||:||--||| \r\n'
'#10 23:30 abcdef : \r\n'
'802 12:25 xyz  :  \r\n'
'\r\n'
'#:35 11:18+14:39 sunny vale : sunny sea\r\n'
'  651454451 drh:hdb 54:1\r\n'
'    \r\n'
': 541514 oi:npvert654165:8\r\n'
'#5415:v541564zervt\r\n'
'# :: \r\n'
'#::: :::\r\n'
' E\r\n'
======================================================================
======= content of the file poiu.txt_2012_02_07_ 00_48_55.csv ========
'--||,,||:|||:||--|||,\r\n'
'#10 23,30 abcdef , \r\n'
'802,12:25,xyz,,:,,\r\n'
'\r\n'
'#,35 11,18+14,39 sunny vale , sunny sea\r\n'
',,651454451,drh:hdb,54:1\r\n'
',,,,\r\n'
':,541514,oi:npvert654165:8\r\n'
'#5415,v541564zervt\r\n'
'# ,, \r\n'
'#,,, ,,,\r\n'
',E\r\n'
======================================================================
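For readers on Python 3, the one-pass conversion from this answer could be sketched roughly as follows. This is an adaptation, not the answerer's code: the function name convert_log and its dst_path parameter are hypothetical, and text mode with newline="" is assumed so that str.replace works on text while '\r\n' line endings pass through unchanged, as they do in the 'rb'/'wb' version above.

```python
from time import strftime


def convert_log(src_path, dst_path=None):
    """Write a comma-separated copy of src_path and return the new file's path.

    Hypothetical helper: the name and the optional dst_path are assumptions.
    """
    if dst_path is None:
        # Same naming scheme as the answer: <log name>_<timestamp>.csv
        dst_path = "%s_%s.csv" % (src_path, strftime("%Y_%m_%d_ %H_%M_%S"))
    # newline="" disables newline translation on both files, so each line
    # keeps whatever ending it had in the source log.
    with open(src_path, "r", newline="") as log, \
            open(dst_path, "w", newline="") as out:
        out.writelines(
            # '#' lines: ':' becomes ','; all other lines: ' ' becomes ','
            line.replace(":" if line.startswith("#") else " ", ",")
            for line in log
        )
    return dst_path
```

It could be called as convert_log(sys.argv[1]) after the same argument check as in the answer.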
Answer 2 (score: 1)
Using @larsmans' suggestions and removing the code duplication in the writing part:
# Read from file log and write to nLog file
for line in log:
    if line.startswith("#"):
        line = line.replace(':',',')
    else:
        line = line.replace(' ',',')
    nL.write(line)
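The loop above can also be pulled into a small function so the per-line rule is easy to check on its own. This is only a sketch under assumptions: it targets Python 3, the name convert_line is hypothetical, and io.StringIO objects stand in for the real log and output files.

```python
import io


def convert_line(line):
    """Return one log line with its separators replaced by commas."""
    # The trailing newline is never stripped, so it never has to be re-added.
    if line.startswith("#"):
        return line.replace(":", ",")
    return line.replace(" ", ",")


# In-memory stand-ins for the opened files (assumption for the example)
log = io.StringIO("#Date: 2008-04-18 15:41:16\n2012-02-02 16:19:01 14\n")
out = io.StringIO()
for line in log:  # iterates lazily, one line at a time
    out.write(convert_line(line))

result = out.getvalue()
# result == '#Date, 2008-04-18 15,41,16\n2012-02-02,16:19:01,14\n'
```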
Answer 3 (score: 0)
If you want succinctness, try this version:
for line in log:
    # strip the line ending first: split(':') keeps it, and write() adds one back
    if line[0] == '#': line = ','.join(line.rstrip('\r\n').split(':'))
    else: line = ','.join(line.split())
    nL.write(line + '\n')
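Note that both replace(' ', ',') and split() also break on the spaces inside quoted fields, such as the User-Agent value in the sample log. One possible alternative, which none of the answers propose, is to let the standard csv module parse the space-delimited lines. The sketch below assumes Python 3; the function name space_log_to_csv is hypothetical, and '#' lines are handled the same way as in the question.

```python
import csv
import io


def space_log_to_csv(lines, out):
    """Copy log lines to out as CSV, keeping double-quoted fields in one column."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#"):
            # Header lines: same rule as the question, ':' becomes ','
            out.write(line.replace(":", ",") + "\n")
        else:
            # Parse one space-delimited record; quoted text stays together
            for row in csv.reader([line], delimiter=" ", quotechar='"'):
                writer.writerow(row)


sample = [
    "#Date: 2008-04-18 15:41:16\n",
    '1 "Games" - "Mozilla/4.0 (compatible; MSIE 8.0; Win32)" 824\n',
]
buf = io.StringIO()
space_log_to_csv(sample, buf)
csv_text = buf.getvalue()
```

With this approach the User-Agent string ends up in a single column instead of being split at every space. The writer would also quote any field that itself contains a comma.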