I have data like this:
2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url
2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow
2017/06/07 10:43:36,THREAT,end,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,block-url
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,allow
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url
How can I parse this so that I get only columns 4, 5, 7, and 12 from every row?
Here is my code:
import csv
file=open('filename.log', 'r')
f=open('fileoutput', 'w')
lines = file.readlines()
for line in lines:
    result.append(line.split(' ')[4,5,7,12])
    f.write (line)
f.close()
file.close()
Answer 0 (score: 3)
The proper way, using csv.reader and csv.writer objects:
import csv
with open('filename.log', 'r') as fr, open('filoutput.csv', 'w', newline='') as fw:
    reader = csv.reader(fr)
    writer = csv.writer(fw)
    for l in reader:
        # enumerate from 1 so that k matches the 1-based column numbers
        writer.writerow(v for k,v in enumerate(l, 1) if k in (4,5,7,12))
Contents of filoutput.csv:
192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
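The same idea can be wrapped in a small helper that takes 1-based column numbers and converts them to 0-based list indices once. This is only a minimal sketch: the function name `select_columns` and the in-memory sample are illustrative assumptions, not part of the original answer.

```python
import csv
import io

def select_columns(src, dst, columns):
    """Copy only the given 1-based columns from src to dst (file-like objects)."""
    indices = [c - 1 for c in columns]  # convert to 0-based list indices
    writer = csv.writer(dst, lineterminator='\n')
    for row in csv.reader(src):
        if not row:
            continue  # skip blank lines
        writer.writerow([row[i] for i in indices])

# Two rows taken from the sample data in the question.
sample = (
    "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url\n"
    "2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow\n"
)
out = io.StringIO()
select_columns(io.StringIO(sample), out, (4, 5, 7, 12))
print(out.getvalue(), end='')
```

With real files, the two `io.StringIO` objects would be replaced by the handles from `open(...)`, opened with `newline=''` as in the answer above.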
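Answer 1 (score: 1)
This is wrong, because a list cannot be indexed with several comma-separated values at once:
line.split(' ')[4,5,7,12]
You want something like this instead. Note that the data is comma-separated, so split on ',' rather than ' ', and that list indices start at 0, so columns 4, 5, 7, and 12 are indices 3, 4, 6, and 11:
fields = line.split(',')
fields[3], fields[4], fields[6], fields[11]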
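To see the failure and one compact alternative side by side, here is a small sketch. It assumes comma-separated input and 0-based indices; `operator.itemgetter` is a standard-library alternative swapped in for illustration, not something the answer itself uses.

```python
from operator import itemgetter

# One row from the sample data in the question.
line = "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url\n"
fields = line.rstrip('\n').split(',')

# Indexing a list with a tuple of positions raises TypeError.
try:
    fields[3, 4, 6, 11]
    tuple_index_failed = False
except TypeError:
    tuple_index_failed = True

# itemgetter with several indices returns a tuple of the selected items.
pick = itemgetter(3, 4, 6, 11)
selected = pick(fields)

print(tuple_index_failed)
print(','.join(selected))
```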
Answer 2 (score: 1)
Using pandas:
import pandas as pd
df = pd.read_csv('filename.log', sep=',', header=None, index_col=False)
df[[3, 4, 6, 11]].to_csv('fileoutput.csv', header=False, index=False)
Note that [3, 4, 6, 11] is used instead of [4, 5, 7, 12] because the DataFrame columns are 0-indexed.
Contents of fileoutput.csv:
192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
Answer 3 (score: 0)
You are on the right track, but your syntax is off. Here is an example using the csv module:
import csv
log = open('filename.log')
# newline='\n' prevents csv.writer from adding an extra newline when writing to the file
log_write = open('fileoutput', 'w', newline='\n')
csv_log = csv.reader(log, delimiter=',')
csv_writer = csv.writer(log_write, delimiter=',')
for line in csv_log:
    # outputs the first 4 columns; for columns 4, 5, 7, 12 use line[3], line[4], line[6], line[11]
    csv_writer.writerow([line[0], line[1], line[2], line[3]])
log.close()
log_write.close()
Answer 4 (score: 0)
Take a look at list comprehensions; you can get something similar without using the csv module:
file=open('filename.log','r')
f=open('fileoutput', 'w')
lines = file.readlines()
for line in lines:
    # the last field keeps its trailing '\n', so each output row ends with a newline
    f.write(','.join(line.split(',')[i] for i in [3,4,6,11]))
f.close()
file.close()
Note that the indices are 3, 4, 6, 11 because lists are zero-indexed.
Output:
cat fileoutput
192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
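One caveat with plain `str.split(',')`: it does not understand CSV quoting, so a quoted field that contains a comma is split in two. The sample log has no such fields, so this matters only if the real data does. The row below is made up for illustration.

```python
import csv
import io

# Hypothetical row: the second field is quoted and contains a comma.
row_text = 'a,"b,c",d\n'

naive = row_text.rstrip('\n').split(',')          # splits inside the quotes
parsed = next(csv.reader(io.StringIO(row_text)))  # honors the quotes

print(naive)   # ['a', '"b', 'c"', 'd']
print(parsed)  # ['a', 'b,c', 'd']
```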