How to get data from certain rows and columns using Python

Time: 2017-09-23 09:10:59

Tags: python python-2.7

I have data like this, for example:

2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url
2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow
2017/06/07 10:43:36,THREAT,end,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,block-url
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,allow
2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url

How can I parse this to get only columns 4, 5, 7 and 12 from every row?

Here is my code:

import csv

file=open('filename.log', 'r')
f=open('fileoutput', 'w')
lines = file.readlines()

        for line in lines:
        result.append(line.split(' ')[4,5,7,12])
        f.write (line)

f.close()
file.close()

5 answers:

Answer 0: (score: 3)

The proper way, using csv.reader and csv.writer objects:

import csv

with open('filename.log', 'r') as fr, open('filoutput.csv', 'w', newline='') as fw:
    reader = csv.reader(fr)
    writer = csv.writer(fw)
    for l in reader:
        writer.writerow(v for k,v in enumerate(l, 1) if k in (4,5,7,12))

Contents of filoutput.csv:

192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
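
As a variation on the same idea, operator.itemgetter can pick all the wanted fields in one call. This is only a sketch: the sample rows are copied from the question and read from an in-memory io.StringIO instead of filename.log, and the helper name select_columns is hypothetical.

```python
import csv
import io
from operator import itemgetter

# Two sample rows copied from the question; a real script would open 'filename.log'.
SAMPLE = (
    "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url\n"
    "2017/06/07 10:43:35,THREAT,url,192.168.1.101,52.25.xxx.xxx,Rule-VWIRE-03,13423047,,web-browsing,80,tcp,allow\n"
)

def select_columns(text, columns=(4, 5, 7, 12)):
    """Return the given 1-based columns of each CSV row in `text` (hypothetical helper)."""
    # Convert to 0-based indices. itemgetter returns a tuple only when it is
    # given two or more indices, so this sketch assumes at least two columns.
    pick = itemgetter(*(c - 1 for c in columns))
    return [list(pick(row)) for row in csv.reader(io.StringIO(text))]

rows = select_columns(SAMPLE)
print(rows)
```

Each inner list can be passed straight to writer.writerow.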

Answer 1: (score: 1)

This is wrong:

line.split(' ')[4,5,7,12]

You want this:

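fields = line.split(',')  # the sample data is comma-separated, not space-separated
fields[3], fields[4], fields[6], fields[11]  # zero-based indices for columns 4, 5, 7, 12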
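
To see why the original expression fails: [4,5,7,12] written after a list is treated as indexing with the tuple (4, 5, 7, 12), which Python lists do not support. A small sketch, using one sample line from the question and splitting on commas (an assumption based on the data shown):

```python
line = "2017/06/07 10:42:35,THREAT,url,192.168.1.100,52.25.xxx.xxx,Rule-VWIRE-03,13423523,,web-browsing,80,tcp,block-url"
fields = line.split(',')

try:
    fields[3, 4, 6, 11]          # a tuple is not a valid list index
except TypeError as exc:
    error = str(exc)
    print(error)

# Index one field at a time instead.
selected = [fields[i] for i in (3, 4, 6, 11)]
print(selected)
```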

Answer 2: (score: 1)

A solution using pandas:

import pandas as pd

df = pd.read_csv('filename.log', sep=',', header=None, index_col=False)
df[[3, 4, 6, 11]].to_csv('fileoutput.csv', header=False, index=False)

Note that [3, 4, 6, 11] is used instead of [4, 5, 7, 12] because the DataFrame columns are zero-indexed.

Contents of fileoutput.csv:

192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
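
The zero-based conversion mentioned in the note above can also be computed rather than typed by hand. A minimal sketch in plain Python (the variable names are illustrative, not part of the answer):

```python
wanted = [4, 5, 7, 12]                 # column numbers as counted from 1
zero_based = [c - 1 for c in wanted]   # positions as pandas and Python lists count them
print(zero_based)
```

The resulting list can be used in place of the literal in df[[3, 4, 6, 11]].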

Answer 3: (score: 0)

You are on the right track, but your syntax is off. Here is an example using the csv module:

import csv
log = open('filename.log')
# newline='\n' keeps csv.writer from adding an extra newline when writing to the file
log_write = open('fileoutput', 'w', newline='\n')
csv_log = csv.reader(log, delimiter=',')
csv_writer = csv.writer(log_write, delimiter=',')

for line in csv_log:
    csv_writer.writerow([line[0], line[1], line[2], line[3]]) # output first 4 columns

log.close()
log_write.close()

Answer 4: (score: 0)

Look into list comprehensions; you can get something similar without having to use the csv module:

file=open('filename.log','r') 
f=open('fileoutput', 'w')
lines = file.readlines()
for line in lines:
    f.write(','.join(line.split(',')[i] for i in [3,4,6,11]))

f.close()
file.close()

Note that the indices are 3, 4, 6, 11 because lists are zero-indexed.

Output:

cat fileoutput 
192.168.1.100,52.25.xxx.xxx,13423523,block-url
192.168.1.101,52.25.xxx.xxx,13423047,allow
192.168.1.100,52.25.xxx.xxx,13423047,block-url
192.168.1.101,52.25.xxx.xxx,13423111,allow
192.168.1.103,52.25.xxx.xxx,13423111,block-url
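
One detail in this answer's loop: split(',') leaves the trailing newline attached to the last field, which is why the output lines still end correctly when column 12 is selected. If the last column were not among those selected, the newline would be lost. A sketch that strips it and adds it back explicitly, with an in-memory sample in place of the files (the name extract is hypothetical):

```python
def extract(line, indices=(3, 4, 6, 11)):
    """Return the chosen zero-based fields of one comma-separated line, newline-terminated."""
    fields = line.rstrip('\n').split(',')   # drop the newline before splitting
    return ','.join(fields[i] for i in indices) + '\n'

# One sample line copied from the question.
sample = "2017/06/07 10:44:09,TRAFFIC,end,192.168.1.103,52.25.xxx.xxx,Rule-VWIRE-03,13423111,,web-browsing,80,tcp,block-url\n"
print(extract(sample), end='')
print(extract(sample, indices=(3, 4)), end='')
```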