Null byte error in Python

Time: 2017-11-27 13:12:01

Tags: python-3.x csv

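I am using Python 3.

I am reading a CSV file with csv.DictReader and trying to find out which country occurs most often.

Note that I am using DictReader, not reader. I think this is necessary because I am using Counter.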
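As a side note on the point above: Counter only needs hashable keys, so it does not strictly require DictReader. The sketch below counts countries with a plain csv.reader by indexing the first column. The in-memory `sample` data and the name `by_index` are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical in-memory stand-in for the log file (three rows taken from the sample lines)
sample = io.StringIO(
    "Brazil,200.145.23.13,pi,raspberry,failed,None,None,None\n"
    "China,121.201.83.134,root,123456,succeeded,None,None,None\n"
    "Brazil,200.145.23.13,pi,raspberryraspberry993311,failed,None,None,None\n"
)

# csv.reader yields each row as a list, so the country is column 0
by_index = Counter(row[0] for row in csv.reader(sample))
print(by_index.most_common(1))
```

DictReader is still a reasonable choice here, since named fields such as row['Country'] read more clearly than positional indexes.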

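I'm running into trouble because some rows in my CSV file contain null bytes (specifically in the password field), and the csv reader raises an error when it meets a null byte, so my script fails. An example is the last sample row in the comment block below. I have seen people strip null bytes with a line like the one in my code: readerobject(x.replace('\0', '') for x in csvfile), but I can't seem to use it, because I have already passed csvfile to readerobject on the previous line.

Here is my code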

'''
sample csv lines
Brazil,200.145.23.13,pi,raspberry,failed,None,None,None
Brazil,200.145.23.13,pi,raspberryraspberry993311,failed,None,None,None
China,121.201.83.134,root,123456,succeeded,None,None,None
United Kingdom,185.38.148.238,root,123456,succeeded,None,None,None
Croatia,5.188.10.141,root,admin,succeeded,None,None,None
France,195.154.44.31,squid,123456,failed,None,None,None
France,195.154.44.31,squid,123456,failed,None,None,None
Croatia,5.188.10.141,root,123456,succeeded,None,None,None
Croatia,5.188.10.141,root,admin,succeeded,None,None,None
Croatia,5.188.10.141,root,123456,succeeded,None,None,None
Netherlands,109.236.91.85,root,admin,succeeded,None,None,None
France,51.255.160.205,root,admin,succeeded,None,None,None
United States,207.138.132.44,root,seiko2005,failed,None,None,None
France,212.83.150.189,support,"       ",failed,None,None,None   <-- these are null bytes inside the ""  
'''


import codecs
import csv
import time
from collections import Counter
from pprint import pprint

linecount = 0
country_counter = Counter()

print("parsing CSV log file")
with open('C:/Users/Home/Documents/kippo stuff/final lab/kippo/oldkippo4final.csv', newline='') as csvfile:
    readerobject = csv.DictReader(csvfile, delimiter=',', fieldnames=['Country', 'IP Address', 'Username', 'Password', 'Status', 'name', 'intention', 'OS'])
    readerobject(x.replace('\0', '') for x in csvfile)
    for row in readerobject:
        print(row, "\n\n")
        linecount +=1
        country_counter[row['Country']] +=1
        print(linecount)
print(country_counter.most_common(3))
print("the total linecount was: ", linecount)
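One possible way to apply the null-byte filter, offered as a sketch rather than a definitive fix: a DictReader instance is not callable, so readerobject(...) cannot work, but DictReader accepts any iterable of strings. That means the generator can wrap the file before the reader is created. The helper name `count_countries`, the constant `FIELDNAMES`, and the in-memory `sample` data are assumptions introduced for illustration.

```python
import csv
import io
from collections import Counter

FIELDNAMES = ['Country', 'IP Address', 'Username', 'Password',
              'Status', 'name', 'intention', 'OS']

def count_countries(lines):
    """Count rows per country, removing NUL characters before csv parses each line."""
    # Build the cleaning generator first, then hand it to DictReader
    # in place of the raw file object.
    cleaned = (line.replace('\0', '') for line in lines)
    reader = csv.DictReader(cleaned, delimiter=',', fieldnames=FIELDNAMES)
    counter = Counter()
    linecount = 0
    for row in reader:
        counter[row['Country']] += 1
        linecount += 1
    return counter, linecount

# Hypothetical in-memory stand-in for the log file; the last row
# carries NUL characters inside the quoted password field.
sample = io.StringIO(
    "France,195.154.44.31,squid,123456,failed,None,None,None\n"
    "Croatia,5.188.10.141,root,admin,succeeded,None,None,None\n"
    "France,212.83.150.189,support,\"\0\0\0\",failed,None,None,None\n"
)

country_counter, linecount = count_countries(sample)
print(country_counter.most_common(3))
print("the total linecount was:", linecount)
```

With the real log, the same helper could be called inside the existing with open(..., newline='') as csvfile: block as count_countries(csvfile), since a text file object is also an iterable of lines.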

0 answers:

No answers yet