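I am using DictReader to read a CSV file. My problem is that for some files I get an extra, empty key, although the file should have only 6 keys. Here are my code and my file: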
from datetime import datetime
from math import sqrt, exp, log
from csv import DictReader
import pandas as pd
import numpy as np
train = '/Users/mas/Documents/workspace/Avito/input/minitrain.csv'
for t, row in enumerate(DictReader(open(train))):
    pass
# After the loop, row holds the last line of the file
print(row)
This is my output:
{'': None, 'SearchID': '4', 'IsClick': None, 'HistCTR': '', 'AdID': '24129570', 'Position': '2', 'ObjectType': '2'}
This is my CSV file:
SearchID,AdID,Position,ObjectType,HistCTR,IsClick,
2,11441863,1,3,0.001804,0,
2,22968355,7,3,0.004723,0,
3,212187,7,3,0.029701,0,
3,34084553,1,3,0.004300,0,
3,36256251,2,2,,,
4,2073399,6,1,,,
4,6046052,7,1,,,
4,17544913,8,1,,,
4,20653823,1,3,0.003049,0,
4,24129570,2,2,,,
Why am I getting an empty key?!
Answer 0 (score: 1)
Try setting the field names when reading the CSV file:
DictReader(open(train), fieldnames=('SearchID', 'AdID', 'Position', 'ObjectType', 'HistCTR', 'IsClick',))
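The empty key comes from the trailing comma on every line of the file, including the header: the csv module sees a seventh column whose name is the empty string. Here is a minimal sketch that shows this on an in-memory sample (the sample text and the variable names are illustrative and written for Python 3, not taken from the asker's setup). It also shows that when fieldnames is passed, DictReader treats the header line as data and stores the extra trailing field under the restkey (None by default), so both probably need to be handled by hand:

```python
import io
from csv import DictReader

# Illustrative in-memory copy of the first lines of the file.
sample = io.StringIO(
    "SearchID,AdID,Position,ObjectType,HistCTR,IsClick,\n"
    "2,11441863,1,3,0.001804,0,\n"
    "3,36256251,2,2,,,\n"
)

# Without fieldnames: the trailing comma in the header yields a '' key.
default_rows = list(DictReader(sample))

# With explicit fieldnames: skip the header line ourselves and drop the
# leftover trailing field, which DictReader stores under restkey (None).
sample.seek(0)
names = ('SearchID', 'AdID', 'Position', 'ObjectType', 'HistCTR', 'IsClick')
next(sample)  # discard the header line
named_rows = []
for row in DictReader(sample, fieldnames=names):
    row.pop(None, None)  # remove the restkey entry, if present
    named_rows.append(row)

print(sorted(default_rows[0]))
print(named_rows[0])
```

With a real file, the same pattern would apply to the object returned by open(train) instead of the StringIO sample.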
Alternatively, you can write your own DictReader, modeled on csv.DictReader:
import csv

class MyDictReader(object):
    def __init__(self, f, fieldnames=None, dialect='excel', *args, **kwargs):
        self.reader = csv.reader(f, dialect, *args, **kwargs)
        self._fieldnames = fieldnames
        if self._fieldnames is None:
            try:
                # Use the first row of the file as the header
                self._fieldnames = next(self.reader)
            except StopIteration:
                pass

    def __iter__(self):
        return self

    def next(self):
        d = {}
        row = next(self.reader)
        for index, fieldname in enumerate(self._fieldnames):
            # Skip columns whose header is empty
            if fieldname:
                d[fieldname] = row[index]
        return d

    # Python 3 looks for __next__ instead of next
    __next__ = next
Usage:
for t, row in enumerate(MyDictReader(open(train))):
    pass
print(row)
You will get output without the empty key:
{'SearchID': '4', 'IsClick': '', 'HistCTR': '', 'AdID': '24129570', 'Position': '2', 'ObjectType': '2'}
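If a custom reader class is more than you need, a lighter option is to keep the standard DictReader and filter the empty key out of each row. This is a minimal sketch, assuming Python 3; the helper name drop_empty_keys is hypothetical and the in-memory sample stands in for the real file:

```python
import io
from csv import DictReader

def drop_empty_keys(rows):
    """Yield each row without keys that are '' or None."""
    for row in rows:
        yield {key: value for key, value in row.items() if key}

# Illustrative sample: the header and the last line of the file.
sample = io.StringIO(
    "SearchID,AdID,Position,ObjectType,HistCTR,IsClick,\n"
    "4,24129570,2,2,,,\n"
)

cleaned = list(drop_empty_keys(DictReader(sample)))
print(cleaned[0])
```

Because the filter wraps the reader lazily, it can be dropped into the original loop as enumerate(drop_empty_keys(DictReader(open(train)))).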