I have a CSV file with the following entries:
Year,Month,Company A, Company B,Company C, .............Company N
1990, Jan, 10, 15, 20, , ..........,50
1990, Feb, 10, 15, 20, , ..........,50
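I am trying to sort the CSV data by Company A, then by Company B, and so on through Company N.
My code works on the first pass through the loop but fails on the second.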
try:
    reader = csv.DictReader(open(self.filename, 'r'))  # Try to open the file with csv.DictReader
except IOError:
    print "Error Opening File -- Check if file exists"
ncols = reader.next()
print ncols.keys()
for key in ncols.keys():
    if key != 'Month' and key != 'Year':
        print key
        result = sorted(reader, key=lambda d: float(d[key]))
        result = result[-1]
        #print "Year " ,
        print result['Year'],
        #print "Month ",
        print result['Month'],
        print key,
        print result[key]
Output:
Company-E
2008 Oct Company-E 997
Company-D
Traceback (most recent call last):
  File "<pyshell#105>", line 1, in <module>
    read.ParseData()
  File "C:/Users/prince/Desktop/CsvRead.py", line 55, in ParseData
    result = result[-1]
IndexError: list index out of range
Answer 0 (score: 3)
I suggest using pandas:
import pandas
df = pandas.read_csv(filename)
for col in df.columns:
    if col != 'Month' and col != 'Year':
        # DataFrame.sort() no longer exists in current pandas; sort_values() replaces it
        df = df.sort_values(col)
df.to_csv(out_filename, index=False)
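Answer 1 (score: 0)
The code does work once two lines are added. sorted(reader, ...) consumes every row of the reader, so on the second pass through the loop the reader is empty, sorted() returns an empty list, and result[-1] raises the IndexError. I needed to rewind the file to its starting position: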
fh.seek(0)
fh.next()
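To see why the rewind matters, here is a minimal sketch written for Python 3 (the code in this thread is Python 2, so `fh.next()` becomes `next(fh)`). The sample data, `SAMPLE`, and `count_rows_twice` are hypothetical names made up for illustration; an in-memory `io.StringIO` stands in for the real file.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
SAMPLE = (
    "Year,Month,Company A,Company B\n"
    "1990,Jan,10,15\n"
    "1990,Feb,30,5\n"
)

def count_rows_twice(rewind):
    """Consume a DictReader twice, optionally rewinding the file in between."""
    fh = io.StringIO(SAMPLE)
    reader = csv.DictReader(fh)
    first = len(list(reader))    # consumes every data row
    if rewind:
        fh.seek(0)               # back to the start of the file
        next(fh)                 # skip the header line; fieldnames are already cached
    second = len(list(reader))   # empty unless the file was rewound
    return first, second

without_rewind = count_rows_twice(rewind=False)
with_rewind = count_rows_twice(rewind=True)
print(without_rewind)  # (2, 0)
print(with_rewind)     # (2, 2)
```

Without the rewind the second read sees no rows, which is the same situation that makes `result[-1]` fail in the question.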
Here is the working part of the code:
actualResult = {}
try:
    fh = open(filename, 'r')
    reader = csv.DictReader(fh)  # Try to open the file with csv.DictReader
    # Get the field names in the file:
    fields = set(reader.fieldnames)
    # Both columns are read below, so reject the file if either one is missing
    if not fields or ('Year' not in fields or
                      'Month' not in fields):
        raise BadInputFile(filename)
    companies = fields - {'Year', 'Month'}
    print companies
    for name in companies:
        # Sort the rows by this company's column, highest value first
        result = sorted(reader, key=lambda d: float(d[name]), reverse=True)
        result = result[0]
        tup = (result[name], result['Year'], result['Month'])
        if name not in actualResult.keys():
            actualResult.update({str(name): tup})
        else:
            raise BadInputFile(filename)
        fh.seek(0)  # Rewind the file to its starting position
        fh.next()   # Skip the header so the next read starts at the first data row
except (IOError, BadInputFile) as e:
    print "Error: ", str(e)  # Invalid input file
    raise
return actualResult
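As an alternative to rewinding, the rows can be read into a list once and scanned as many times as needed. The following is only a sketch under some assumptions: it targets Python 3, `best_month_per_company` and the sample data are hypothetical, and `skipinitialspace=True` is used on the guess that the real file has spaces after the commas, as the sample in the question does. It also uses `max()` rather than a full sort, since only the top row per company is kept.

```python
import csv
import io

def best_month_per_company(fh):
    """Return {company: (value, year, month)} for each company's highest value."""
    # skipinitialspace drops the blank after each comma (an assumption about the file).
    reader = csv.DictReader(fh, skipinitialspace=True)
    rows = list(reader)  # read the file once; a list can be scanned repeatedly
    if not rows:
        return {}
    companies = [name for name in reader.fieldnames
                 if name not in ("Year", "Month")]
    best = {}
    for name in companies:
        # max() finds the top row without sorting the whole list
        top = max(rows, key=lambda row: float(row[name]))
        best[name] = (top[name], top["Year"], top["Month"])
    return best

# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO(
    "Year,Month,Company A,Company B\n"
    "1990, Jan, 10, 15\n"
    "1990, Feb, 30, 5\n"
)
result = best_month_per_company(sample)
print(result)
```

With a real file, the call would look like `with open(filename, newline='') as fh: best_month_per_company(fh)`. Empty cells would still make `float()` raise, so a file with gaps needs extra handling that this sketch leaves out.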