Question

I have a csv file containing the following entries:

    Year,Month,Company A, Company B,Company C, .............Company N
    1990, Jan, 10, 15, 20, , ..........,50
    1990, Feb, 10, 15, 20, , ..........,50

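I am trying to sort the csv data by Company A, then by each company in turn, up to Company N.

My code works on the first pass through the loop, but fails on the second.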

    try:
        reader = csv.DictReader(open(self.filename,'r')) #Try and open the file with csv dictreader
    except IOError:
        print "Error Opening File -- Check if file exists"

    ncols = reader.next()
    print ncols.keys()
    for key in ncols.keys():
        if key != 'Month' and key != 'Year':
            print key
            result = sorted(reader, key=lambda d: float(d[key]))
            result = result[-1]
            #print "Year " ,
            print result['Year'],
            #print "Month ",
            print result ['Month'],
            print key,
            print result[key]

Output:

    Company-E
    2008 Oct Company-E 997
    Company-D

    Traceback (most recent call last):
    File "<pyshell#105>", line 1, in <module>
    read.ParseData()
    File "C:/Users/prince/Desktop/CsvRead.py", line 55, in ParseData
    result = result[-1]
    IndexError: list index out of range

Answer 1

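I would suggest using pandas:

    import pandas
    df = pandas.read_csv(filename)
    for col in df.columns:
        if col != 'Month' and col != 'Year':
            # DataFrame.sort was removed in later pandas releases; sort_values replaces it
            df = df.sort_values(col)
    df.to_csv(out_filename, index=False)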

Answer 2

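The code does work once two lines are added: I needed to rewind the file to its initial position. The first call to sorted() consumes every row from the reader, so on the next iteration there is nothing left to sort and result[-1] raises the IndexError.

    fh.seek(0)
    fh.next()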

Here is the working part of the code:

        actualResult = {}
        try:
            fh = open(filename,'r')
            reader = csv.DictReader(fh) #Try and open the file with csv dictreader

            #Get the field names in the file:
            fields = set(reader.fieldnames)
            # Both 'Year' and 'Month' are read below, so reject the file if either is missing
            if not fields or ('Year' not in fields or
            'Month' not in fields):
                raise BadInputFile(filename)
            companies = fields - {'Year', 'Month'}
            print companies
            for name in companies:
                #sorting the csv file data based on column data with Company Name as Key
                result = sorted(reader, key=lambda d: float(d[name]), reverse=True)
                result = result[0]
                tup = (result[name],result['Year'],result['Month'])
                if name not in actualResult.keys():
                    actualResult.update({str(name): tup})
                else:
                    raise BadInputFile(filename)
                fh.seek(0) #rewinding the file to initial position
                fh.next()  #Moving to the 1st row
        except (IOError, BadInputFile) as e:
            print "Error: ", str(e) # Invalid input file
            raise


        return actualResult
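A possible alternative, sketched here under some assumptions rather than taken from the answer: read the rows into a list once, so there is nothing to rewind, and take the maximum per column instead of sorting. It assumes Python 3, a file small enough to hold in memory, and numeric values in every company column; `best_month_per_company` and the sample data are hypothetical.

```python
import csv
import io

def best_month_per_company(fh):
    """Return {company: (value, year, month)} for each company's highest value."""
    # skipinitialspace drops the blank after each comma, as in "1990, Jan, 10"
    reader = csv.DictReader(fh, skipinitialspace=True)
    rows = list(reader)  # read the file once; a list can be scanned repeatedly
    fields = reader.fieldnames or []
    companies = [f for f in fields if f not in ("Year", "Month")]
    result = {}
    for name in companies:
        # max() raises ValueError if the file has a header but no data rows
        top = max(rows, key=lambda row: float(row[name]))
        result[name] = (top[name], top["Year"], top["Month"])
    return result

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "Year,Month,Company-A,Company-B\n"
    "1990,Jan,10,15\n"
    "1990,Feb,30,5\n"
)
print(best_month_per_company(sample))
```

With a real file, the same function could be called as `best_month_per_company(open(filename, newline=''))`. Using max() avoids a full sort when only the top row is needed.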

Sort each column in a csv file
