Best way to sort a txt file in Python using the csv tools

Asked: 2017-07-20 17:38:19

Tags: python file csv sorting

I have the following code and am trying to sort the contents of the file in the simplest way possible.

import csv
import operator

#==========Search by ID number. Return Just the Name Fields for the Student
with open("studentinfo.txt","r") as f:
  studentfileReader=csv.reader(f)
  id=input("Enter Id:")
  for row in studentfileReader:
    for field in row:
      if field==id:
        currentindex=row.index(id)
        print(row[currentindex+1]+" "+row[currentindex+2])

#=========Sort by Last Name
with open("studentinfo.txt","r") as f:
  studentfileReader=csv.reader(f)
  sortedlist=sorted(f,key=operator.itemgetter(0),reverse=True)
  print(sortedlist)

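I am aware of various possible solutions but cannot get them to run correctly. For teaching/learning purposes, I would also be interested in the simplest effective solution, with a clear explanation.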

My research so far includes using import operator:

sortedlist = sorted(reader, key=operator.itemgetter(3), reverse=True)

or using a lambda:

sortedlist = sorted(reader, key=lambda row: row[3], reverse=True)

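For the ANSWER, I would be grateful if someone could post a complete solution that sorts by last name and by ID number, to illustrate two different examples. An extension to the answer would show how to sort by multiple values in this specific example: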

Full code listing:

https://repl.it/Jau3/3

File contents

002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44

3 answers:

Answer 0 (score: 1)

You can use a lambda function to sort by any key in the lists returned by the csv reader, for example by last name (the third column):

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: x[2])  # use the third column as a sorting key
    print("\n".join(str(row) for row in sorted_list))  # prettier print

Or by ID (the first column):

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: x[0])  # the first column as a sorting key, can be omitted
    print("\n".join(str(row) for row in sorted_list))  # prettier print

Or by two keys:

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=lambda x: (x[3], x[4]))  # use fourth and fifth column
    print("\n".join(str(row) for row in sorted_list))  # prettier print

You can add reverse=True to the list.sort() call to sort in descending order.
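
One caveat about the two-key example above: the score columns are strings such as Test2:100, so they are compared character by character, and "Test2:100" sorts before "Test2:20". Below is a minimal sketch of a numeric, descending sort, assuming every score field has the form name:integer. The helper name score and the in-memory SAMPLE string (standing in for studentinfo.txt) are introduced here for illustration only.

```python
import csv
import io

# In-memory copy of the question's file contents, standing in for studentinfo.txt
SAMPLE = """\
002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
"""


def score(field):
    """Return the integer after the colon, e.g. 'Test2:100' -> 100."""
    return int(field.split(":")[1])


rows = list(csv.reader(io.StringIO(SAMPLE)))
# Sort by the Test2 column as a number, highest score first
rows.sort(key=lambda x: score(x[4]), reverse=True)
for row in rows:
    print(row[0], row[1], row[2], row[4])
```

Converting inside the key function leaves the rows themselves unchanged, so they can still be written back out exactly as they were read.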

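ADDENDUM - If you really don't want to use lambdas (why?), you can define an item-getter function (or just use operator.itemgetter, which exists for exactly this purpose) and pass it to the list.sort() call, for example: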

def get_third_column(x):
    return x[2]

with open("studentinfo.txt", "r") as f:
    reader = csv.reader(f)
    sorted_list = list(reader)  # turn the reader iterator into a list
    sorted_list.sort(key=get_third_column)  # use the third column as a sorting key
    print("\n".join(str(row) for row in sorted_list))  # prettier print
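
For comparison, here is a rough sketch of the same idea using operator.itemgetter instead of a hand-written function. The rows list is a shortened, hard-coded copy of the first three columns of the question's data, used only so the snippet runs without studentinfo.txt.

```python
import operator

# First three columns of the question's data, hard-coded for illustration
rows = [
    ["002", "Ash", "Smith"],
    ["004", "Grace", "Asha"],
    ["005", "Cat", "Zelch"],
    ["001", "Joe", "Bloggs"],
    ["003", "Jonathan", "Peter"],
]

# itemgetter(2) builds a function equivalent to get_third_column
by_last_name = sorted(rows, key=operator.itemgetter(2))
# itemgetter(0) picks the ID column
by_id = sorted(rows, key=operator.itemgetter(0))

print([row[2] for row in by_last_name])
print([row[0] for row in by_id])
```

Note that sorted() returns a new list and leaves rows untouched, whereas list.sort() reorders the list in place.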

Answer 1 (score: 1)

A compact, easy-to-read solution: read -> sort -> write:

import csv
import operator

with open("input.csv") as fh:
    reader = csv.reader(fh)
    rows = sorted(reader, key=operator.itemgetter(0), reverse=True)

# newline="" stops the csv module from writing blank lines between rows on Windows
with open("output.csv", "w", newline="") as fh:
    csv.writer(fh).writerows(rows)

To print to the console instead of writing to a file, you can use sys.stdout as the file handle:

import sys

# Pass sys.stdout directly; wrapping it in "with" would close stdout afterwards
csv.writer(sys.stdout).writerows(rows)

operator.itemgetter(0) determines which field to sort by. Field 0 is the ID. To sort by last name, use operator.itemgetter(2), because the last name is the third column.

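To sort by multiple fields, you can use a lambda that returns a tuple (operator.itemgetter(2, 1) works as well), for example to sort by last name and then by first name: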

    rows = sorted(reader, key=lambda x: (x[2], x[1]), reverse=True)
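
To see the tie-breaking at work, here is a small sketch with a hypothetical extra student (006, Zed Smith) who shares a last name with an existing one. That row is not in the original file and is added purely to illustrate the second key.

```python
import operator

# Shortened rows; "006,Zed,Smith" is a made-up entry to create a tie on last name
rows = [
    ["002", "Ash", "Smith"],
    ["006", "Zed", "Smith"],
    ["004", "Grace", "Asha"],
]

# The key is the tuple (last name, first name); ties on last name fall back to first name
with_lambda = sorted(rows, key=lambda x: (x[2], x[1]), reverse=True)
# itemgetter with two indexes returns the same kind of tuple
with_itemgetter = sorted(rows, key=operator.itemgetter(2, 1), reverse=True)

print([row[0] for row in with_lambda])
```

With reverse=True both keys are reversed together, so the two Smiths come first and, among them, Zed precedes Ash.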

The code before the sort, where you ask the user to enter an ID, can also be improved:

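  • You don't need to iterate over every field when you know the id field is the first one
  • id shadows a built-in function in Python, so using it as a variable name is not recommended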

You could write it like this:

with open("studentinfo.txt") as fh:
    reader = csv.reader(fh)
    student_id = input("Enter Id:")
    for row in reader:
        if row[0] == student_id:
            print(row[1] + " " + row[2])
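
If you want to try the lookup without typing at a prompt, one option is to wrap it in a function. This is only a sketch: find_student is a hypothetical helper name, and the in-memory SAMPLE string stands in for studentinfo.txt.

```python
import csv
import io

# In-memory copy of the question's file contents, standing in for studentinfo.txt
SAMPLE = """\
002,Ash,Smith,Test1:20,Test2:20,Test3:100003
004,Grace,Asha,Test1:33,Test2:54,Test3:23
005,Cat,Zelch,Test1:66,Test2:22,Test3:11
001,Joe,Bloggs,Test1:99,Test2:100,Test3:1
003,Jonathan,Peter,Test1:99,Test2:33,Test3:44
"""


def find_student(lines, student_id):
    """Return 'First Last' for the row whose first field equals student_id, else None."""
    for row in csv.reader(lines):
        # Skip empty rows, then compare only the ID column
        if row and row[0] == student_id:
            return row[1] + " " + row[2]
    return None


print(find_student(io.StringIO(SAMPLE), "003"))
```

Any iterable of lines works as the first argument, so an open file handle can be passed in place of the StringIO object.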

Answer 2 (score: 0)

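Using import operator, as you did, here is one possible solution. Note: ideally you need a header row to identify what to sort by (assuming the user wants to specify it explicitly).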

import csv
import operator

# newline='' is the recommended way to open files for the csv module in Python 3
with open('myfile.csv', newline='') as ifile:
    infile = csv.reader(ifile)
    # Note that if you have a header, this is the header line
    infields = next(infile)
    startindex = infields.index('Desired Header')
    # Here you are creating the sorted list
    sortedlist = sorted(infile, key=operator.itemgetter(startindex), reverse=True)

# open the output file - it can be the same as the input file
with open('myoutput.csv', 'w', newline='') as ofile:
    outfile = csv.writer(ofile)
    outfile.writerow(infields)
    for row in sortedlist:
        outfile.writerow(row)
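
When a header row is present, csv.DictReader is another way to sort by column name rather than by index. The sketch below assumes a header of id,first,last. The question's file has no header, so this header and the shortened data are hypothetical.

```python
import csv
import io

# Hypothetical data: a header row has been added, and only three columns are kept
SAMPLE_WITH_HEADER = """\
id,first,last
002,Ash,Smith
004,Grace,Asha
001,Joe,Bloggs
"""

reader = csv.DictReader(io.StringIO(SAMPLE_WITH_HEADER))
# Each record is a dict keyed by the header names, so the key can use the column name
records = sorted(reader, key=lambda rec: rec["last"])

# Write the header and the sorted records back out (to memory here, not to a file)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(records)
print(out.getvalue())
```

Sorting by name means the code keeps working if the columns are later reordered, at the cost of requiring the header to be present.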