Question

I have a file with the following contents:

Truncated for readability:

Title,            Author,        Publisher,  Year,  ISBN-10,   ISBN-13
Automate the...,  Al Sweigart,   No Sta...,  2015,  15932...,  978-15932...
Dive into Py...,  Mark Pilgr..., Apress,     2009,  14302...,  978-14302...
"Python Cook...,  "David Bea..., O'Reil...,  2013,  14493...,  978-14493...
Think Python...,  Allen B. D..., O'Reil...,  2015,  14919...,  978-14919...
"Fluent Pyth...,  Luciano Ra..., O'Reil...,  2015,  14919...,  978-14919...

Complete:

Title,Author,Publisher,Year,ISBN-10,ISBN-13
Automate the Boring Stuff with Python,Al Sweigart,No Starch Press,2015,1593275994,978-1593275990
Dive into Python 3,Mark Pilgrim,Apress,2009,1430224150,978-1430224150
"Python Cookbook, Third edition","David Beazley, Brian K Jones",O'Reilly Media,2013,1449340377,978-1449340377
Think Python: How to Think Like a Computer Scientist,Allen B. Downey,O'Reilly Media,2015,1491939362,978-1491939369
"Fluent Python: Clear, Concise, and Effective Programming",Luciano Ramalho,O'Reilly Media,2015,1491946008,978-1491946008

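I want to read the file and write a new file with the same contents, except that the rows should be sorted alphabetically by the second column (Author). The header (first line) should not change. Any ideas on how to do this? The authors should end up in this order: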

Al
Allen
David
Luciano
Mark

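Edit: Sorry for not mentioning this, but I can't use pandas. Also, all columns must be rearranged together based on the second column, so each row stays intact. I apologize for not explaining this earlier.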

Edit: I wrote the following code, which prints the desired sorted result, but it does not work when I try to write the data to a new file:

import sys, csv, operator
data = csv.reader(open('books.csv'),delimiter=',')
header = next(data)
print (header)
sortedlist = sorted(data, key=operator.itemgetter(1))
with open("books_sort.csv", "wb") as f:
#          fileWriter = csv.writer(f, delimiter=',')
           fileWriter = csv.writer(f)
#           fileWriter.writerows(header)
#           fileWriter.writerows(sortedlist)

           for row in sortedlist:
              print (row)
#             f.writerows(row)

Answer 1

pandas is well suited to this:

import pandas as pd
data = pd.read_csv('file.csv', sep=',')
sorted_data = data.sort_values(by=['Author'])
sorted_data.to_csv('outfile.csv', index=False)

See the documentation for read_csv and sort_values.

Answer 2

Using pandas:

import pandas as pd

df = pd.read_csv('file.csv')
sorted_df = df.sort_values('Author')
sorted_df.to_csv('result.csv', index=False)

Answer 3

A non-pandas solution is to read the file and sort the rows by the text of the second column:

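import csv
# newline='' lets the csv module control line endings
with open('books_and_authors.csv', newline='') as f1:
  header, *data = csv.reader(f1)
# write to a separate file (name chosen here for illustration) so the input is not overwritten
with open('books_and_authors_sorted.csv', 'w', newline='') as f2:
  write = csv.writer(f2)
  # keep the header first, then the rows sorted by the second column (Author)
  write.writerows([header, *sorted(data, key=lambda x: x[1])])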

Output:

Title,Author,Publisher,Year,ISBN-10,ISBN-13
Automate the Boring Stuff with Python,Al Sweigart,No Starch Press,2015,1593275994,978-1593275990
Think Python: How to Think Like a Computer Scientist,Allen B. Downey,O'Reilly Media,2015,1491939362,978-1491939369
"Python Cookbook, Third edition","David Beazley, Brian K Jones",O'Reilly Media,2013,1449340377,978-1449340377
"Fluent Python: Clear, Concise, and Effective Programming",Luciano Ramalho,O'Reilly Media,2015,1491946008,978-1491946008
Dive into Python 3,Mark Pilgrim,Apress,2009,1430224150,978-1430224150
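
The attempt in the question can probably be repaired along the same lines. The sketch below assumes Python 3, where csv.writer produces str, so a file opened with "wb" raises a TypeError on the first write. It also uses writerow for the header, because writerows(header) would treat each column name as a separate row. The function name sort_csv_by_column and its parameters are made up for illustration and are not part of the csv module.

```python
import csv
import operator


def sort_csv_by_column(src_path, dst_path, column=1):
    """Copy a CSV file, keeping the header first and sorting rows by one column.

    Hypothetical helper for illustration; column is a zero-based index.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # the first row is kept out of the sort
        rows = sorted(reader, key=operator.itemgetter(column))

    # Text mode "w", not "wb": in Python 3 csv.writer writes str, not bytes.
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)  # a single row, so writerow rather than writerows
        writer.writerows(rows)   # each row moves as a whole, so columns stay aligned
```

With the file names from the question, the call would be sort_csv_by_column('books.csv', 'books_sort.csv'). The comparison is a plain string comparison on the whole Author field, so it is case-sensitive; passing a key such as lambda row: row[1].casefold() to sorted would be one way to ignore case if that matters.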

How to sort a file alphabetically by only a specific column

3 Answers: