How do I sort a text file by three columns in a specific order?
My text_file has the following format, with spaces between the columns:
Team_Name Team_Mascot Team_Color Team_Hometown Number_of_Wins Team_Coach
Example:
Bears LittleBear Blue Beartown 15 BigBear
I want to sort by Team_Color first, Team_Hometown second, and Number_of_Wins third, all in ascending order.
So given these entries:
Bears Blue Beartown 15 Coach1
Bears Red Dogtown 30 Coach6
Bears Blue Cattown 15 Coach2
Bears Red Beartown 15 Coach4
Bears Blue Cattown 17 Coach3
Bears Red Dogtown 9 Coach5
My expected output is a sorted text file:
Bears Blue Beartown 15 Coach1
Bears Blue Cattown 15 Coach2
Bears Blue Cattown 17 Coach3
Bears Red Beartown 15 Coach4
Bears Red Dogtown 9 Coach5
Bears Red Dogtown 30 Coach6
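I considered using lambda, but none of my values are tuples https://docs.python.org/3/tutorial/controlflow.html
I have looked at previous StackOverflow questions, but most of them handle at most two columns, using lambda, tuple, or other methods.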
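Answer 0 (score: 1)
Your question is still ambiguous. Your example does not include the first field from the header, Team_Name, so the indices here may be off by one, but I think you get the idea: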
# read the lines of the text file and split each one into words
with open("test.txt", "r") as infile:
    lines = [line.split() for line in infile]
# sort by several columns; the number is converted to int to prevent lexicographical sorting
lines.sort(key=lambda x: (x[1], x[2], int(x[3])))
# write the sorted list into another file
with open("new_test.txt", "w") as f:
    for item in lines:
        f.write(" ".join(item) + "\n")
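Because the column positions are uncertain, one option is to pass them in as parameters. This is a minimal sketch, assuming the six-column layout from the question's header, which puts color, hometown, and wins at indices 2, 3, and 4. The function `sort_rows` and its parameter names are hypothetical, and the sample rows are assembled from values in the question purely for illustration.

```python
def sort_rows(rows, color_idx=2, town_idx=3, wins_idx=4):
    """Sort split rows by color, then hometown, then wins (compared as integers)."""
    return sorted(
        rows,
        key=lambda r: (r[color_idx], r[town_idx], int(r[wins_idx])),
    )


# Illustrative data in the assumed six-column layout:
# Team_Name Team_Mascot Team_Color Team_Hometown Number_of_Wins Team_Coach
sample = [
    "Bears LittleBear Red Dogtown 30 Coach6",
    "Bears LittleBear Blue Cattown 15 Coach2",
    "Bears LittleBear Red Dogtown 9 Coach5",
    "Bears LittleBear Blue Beartown 15 Coach1",
]

rows = [line.split() for line in sample]
for row in sort_rows(rows):
    print(" ".join(row))
# Bears LittleBear Blue Beartown 15 Coach1
# Bears LittleBear Blue Cattown 15 Coach2
# Bears LittleBear Red Dogtown 9 Coach5
# Bears LittleBear Red Dogtown 30 Coach6
```

If the file really has only five columns, as in the question's sample data, calling `sort_rows(rows, 1, 2, 3)` should give the same ordering as the code above.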
Answer 1 (score: 1)
You can do this with itemgetter from the operator module:
from operator import itemgetter

def showList(inList):
    # print one row per line
    for i in inList:
        print(i)

lines = []
with open("test.txt", "r") as infile:
    lines = [i.split() for i in infile.readlines()]
    # convert purely numeric fields to int so they sort numerically
    lines = [[int(j) if j.isdigit() else j for j in i] for i in lines]

showList(lines)
# sort by columns 1, 2 and 3, in that order of priority
lines = sorted(lines, key=itemgetter(1,2,3))
print()
showList(lines)

with open("output.txt", "w") as outfile:
    for line in lines:
        outfile.write(" ".join(str(i) for i in line) + "\n")
Output (using showList):
['Bears', 'Blue', 'Beartown', 15, 'Coach1']
['Bears', 'Red', 'Dogtown', 30, 'Coach6']
['Bears', 'Blue', 'Cattown', 15, 'Coach2']
['Bears', 'Red', 'Beartown', 15, 'Coach4']
['Bears', 'Blue', 'Cattown', 17, 'Coach3']
['Bears', 'Red', 'Dogtown', 9, 'Coach5']

['Bears', 'Blue', 'Beartown', 15, 'Coach1']
['Bears', 'Blue', 'Cattown', 15, 'Coach2']
['Bears', 'Blue', 'Cattown', 17, 'Coach3']
['Bears', 'Red', 'Beartown', 15, 'Coach4']
['Bears', 'Red', 'Dogtown', 9, 'Coach5']
['Bears', 'Red', 'Dogtown', 30, 'Coach6']
Output format in the new file:
Bears Blue Beartown 15 Coach1
Bears Blue Cattown 15 Coach2
Bears Blue Cattown 17 Coach3
Bears Red Beartown 15 Coach4
Bears Red Dogtown 9 Coach5
Bears Red Dogtown 30 Coach6
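To see why the integer conversion matters with itemgetter, here is a small sketch comparing the two Dogtown rows from the question with and without it. The variable names are only illustrative. Note that `str.isdigit()` does not match negative numbers or decimals, so this conversion assumes the win counts are non-negative whole numbers.

```python
from operator import itemgetter

rows = [
    ["Bears", "Red", "Dogtown", "30", "Coach6"],
    ["Bears", "Red", "Dogtown", "9", "Coach5"],
]

# Without conversion the wins column is compared as text, so "30" sorts before "9".
as_text = sorted(rows, key=itemgetter(1, 2, 3))
print([r[3] for r in as_text])  # ['30', '9']

# After converting digit-only fields to int, the comparison is numeric.
converted = [[int(v) if v.isdigit() else v for v in r] for r in rows]
as_numbers = sorted(converted, key=itemgetter(1, 2, 3))
print([r[3] for r in as_numbers])  # [9, 30]
```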