How do I sort an output text file by the nth column - Python

Time: 2017-10-23 02:17:26

Tags: python

My code:

infile = open("ALE.txt", "r")
outfile = open("ALE_sorted.txt", "w")

for line in infile:
    data = line.strip().split(',')
    wins = int(data[2])
    percentage = 162 / wins
    p = (str(data[0]) + ", " + data[1] + ", " + data[2] + ", " +
         str(round(percentage, 3)) + "\n")
    outfile.write(p)
infile.close()
outfile.close()

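The original infile ("ALE.txt") contains only the first three columns shown below. The text file produced by the code above looks like this:

Baltimore,93,69,2.348
Boston,69,93,1.742
New York,95,67,2.418
Tampa Bay,90,72,2.25
Toronto,73,89,1.82

I know the code calculates the win percentage correctly (column 2 / total wins), but I would like to sort this list by the 4th column (the win percentage).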

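4 answers:

Answer 0: (score: 1)

Append your data to a list, e.g. d.

Sort it by the item at index 3 of each sublist (the 4th column). Reference: operator.itemgetter

Write the sorted data to the output file.

Contents of the input file: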

[kiran@localhost ~]$ cat infile.txt
Baltimore, 93, 69
Boston, 69, 93
New York, 95, 67
Tampa Bay, 90, 72
Toronto, 73, 89

Code:

>>> from operator import itemgetter
>>> d=[]
>>> with open('infile.txt','r') as infile:
...     for line in infile.readlines():
...             data = line.strip().split(',')
...             wins = int(data[2])
...             percentage = 162 / float(wins)
...             data.append(str(round(percentage, 3))) #add percentage to your list that already contains the name and two scores.
...             d.append(data) # add the line to a list `d`
...
>>> print(d)
[['Baltimore', ' 93', ' 69', '2.348'], ['Boston', ' 69', ' 93', '1.742'], ['New York', ' 95', ' 67', '2.418'], ['Tampa Bay', ' 90', ' 72', '2.25'], ['Toronto', ' 73', ' 89', '1.82']]
>>> d.sort(key=itemgetter(3)) # sort `d` by index 3 (the 4th column) of each sublist; the values are compared as strings
>>> print(d)
[['Boston', ' 69', ' 93', '1.742'], ['Toronto', ' 73', ' 89', '1.82'], ['Tampa Bay', ' 90', ' 72', '2.25'], ['Baltimore', ' 93', ' 69', '2.348'], ['New York', ' 95', ' 67', '2.418']]
>>> #write the items in list d to your output file
>>>
>>> with open('outfile.txt','w') as outfile:
...     for line in d:
...             outfile.write(','.join(line)+'\n')
...
>>>

Contents of the output file:

[kiran@localhost ~]$ cat outfile.txt
Boston, 69, 93,1.742
Toronto, 73, 89,1.82
Tampa Bay, 90, 72,2.25
Baltimore, 93, 69,2.348
New York, 95, 67,2.418
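
A caveat about the sort above: the fourth column is stored as strings, so `itemgetter(3)` compares text rather than numbers. That happens to give the right order for this data, but it would not once a value has a different number of digits before the decimal point. A minimal sketch of the difference, using made-up rows (the team names `A`, `B`, `C` and their scores are hypothetical):

```python
from operator import itemgetter

# Hypothetical rows in the same shape as `d`: name, two scores, ratio as a string.
rows = [
    ["A", "93", "69", "2.348"],
    ["B", "147", "15", "10.8"],
    ["C", "69", "93", "1.742"],
]

# Text comparison: "10.8" sorts before "2.348" because "1" < "2".
by_text = sorted(rows, key=itemgetter(3))

# Numeric comparison: convert the 4th column to float inside the key function.
by_number = sorted(rows, key=lambda row: float(row[3]))

print([row[0] for row in by_text])    # ['C', 'B', 'A']
print([row[0] for row in by_number])  # ['C', 'A', 'B']
```

Converting inside the key function leaves the stored strings untouched, so the rows can still be written out with `','.join(line)`.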

Answer 1: (score: 0)

Try this:

infile  = open("ALE.txt", "r")
outfile = open("ALE_sorted.txt", "w")

master_data = []

# Load in data from the infile and calculate the win percentage.
for line in infile:

    data = line.strip().split(', ')

    wins = int(data[2])
    percentage = 162 / wins
    data.append(str(round(percentage, 3)))

    master_data.append(data)

# Sort by the last column in reverse order by value and store the 
# sorted values and original indices in a list of tuples.
sorted_column = sorted([(float(data[-1]), index) for index, data in \
                        enumerate(master_data)], reverse = True)

# Reassign master_data according to the sorted positions.
master_data   = [master_data[data[1]] for data in sorted_column]

# Write each line to the outfile.
for data in master_data:

    outfile.write(str(", ".join(data) + "\n"))

infile.close()
outfile.close()

The contents of infile are as follows:

Baltimore, 93, 69
Boston, 69, 93
New York, 95, 67
Tampa Bay, 90, 72
Toronto, 73, 89

The resulting outfile contains the following, sorted from highest to lowest by the newly generated fourth column:

New York, 95, 67, 2.418
Baltimore, 93, 69, 2.348
Tampa Bay, 90, 72, 2.25
Toronto, 73, 89, 1.82
Boston, 69, 93, 1.742
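
The index bookkeeping in this answer can also be expressed with a key function passed to `sorted`. A minimal sketch, assuming the rows have already been split into lists of strings; `sort_rows_by_column` is a hypothetical helper name and not part of the answer:

```python
def sort_rows_by_column(rows, n, reverse=False):
    """Return a new list of rows sorted by the numeric value in column n (0-based)."""
    return sorted(rows, key=lambda row: float(row[n]), reverse=reverse)


master_data = [
    ["Baltimore", "93", "69", "2.348"],
    ["Boston", "69", "93", "1.742"],
    ["New York", "95", "67", "2.418"],
    ["Tampa Bay", "90", "72", "2.25"],
    ["Toronto", "73", "89", "1.82"],
]

# Highest value in the fourth column (index 3) first.
sorted_rows = sort_rows_by_column(master_data, 3, reverse=True)
for row in sorted_rows:
    print(", ".join(row))
```

Because the column index is a parameter, the same helper covers the "nth column" case from the question title, as long as that column holds numbers.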

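Answer 2: (score: 0)

First, when working on this problem, it is better to split each line on ',' and then strip the whitespace from every resulting field, e.g. [field.strip() for field in line.split(',')].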

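import csv

# Assumption: total_wins is undefined in the original answer; 162 (the games
# played per team, as in the question) is used here as the denominator.
total_wins = 162

with open('ALE.txt', 'r', newline='') as infile:
    reader = csv.reader(infile)
    data = []
    for line in reader:
        formatted_line = [i.strip() for i in line]  # drop spaces around fields
        wins = int(formatted_line[2])
        percentage = 100 * wins / total_wins
        formatted_line.append(str(round(percentage, 3)))
        data.append(formatted_line)

# Sort by the 4th column (index 3), comparing the values as numbers.
data = sorted(data, key=lambda x: float(x[3]))

with open('ALE_sorted.txt', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerows(data)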
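
As an alternative to stripping each field by hand, the `csv` module can discard the space after each delimiter itself via `skipinitialspace=True`. A minimal sketch, using an in-memory `io.StringIO` as a stand-in for ALE.txt (the three sample rows are taken from the question; sorting by the third column is only for illustration):

```python
import csv
import io

# In-memory stand-in for ALE.txt so the sketch runs without a file on disk.
sample = io.StringIO("Baltimore, 93, 69\nBoston, 69, 93\nNew York, 95, 67\n")

# skipinitialspace=True drops the space that follows each comma.
rows = list(csv.reader(sample, skipinitialspace=True))

# Sort by the third column (index 2), compared as integers.
rows.sort(key=lambda row: int(row[2]))

# Write the sorted rows back out in CSV form.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
result = out.getvalue()
print(result, end="")
```

Note that `skipinitialspace` only removes whitespace directly after a delimiter; trailing spaces before a comma would still need `strip()`.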

Answer 3: (score: 0)

The best way to sort by the 4th column is to open the file with pandas. Here is how to do it:

import pandas as pd

# The file has no header row, so header=None makes pandas label the columns 0..3.
outfile = pd.read_csv("ALE_sorted.txt", header=None)
column = outfile.columns.values.tolist()  # gives you the column labels

# It returns [0, 1, 2, 3], where 3 is the label of your fourth column.

# sort_values returns a new DataFrame, so assign the result back.
outfile = outfile.sort_values(by=3)

print(outfile[3])  # to see the sorted column

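This produces the fourth column in ascending order (the index and dtype lines that pandas prints are omitted here):
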
1.742
1.82
2.25
2.348
2.418