Python: sort a text file by a column value, from highest to lowest

Date: 2012-11-27 03:58:57

Tags: python sorting

I have a very large text file that contains data like this:

('#DownWithAssad', '1')
('#DownYoTLParty', '1')
('#Download', '8')
('#Download:', '2')
('#Downloads', '2')
('#DownstairsMixtape', '1')
('#DowntonAbbey', '12')
('#DowntonAbbey?', '1')
('#DowntonPBS', '23')
('#Downtonabbey', '1')
('#DowntownAbbey', '1')

This may look like a simple problem, but I want to sort the data by count from highest to lowest, so that it looks like this:

('#DowntonPBS', '23')
('#DowntonAbbey', '12')
('#Download', '8')
('#Download:', '2')
('#Downloads', '2')
('#DownstairsMixtape', '1')
('#DownWithAssad', '1')
('#DownYoTLParty', '1')
('#DowntonAbbey?', '1')
('#Downtonabbey', '1')
('#DowntownAbbey', '1')

I figured I could strip the parentheses () and split the data like this:

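import sys

# Read the file named on the command line and print each field on its own line.
with open(sys.argv[1]) as f:
    for line in f:
        line = line.strip()[1:-1]  # drop the newline first, then the surrounding ( )
        for sect in line.split(','):
            print(sect)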

But I'm not sure where to go from here.

2 answers:

Answer 0 (score: 4)

You can easily parse the text file with ast.literal_eval:

import ast

# datafile holds the path to the input file
with open(datafile) as f:
    file_sorted = sorted((ast.literal_eval(x) for x in f),
                         key=lambda z:(int(z[1]),z[0]),
                         reverse=True)

How it works:

(ast.literal_eval(x) for x in f)  #turn each line in your file into a tuple
key=lambda z:(int(z[1]),z[0])     #function to determine how things are sorted.  Basically
                                  #sort as tuples:  `( int(z[1]),z[0] )`
reverse=True                      #descending order instead of ascending
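
As a self-contained illustration of the same idea, the sketch below runs the sort on a few of the sample lines held in memory instead of a file. The names sample_lines and sort_records are made up for this example. It also shows one possible variation that is not part of the answer above: negating the count instead of passing reverse=True, so that entries with equal counts come out in ascending name order.

```python
import ast

# Hypothetical in-memory stand-in for the lines of the input file.
sample_lines = [
    "('#Download', '8')\n",
    "('#Download:', '2')\n",
    "('#Downloads', '2')\n",
    "('#DowntonAbbey', '12')\n",
    "('#DowntonPBS', '23')\n",
]


def sort_records(lines, ties_ascending=False):
    """Parse each line into a (tag, count) tuple and sort by count, highest first."""
    records = [ast.literal_eval(line) for line in lines]
    if ties_ascending:
        # Negate the count so the highest comes first while names stay ascending.
        return sorted(records, key=lambda r: (-int(r[1]), r[0]))
    # Same key as the answer above: reverse=True also reverses the name order on ties.
    return sorted(records, key=lambda r: (int(r[1]), r[0]), reverse=True)


for record in sort_records(sample_lines):
    print(record)
```

With reverse=True, '#Downloads' is listed before '#Download:' because the whole key tuple is reversed; calling sort_records(sample_lines, ties_ascending=True) swaps those two while keeping the counts in descending order.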

Answer 1 (score: 1)

This follows the approach you were attempting. Note that parsing the lines this way is quite fragile (a badly formatted line can break it).

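from operator import itemgetter
import sys

result = []
with open(sys.argv[1]) as f:
    for line in f:
        line = line.strip()[1:-1]          # drop the surrounding ( )
        sect1, sect2 = line.split(', ')    # fails if the tag itself contains ', '
        sect1 = sect1[1:-1]                # drop the quotes around the tag
        sect2 = int(sect2[1:-1])           # drop the quotes and convert the count
        result.append((sect1, sect2))

# Highest count first; the sort is stable, so ties keep their file order.
for line in sorted(result, key=itemgetter(1), reverse=True):
    print(line)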

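A better way to parse the lines would be literal_eval or a regular expression. Do you know whether any special handling is applied when a quote character or a comma appears inside the strings?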
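
On the question of quotes and commas inside the strings, here is a small sketch comparing the two approaches. It assumes the lines were written as Python tuple literals (for example by printing tuples), which the question does not confirm. The helper parse_line and the sample lines are hypothetical. Under that assumption, ast.literal_eval follows Python's own quoting rules, so an embedded comma or quote needs no special handling, while splitting on ', ' produces the wrong number of fields.

```python
import ast


def parse_line(line):
    """Return (tag, count) for one line, or None if the line is malformed."""
    try:
        tag, count = ast.literal_eval(line.strip())
        return tag, int(count)
    except (ValueError, SyntaxError, TypeError):
        # Blank lines, broken literals, and values of the wrong shape end up here.
        return None


# Hypothetical lines: a comma inside the tag, a quote inside the tag,
# a line that is not a tuple literal, and an empty line.
lines = [
    "('#Down, town', '3')",
    "(\"#Don'tDownload\", '5')",
    "not a tuple at all",
    "",
]

parsed = [parse_line(line) for line in lines]
good = [record for record in parsed if record is not None]

# Highest count first.
for record in sorted(good, key=lambda r: r[1], reverse=True):
    print(record)

# The manual approach splits the first line into three pieces instead of two.
naive_parts = lines[0][1:-1].split(', ')
print(len(naive_parts))
```

Returning None for malformed lines is only one possible design choice; for a very large file it may be preferable to log or count the skipped lines so that bad input does not go unnoticed.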