Sorting based on value

Time: 2017-09-03 21:46:39

Tags: python json python-2.7

I need to sort words by their frequency counts, shown below.

Splitting the words after removing stop words:

words=Counter([item for sublist in m.split('\W+') for item in word_tokenize(sublist)])

Frequency counts:

wordsFreq=['%s: %d' %(x, words[x]) for x in words]

Output:

["limited: 1", "desirable: 1", "advices: 1","new: 8", "net: 5", "increasing: 2",......]

print type(wordsFreq)

Output:

<type 'list'>

1 answer:

Answer 0 (score: 0)

One approach is to convert the data into a dictionary, with the words as keys and the frequencies as values:

import operator

in_lst = ["limited: 1", "desirable: 1", "advices: 1",
             "new: 8", "net: 5", "increasing: 2"]

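# Convert each count to int so the sort is numeric ("10" would sort before "2" as a string)
freq_dict = {word: int(count) for word, count in (i.split(": ") for i in in_lst)}

# Sort the (word, count) pairs by count, ascending
sorted_lst = sorted(freq_dict.items(), key=operator.itemgetter(1))

# Rebuild the original "word: count" strings
out_lst = ["%s: %d" % pair for pair in sorted_lst]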

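The program then sorts the items by the values in the dictionary. sorted_lst is a list of tuples, which is then converted back into a list of strings in the original format, ordered by increasing frequency.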
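
If the most frequent words should come first, or the intermediate dictionary is not needed, the strings can also be sorted directly. This is only a sketch, assuming every entry has the form "word: count"; the helper name count_of is made up for illustration:

```python
in_lst = ["limited: 1", "desirable: 1", "advices: 1",
          "new: 8", "net: 5", "increasing: 2"]

def count_of(entry):
    # Hypothetical helper: read the text after the last ": " as an integer
    return int(entry.rsplit(": ", 1)[1])

# reverse=True puts the highest counts first; sorted() is stable,
# so entries with equal counts keep their input order
desc_lst = sorted(in_lst, key=count_of, reverse=True)
# desc_lst == ["new: 8", "net: 5", "increasing: 2",
#              "limited: 1", "desirable: 1", "advices: 1"]
```

Unlike the dictionary approach, this keeps duplicate words as separate entries instead of letting the last one win.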

Another solution is to use OrderedDict from the collections module.
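
A rough sketch of what the OrderedDict variant might look like, assuming the same "word: count" input as above (the names pairs and ordered are illustrative, not from the original answer):

```python
from collections import OrderedDict

in_lst = ["limited: 1", "desirable: 1", "advices: 1",
          "new: 8", "net: 5", "increasing: 2"]

# Split each "word: count" string into a (word, int) pair
pairs = [(word, int(count)) for word, count in (i.split(": ") for i in in_lst)]

# An OrderedDict remembers insertion order, so inserting the pairs
# already sorted by count keeps them in that order
ordered = OrderedDict(sorted(pairs, key=lambda pair: pair[1]))

# Rebuild the "word: count" strings, now in ascending order of count
out_lst = ["%s: %d" % item for item in ordered.items()]
```

This matters mostly on Python 2.7, where a plain dict does not preserve insertion order; the result is still a mapping, so counts can be looked up by word while iteration follows the sorted order.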