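I need to sort the contents of an xls file by four columns, in ascending order.
I have converted the xls file contents into a list of lists. The input is shown below.
Input: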
data = """ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015"""
The corresponding expected output, in string format:
data = """ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
"""
First I split the data into a list of lists:
# Split data to list.
>>> data_list = [i.split(", ") for i in data.split("\n")]
>>> print "\n".join([", ".join(i) for i in data_list])
ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015
The sorting requirements are:
- Sort by index 0.
- If index 0 is the same for multiple items, sort by index 2.
- If index 0 and index 2 are the same for multiple items, sort by index 3.
- If index 0, index 2 and index 3 are the same for multiple items, sort by index 5.
My logic is as follows.
Code:
>>> from collections import defaultdict
>>> data_dict = defaultdict(list)
>>> for i in data_list:
...     key = "%s%s%s%s"%(i[0].strip(), i[2].strip(), i[3].strip(), i[5].strip())
...     data_dict[key].append(i)
...
>>> sorted_keys = sorted(data_dict.keys())
>>>
>>> for i in sorted_keys:
...     for j in data_dict[i]:
...         print j
...
['ABC', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
['ABC', 'Do not Consider1', '100', 'Title and Subtitle', 'Do not Consider2', '30/12/2015']
['ABC', 'Do not Consider1', '100', 'Title and Subtitle', 'Do not Consider2', '31/12/2015']
['ABC', 'Do not Consider1', '101', 'Title and Subtitle', 'Do not Consider2', '30/12/2015']
['ABC', 'Do not Consider1', '98', 'Title and Subtitle', 'Do not Consider2', '25/12/2015 ']
['ABC', 'Do not Consider1', '99', 'BIC Codes', 'Do not Consider2', '31/12/2015']
['XYZ', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
['XYZ', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
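However, index 2 holds numbers and index 5 holds dates, and my key compares them as plain strings, so the data does not come out in the required order.
Can you help me solve this problem?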
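To make the failure concrete, here is a small sketch of how string comparison differs from numeric and chronological comparison. It is written for Python 3, the variable names are only illustrative, and the date "01/01/2016" is a made-up value added to show the effect (it is not in the input above). The date format %d/%m/%Y is assumed from the sample rows.

```python
from datetime import datetime

# Strings compare character by character, so "1" < "9" decides the result.
numbers_as_text = ["101", "100", "99", "98"]
text_order = sorted(numbers_as_text)
numeric_order = sorted(numbers_as_text, key=int)

# dd/mm/yyyy strings compare by day first, which is not chronological.
# "01/01/2016" is a hypothetical value used only for this illustration.
dates_as_text = ["30/12/2015", "01/01/2016", "25/12/2015"]
text_date_order = sorted(dates_as_text)
chronological_order = sorted(
    dates_as_text, key=lambda d: datetime.strptime(d, "%d/%m/%Y"))

print(text_order)           # text ordering of the numbers
print(numeric_order)        # numeric ordering of the same values
print(text_date_order)      # text ordering of the dates
print(chronological_order)  # ordering by parsed date
```

This is why '100' and '101' appear before '98' and '99' in the output above.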
Answer 0 (score: 1)
You can use the sorted function to sort by multiple keys, like this:
# Index 5 is included so that ties on the first three keys are broken by date.
sorted_list = sorted(data_list, key=lambda item: (item[0], int(item[2]), item[3], item[5]))
print "\n".join([", ".join(i) for i in sorted_list])
This returns:
ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
The trick is to have your key lambda return a tuple containing all the values to sort by, and to use the int() function to convert the values in the third column to integers. Index 5 is still compared as a string here, which matches chronological order only while every date shares the same month and year, as in this sample.
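As a small illustration of why a tuple key works: Python compares tuples element by element and only looks at a later element when the earlier ones tie. The rows below are a trimmed, hypothetical subset of the columns (index 0, index 2 and index 3 only), and sort_key is a name chosen for this sketch, which is written for Python 3.

```python
# Tuples compare element by element; later elements only break ties.
rows = [
    ("ABC", "100", "Title and Subtitle"),
    ("XYZ", "100", "ATitle and Subtitle"),
    ("ABC", "98", "Title and Subtitle"),
    ("ABC", "100", "ATitle and Subtitle"),
]


def sort_key(row):
    # Convert the numeric column so that 98 sorts before 100.
    return (row[0], int(row[1]), row[2])


ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(row)
```

A named function is used here instead of a lambda only to leave room for the comment; the behaviour is the same.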
Answer 1 (score: 1)
You should be able to do what you need with a single call to sorted(). The csv module can be used to parse the data:
import csv
import StringIO
data = """ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015"""
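# skipinitialspace=True drops the space that follows each comma.
csv_input = csv.reader(StringIO.StringIO(data), skipinitialspace=True)
# Sort on index 0, index 2 (as an integer), index 3, then index 5.
rows = sorted(csv_input, key=lambda x: (x[0], int(x[2]), x[3], x[5]))
for row in rows:
    print row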
This gives you the following:
['ABC', 'Do not Consider1', '98', 'Title and Subtitle', 'Do not Consider2', '25/12/2015 ']
['ABC', 'Do not Consider1', '99', 'BIC Codes', 'Do not Consider2', '31/12/2015']
['ABC', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
['ABC', 'Do not Consider1', '100', 'Title and Subtitle', 'Do not Consider2', '30/12/2015']
['ABC', 'Do not Consider1', '100', 'Title and Subtitle', 'Do not Consider2', '31/12/2015']
['ABC', 'Do not Consider1', '101', 'Title and Subtitle', 'Do not Consider2', '30/12/2015']
['XYZ', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
['XYZ', 'Do not Consider1', '100', 'ATitle and Subtitle', 'Do not Consider2', '30/12/2015']
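The code above targets Python 2 and compares index 5 as text. A minimal Python 3 sketch of the same approach follows, with the date parsed before comparison. It assumes the dates are always in %d/%m/%Y format, as in the sample, and sort_key is a name introduced for this sketch. The strip() call guards against trailing whitespace, which skipinitialspace does not remove.

```python
import csv
import io
from datetime import datetime

data = """ABC, Do not Consider1, 101, Title and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 99, BIC Codes, Do not Consider2, 31/12/2015
ABC, Do not Consider1, 98, Title and Subtitle, Do not Consider2, 25/12/2015
ABC, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
XYZ, Do not Consider1, 100, ATitle and Subtitle, Do not Consider2, 30/12/2015
ABC, Do not Consider1, 100, Title and Subtitle, Do not Consider2, 30/12/2015"""


def sort_key(row):
    # Index 0 as text, index 2 as an integer, index 3 as text,
    # index 5 as a parsed date (format assumed from the sample data).
    return (row[0], int(row[2]), row[3],
            datetime.strptime(row[5].strip(), "%d/%m/%Y"))


# skipinitialspace=True drops the space that follows each comma.
reader = csv.reader(io.StringIO(data), skipinitialspace=True)
rows = sorted(reader, key=sort_key)
for row in rows:
    print(", ".join(row))
```

Parsing the date matters once the data spans more than one month or year; for the sample rows it produces the same order as the string comparison.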