I have a text file that looks like this:
07,12,9201
07,12,9201
06,18,9209
06,18,9209
06,19,9209
06,19,9209
07,11,9201
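I want to first remove all duplicate lines, then sort by column 1 in ascending order, and then sort by column 2 in ascending order while column 1 stays in ascending order. Expected output: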
06,18,9209
06,19,9209
07,11,9201
07,12,9201
So far I have tried:
with open('abc.txt') as f:
    lines = [line.split(' ') for line in f]
Consider another example:
00,0,6098
00,1,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,2,6098
00,20,6102
00,21,6087
00,22,6087
00,23,6087
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
The output for this file should be:
00,0,6098
00,1,6098
00,2,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,20,6102
00,21,6087
00,22,6087
00,23,6087
Answer 0 (score: 0)
You can do the following.
from itertools import groupby, chain
from collections import OrderedDict

input_file = 'input_file.txt'

# Collect each non-empty line as a tuple of its comma-separated fields
with open(input_file) as f:
    lines = [tuple(line.strip().split(',')) for line in f if line.strip()]

# Remove duplicates, then sort numerically by the first column
sorted_lines = sorted(set(lines), key=lambda x: int(x[0]))

# Group by the first column and sort each group numerically by the second column
result = OrderedDict()
for k, g in groupby(sorted_lines, key=lambda x: int(x[0])):
    result[k] = sorted(g, key=lambda x: int(x[1]))

# print(result)  # uncomment to inspect the grouped rows
for v in chain(*result.values()):
    print(','.join(v))
Output 1:
06,18,9209
06,19,9209
07,11,9201
07,12,9201
Output 2:
00,0,6098
00,1,6098
00,2,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,20,6102
00,21,6087
00,22,6087
00,23,6087
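As a possible simplification, the grouping step may not be needed: sorting the de-duplicated rows once with a composite key of (column 1, column 2), both converted to int, should give the same ordering. The sketch below assumes every non-empty line has at least two integer-valued fields; the function name dedupe_and_sort and the in-memory sample list are illustrative only and are not part of the answer above. Converting to int matters for the second example, because a plain string comparison would place "20" before "3".

```python
def dedupe_and_sort(lines):
    """Return unique rows sorted numerically by column 1, then by column 2.

    `lines` is any iterable of comma-separated strings (a list or an open file).
    Assumes the first two fields of every non-empty line are integers.
    """
    # A set of tuples removes duplicate rows; blank lines are skipped
    rows = {tuple(line.strip().split(',')) for line in lines if line.strip()}
    # Composite key: column 1, then column 2, both compared as numbers.
    # The row itself is a final tie-breaker so the order is deterministic.
    ordered = sorted(rows, key=lambda r: (int(r[0]), int(r[1]), r))
    return [','.join(row) for row in ordered]


# Hypothetical in-memory copy of the first example from the question
sample = [
    "07,12,9201", "07,12,9201", "06,18,9209", "06,18,9209",
    "06,19,9209", "06,19,9209", "07,11,9201",
]

for line in dedupe_and_sort(sample):
    print(line)
# 06,18,9209
# 06,19,9209
# 07,11,9201
# 07,12,9201
```

To run it on a file instead, pass the open file object, for example inside with open('input_file.txt') as f, and call dedupe_and_sort(f).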