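I have been trying to sort a list of metals by suffix, length, and name (alphabetically), but I only seem to be able to sort by length. I'm not sure where I'm going wrong.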
metals.csv
list of names with date and suffix
name,date,suffix
copper.abc,2017-10-06,abc
gold.xyz,2017-10-06,xyz
19823.efg,2017-10-06,efg
silver.abc,2017-10-06,abc
iron.efg,2017-10-06,efg
unknown9258.xyz,2017-10-06,xyz
nickel.xyz,2017-10-06,xyz
bronze.abc,2017-10-06,abc
platinum.abc,2017-10-06,abc
unknown--23.efg,2017-10-06,efg
filter_sort.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
import enchant
import re
from operator import itemgetter, attrgetter
pattern = re.compile(u"([^0-9-]+\..*),(.*,.*)", flags=re.UNICODE)
original = open('metals.csv', 'r')
with open('output.txt', 'a') as newfile:
    for line in original.readlines():
        m = pattern.match(line)
        if m:
            repl = m.group(1)
            newfile.write(m.group(1)+"\n")
    newfile.close()
d = enchant.Dict("en_US")
output = []
infile = open("output.txt", "r")
with open('filtered.txt', 'a') as filtered:
    for line in infile.readlines():
        word = line.strip('\n').split('.')[0]
        if d.check(word) is True:
            if len(word) <= 8:
                output.append("{0}.{1}".format(word, line.strip('\n').split('.')[1]))
    for name in sorted(output, key=len):
        filtered.write(str(name+"\n"))
    filtered.close()
The result is:
gold.xyz
iron.efg
copper.abc
silver.abc
nickel.xyz
bronze.abc
platinum.abc
What I want:
bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz
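I first take the list, filter out the names that contain digits or dashes, and save the rest to a new file. Next, I try to sort the resulting list and save it to another file. I'm not well versed in Python, so this is obviously very inefficient. Any tips would be appreciated. Thanks in advance!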
Answer 0 (score: 1)
You are asking sorted() to use the length as the key:
for name in sorted(output, key=len):
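Instead, sort your list with a lambda that returns a tuple, like this:
for name in sorted(output, key=lambda k: (k.split('.')[1], len(k), k.split('.')[0])):
This sorts first by suffix (e.g. abc), then by length, and finally alphabetically by prefix (e.g. bronze). Output: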
bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz
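To see the tuple key in isolation, here is a minimal, self-contained sketch. It assumes the key order (suffix, length, prefix), which is the order that reproduces the output listed above. The helper name sort_key and the inline list are only illustrative and are not part of the original script.

```python
# Illustrative data: the names that survive the filtering step in the question.
names = ["gold.xyz", "iron.efg", "copper.abc", "silver.abc",
         "nickel.xyz", "bronze.abc", "platinum.abc"]

def sort_key(name):
    # Hypothetical helper: split "prefix.suffix" once at the last dot.
    prefix, suffix = name.rsplit(".", 1)
    # Tuples compare element by element: suffix first, then length, then prefix.
    return (suffix, len(prefix), prefix)

ordered = sorted(names, key=sort_key)
print("\n".join(ordered))
```

Because all suffixes here have the same length, comparing len(prefix) gives the same order as comparing the length of the whole name.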
Answer 1 (score: 1)
A complete, optimized solution:
import csv, re

def multi_sort(s):
    # Sort key: suffix first, then total length, then the name before the dot.
    parts = s.split('.')
    return (parts[1], len(s), parts[0])

with open('metals.csv', 'r') as inp, open('output.txt', 'w', newline='') as out:
    reader = csv.DictReader(inp, fieldnames=None)  # name,date,suffix - header line
    names = []
    for row in reader:
        # re.match anchors at the start, so any digit or dash before the dot rejects the name.
        if re.match(r'[^0-9-]+\..*', row['name']):
            names.append(row['name'])
    names.sort(key=multi_sort)
    writer = csv.writer(out)
    for n in names:
        writer.writerow((n,))
Contents of output.txt:
bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz
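The same filter-then-sort idea can also be tried without touching any files. The sketch below is only an illustration under a few assumptions: the function name filter_and_sort, the CLEAN_NAME pattern, and the max_len parameter are hypothetical, the sample text is a shortened copy of metals.csv, and the dictionary check from the question (enchant) is left out so that the example has no third-party dependency.

```python
import csv
import io
import re

# Assumed pattern: one dot, and no digits or dashes before it.
CLEAN_NAME = re.compile(r"[^0-9.-]+\.[^.]+")

def filter_and_sort(csv_text, max_len=8):
    """Return the clean names ordered by suffix, then length, then prefix."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first row is the header
    kept = []
    for row in reader:
        name = row["name"]
        prefix = name.split(".")[0]
        # Keep names that match the pattern and whose prefix is short enough.
        if CLEAN_NAME.fullmatch(name) and len(prefix) <= max_len:
            kept.append(name)
    return sorted(kept, key=lambda n: (n.split(".")[1], len(n), n.split(".")[0]))

sample = """name,date,suffix
copper.abc,2017-10-06,abc
gold.xyz,2017-10-06,xyz
19823.efg,2017-10-06,efg
unknown--23.efg,2017-10-06,efg
platinum.abc,2017-10-06,abc
bronze.abc,2017-10-06,abc
"""
print(filter_and_sort(sample))
```

Keeping the logic in a function that takes text and returns a list makes it easy to check the ordering before wiring it up to real input and output files.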