I am currently trying to sort a list of the following form:
[["Chr1", "949699", "949700"],["Chr11", "3219", "444949"],
["Chr10", "699", "800"],["Chr2", "232342", "235345234"],
["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]]
Using the built-in sorted(), I get:
[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['Chr2', '232342', '235345234'], ['ChrX', '4567', '45634']]
But I want "Chr2" to come before "Chr10". My current solution, adapted from the question "Does Python have a built in function for string natural sort?", is as follows:
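import re

def naturalSort(l):
    # Digit runs become ints, everything else is lower-cased text.
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    # Split each string into alternating text and digit chunks.
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    if isinstance(l[0], list):
        # List of lists: build one natural key per field.
        return sorted(l, key=lambda k: [alphanum_key(x) for x in k])
    else:
        return sorted(l, key=alphanum_key)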
This produces the correct order:
[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr2', '232342', '235345234'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['ChrX', '4567', '45634']]
Is there a better way?
Answer 0 (score: 0)
Something like this?
In [1]: l = [["Chr1", "949699", "949700"],["Chr11", "3219", "444949"],["Chr10", "699", "800"],["Chr2", "232342", "235345234"],["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]]
In [2]: sorted(l, key=lambda x: int(x[0].replace('Chr', '')) if x[0].replace('Chr', '').isdigit() else x[0])
Out[2]:
[['Chr1', '949699', '949700'],
['Chr1', '950000', '960000'],
['Chr2', '232342', '235345234'],
['Chr10', '699', '800'],
['Chr11', '3219', '444949'],
['ChrX', '4567', '45634']]
Or a more elegant variant (this one needs import re):
sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if re.findall(r'\d+$', x[0]) else x[0])
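One caveat with both keys above: they return an int for numbered chromosomes and a str for names like "ChrX". Python 2 tolerates that mix, but Python 3 raises TypeError when an int is compared with a str. Below is a minimal sketch of one way around this, assuming every name starts with the prefix "Chr" and that non-numeric names should sort after the numbered ones. The helper name chrom_key is hypothetical and not part of the answer above.

```python
def chrom_key(row):
    """Sort key for rows like ["Chr10", "699", "800"] (assumed 'Chr' prefix)."""
    name = row[0].replace('Chr', '', 1)
    if name.isdigit():
        # Numbered chromosomes: group 0, ordered numerically.
        return (0, int(name), '')
    # Named chromosomes (e.g. X): group 1, ordered alphabetically.
    return (1, 0, name)

l = [["Chr1", "949699", "949700"], ["Chr11", "3219", "444949"],
     ["Chr10", "699", "800"], ["Chr2", "232342", "235345234"],
     ["ChrX", "4567", "45634"], ["Chr1", "950000", "960000"]]

result = sorted(l, key=chrom_key)
print(result)
```

Because every key is a tuple with the same shape, ints are only ever compared with ints and strings with strings. Rows that share a chromosome keep their input order, since sorted() is stable.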
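Answer 1 (score: 0)
Here is a more compact solution, where data is the list from the question:
import re
natkey = lambda e: [x or int(y) for x, y in re.findall(r'(\D+)|(\d+)', e)]
# A list comprehension (not map) keeps the key comparable under Python 3.
print(sorted(data, key=lambda item: [natkey(s) for s in item]))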
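To see why this key works, it may help to look at what it produces for a few strings. The sketch below rewrites natkey as a named function with the same regular expression; the sample strings are taken from the question, and the variable names are only illustrative.

```python
import re

def natkey(text):
    # findall yields (non_digits, digits) pairs; exactly one side is non-empty.
    return [chunk or int(digits)
            for chunk, digits in re.findall(r'(\D+)|(\d+)', text)]

samples = ["Chr2", "Chr10", "ChrX", "949699"]
keys = {s: natkey(s) for s in samples}
print(keys)

# Plain string comparison puts "Chr10" first; the natural key does not.
plain_order = sorted(["Chr2", "Chr10"])
natural_order = sorted(["Chr2", "Chr10"], key=natkey)
print(plain_order, natural_order)
```

Note that under Python 3 this key can still raise TypeError if two strings put text and digits at the same position (for example "1abc" versus "abc1"). The data in the question does not trigger this.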