Sorting a Python list of strings by an embedded number

Date: 2016-02-02 00:19:08

Tags: python string list sorting numerical

I have a list of filenames called filelist:

 In []: filelist
Out []: ['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
         'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx',
         'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx']

I want to sort this file list by the number that follows RAID6- in each filename (1, 13, and 2 below):

C:\Mon20412\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx
C:\Mon25312\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx
C:\Mon20362\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx

So in this example, the output would be

Out []: ['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
         'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx',
         'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx']

Thanks!

3 answers:

Answer 0 (score: 2)

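import re

# Sort key: the integer formed by the digits that follow "RAID6-".
f = lambda s: int(re.findall(r'.*RAID6-(\d+).*', s)[0])
sorted(filelist, key=f)  # filelist is the list from the question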
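
For readers who want to run this approach end to end, here is a minimal sketch using the data from the question. The names RAID_NUMBER and raid_number are made up for illustration, and the sketch swaps re.findall for re.search with an explicit check, on the assumption that a filename without a RAID6- number should be reported as an error. With findall(...)[0], such a name would raise an IndexError instead.

```python
import re

# Sample data copied from the question.
filelist = [
    'C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
    'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx',
    'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx',
]

# Hypothetical name; compiling once avoids re-parsing the pattern per call.
RAID_NUMBER = re.compile(r'RAID6-(\d+)')


def raid_number(path):
    """Return the integer that follows 'RAID6-' in path (hypothetical helper)."""
    match = RAID_NUMBER.search(path)
    if match is None:
        # Assumption: a name without the number is an error, not a default.
        raise ValueError('no RAID6-<number> in %r' % path)
    return int(match.group(1))


# sorted() calls raid_number once per element and compares the integers,
# so 13 sorts after 2 (a plain string sort would put '13RED' before '2GREEN').
result = sorted(filelist, key=raid_number)
for path in result:
    print(path)
```

Comparing integers rather than strings is the whole point of the key function here.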

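Answer 1 (score: 1)

Find a good, reliable way to extract the number you want, then sort by that number using the key parameter. The following seems reliable enough for your input, but it is not very efficient.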

a = ['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
    'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx',
    'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx']

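def k(path):
    # Take the filename, then its fourth hyphen-separated field, e.g. "13RED".
    field = path.split("\\")[-1].split("-")[3]
    # Keep only the digit characters and convert them to an int.
    return int("".join(ch for ch in field if ch in "0123456789"))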


print(sorted(a, key=k))

Output:

['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx', 
'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx',
'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx']

Answer 2 (score: 1)

Use a regular expression to parse out the number and use it as the sort key.

Quick and dirty:

import re

l = ['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
     'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx',
     'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx']

def get_sort_number(s):
    pattern = r'C:\\Mon\d+\\P-2NODE-RAID6-(\d+)'

    try:
        return int(re.match(pattern, s).group(1))
    except AttributeError:
        return 0

sorted(l, key=get_sort_number)

This gives

['C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
 'C:\\Mon20362\\P-2NODE-RAID6-2GREEN-32k-100-segmented.xlsx',
 'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx']

Any string the regular expression fails to match gets a sort key of 0, so it ends up at the beginning of the sorted list.
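
If unmatched names should go to the end instead, one possible variation is to return a tuple as the key. This is only a sketch under assumptions: the function name sort_key_unmatched_last, the looser pattern, and the notes.txt entry are all invented for illustration and do not come from the answer.

```python
import re

# Assumed pattern: looser than the one in the answer, it only looks for
# the digits after "RAID6-" anywhere in the string.
PATTERN = re.compile(r'RAID6-(\d+)')


def sort_key_unmatched_last(s):
    """Hypothetical key: (0, number) for matches, (1, 0) otherwise."""
    match = PATTERN.search(s)
    if match is None:
        # The leading 1 sorts after every matched key, whose first item is 0.
        return (1, 0)
    return (0, int(match.group(1)))


names = [
    'C:\\Mon25312\\P-2NODE-RAID6-13RED-32k-100-segmented.xlsx',
    'C:\\Mon00000\\notes.txt',  # made-up entry that does not match
    'C:\\Mon20412\\P-2NODE-RAID6-1BLACK-32k-100-segmented.xlsx',
]

ordered = sorted(names, key=sort_key_unmatched_last)
for name in ordered:
    print(name)
```

Tuples compare element by element, so the first item decides whether a name belongs to the matched or unmatched group and the second item orders the matched names numerically.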