Sorting by substring in a list - Python

Time: 2017-08-21 09:32:51

Tags: python regex sorting

I have a text file containing records laid out like this:

Action: Add Parameter
Matched Parameter: ctl00_ContentPlaceHolderMain_RadSearchBoxNeId_ClientState 
on [HTTPS] /ConsolePage/ConsolePageWeb.aspx Matched Wildcard: * 

Action: Add Parameter
Matched Parameter: ctl00$ContentPlaceHolderMain$HiddenFieldSelectedFilter on 
[HTTPS] /ConsolePage/ConsolePageWeb.aspx Matched Wildcard: * 

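I wrote a small Python script that extracts only the string following "Matched Parameter:" and outputs it to a file, but the result is not sorted correctly.

Script: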

import re

pattern = r"^Matched Parameter: ([^\s]+)"
new_file = []

with open(r".\params.txt") as txtFile:
    lines = txtFile.readlines()

for line in lines:
    match = re.search(pattern, line)
    if match:
        new_line = match.group()
        new_line = new_line.split(" ")
        del new_line[0], new_line[0]
        new_line = sorted(new_line)
        print(new_line)

Output:

['ctl00_MainSplitter_ClientState']
['ctl00_RadWindowLicenseAggreemennt_C_RadButtonLicenseAggreemenntCancel_ClientState']
['ctl00_RadWindowLicenseAggreemennt_C_RadButtonLicenseAggreemenntOK_ClientState']
['ctl00_RadWindowLicenseAggreemennt_ClientState']
['ctl00$ScriptManagerMain']
['ctl00_RadStyleSheetManager1_TSSM']
['ctl00_ScriptManagerMain_TSM']
['__VIEWSTATE']
['ctl00_radwindow1_ClientState']
['ctl00_RadButtonLgout_ClientState']
['ctl00_TopPane_ClientState']
['ctl00_RadPanelBarMainMenu_ClientState']
['ctl00_LeftPane_ClientState']
['ctl00_ContentPlaceHolderMain_RadWindowManager1_ClientState']
['ctl00$ContentPlaceHolderMain$RadComboBoxTimeResolution']
['ctl00_ContentPlaceHolderMain_RadComboBoxTimeResolution_ClientState']
['ctl00_ContentPlaceHolderMain_RadSearchBoxNeId']
['ctl00_ContentPlaceHolderMain_RadSearchBoxNeId_ClientState']
['ctl00$ContentPlaceHolderMain$HiddenFieldSelectedFilter']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadButtonAlarmsFilterClose_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxRuleNames_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxSeverity_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxStatus_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxEntityType_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadButtonAlarmsFilterOK_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowFiltersList_C_RadButtonFiltersListClose_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowFiltersList_C_RadListBoxExistingFilters_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowFiltersList_C_RadButtonFiltersListOk_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowFiltersList_C_RadButtonFiltersListEdit_ClientState']

I need the output sorted alphabetically by the substrings of the parameter name, e.g. 'AlarmsUserFilters' before 'ClientState':

['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadButtonAlarmsFilterClose_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxRuleNames_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxSeverity_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxStatus_ClientState']
['ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_C_RadListBoxEntityType_ClientState']
['ctl00_MainSplitter_ClientState']

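Any help on the best way to do this? I need it to be as generic as possible, i.e. there may be different strings that need to be sorted this way ('ctl00' and so on is just an example).

Thanks!

1 Answer:

Answer 0: (score: 1)

As others have pointed out, the problem is that you build a list of length 1, sort it, and print it immediately, before the whole file has been read. I have changed your code to: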

import re

pattern = r"Matched Parameter: ([^\s]*)"
parameters = []

with open(".\\params.txt") as txtFile:
    for line in txtFile:
        match = re.match(pattern, line)
        if match:
            parameters.append(match.group(1))

for par in sorted(parameters):
    print(par)

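This should now work correctly. I changed a few other bits as well: match.group(1) returns the captured group directly, which is the part of your regex inside the parentheses (). Also, since you only want to match from the start of the line, you can use re.match. I also iterate over the lines of the file directly, rather than building a list of lines and then iterating over that. Note that you were right that sort or sorted can be used directly on a list of strings: Python compares them alphabetically, character by character (strictly, by Unicode code point, so uppercase letters sort before lowercase ones), which is usually called 'lexicographic order'.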
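To make the ordering concrete, here is a small sketch using a few names copied from the output in the question. The `component_key` helper is my own assumption, not something from the original script: it splits each name on '_' or '$' and compares the pieces case-insensitively, which may be closer to the "generic" substring ordering the question asks for.

```python
import re

# A few parameter names taken from the output shown in the question.
names = [
    "ctl00_MainSplitter_ClientState",
    "ctl00$ScriptManagerMain",
    "__VIEWSTATE",
    "ctl00_radwindow1_ClientState",
    "ctl00_ContentPlaceHolderMain_AlarmsUserFilters_RadWindowAlarmsFilter_ClientState",
]


def component_key(name):
    """Hypothetical sort key: the name's parts, split on '_' or '$', lowercased."""
    return [part.casefold() for part in re.split(r"[_$]", name) if part]


# Default ordering: code point by code point, so '$' < 'C' < 'M' < '_' < 'r'.
plain = sorted(names)

# Component-wise ordering: separators and letter case no longer affect the result.
by_component = sorted(names, key=component_key)

for label, result in (("plain", plain), ("by component", by_component)):
    print(label)
    for name in result:
        print("   ", name)
```

With the plain sort, `__VIEWSTATE` and the `$`-separated name come first because of where '_' and '$' sit in the character table; with the key function they are ordered by their first meaningful component instead.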

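Unfortunately, it would be hard for me to provide any meaningful sample output without spending a lot of time reconstructing your input, since I don't have access to it.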
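As a rough check, the two records quoted in the question can stand in for the file. This is only a sketch: `io.StringIO` replaces params.txt, the `extract_parameters` function name is made up for illustration, and I use `\S+` as a shorthand for `[^\s]+`.

```python
import io
import re

# The two records quoted in the question; io.StringIO stands in for params.txt.
SAMPLE = """\
Action: Add Parameter
Matched Parameter: ctl00_ContentPlaceHolderMain_RadSearchBoxNeId_ClientState
on [HTTPS] /ConsolePage/ConsolePageWeb.aspx Matched Wildcard: *

Action: Add Parameter
Matched Parameter: ctl00$ContentPlaceHolderMain$HiddenFieldSelectedFilter on
[HTTPS] /ConsolePage/ConsolePageWeb.aspx Matched Wildcard: *
"""

pattern = re.compile(r"Matched Parameter: (\S+)")


def extract_parameters(lines):
    """Collect the first token after 'Matched Parameter: ' from every matching line."""
    found = []
    for line in lines:
        match = pattern.match(line)  # re.match only matches at the start of the line
        if match:
            found.append(match.group(1))
    return found


# Collect everything first, then sort once, then print.
parameters = sorted(extract_parameters(io.StringIO(SAMPLE)))
for par in parameters:
    print(par)
```

On this sample the `$`-separated name is printed first, because '$' has a lower code point than '_'.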