Sorting a list of strings containing data

Time: 2016-10-13 16:26:57

Tags: python sorting

I have a list of strings related to Kibana indices. The strings look like this:

λ curl '10.10.43.210:9200/_cat/indices?v'
health status index               pri rep docs.count docs.deleted store.size pri.store.size
yellow open   filebeat-2016.10.08   5   1        899            0    913.8kb        913.8kb
yellow open   filebeat-2016.10.12   5   1        902            0    763.9kb        763.9kb
yellow open   filebeat-2016.10.13   5   1        816            0    588.9kb        588.9kb
yellow open   filebeat-2016.10.10   5   1        926            0    684.1kb        684.1kb
yellow open   filebeat-2016.10.11   5   1        876            0    615.2kb        615.2kb
yellow open   filebeat-2016.10.09   5   1        745            0    610.7kb        610.7kb

The dates come back unsorted. I want to sort these lines by index name (which contains a date, 2016.10.xx). Ascending or descending order is fine.

Right now I isolate the strings like this:

    subp = subprocess.Popen(['curl','-XGET' ,'-H', '"Content-Type: application/json"', '10.10.43.210:9200/_cat/indices?v'], stdout=subproce$
    curlstdout, curlstderr = subp.communicate()
    op = str(curlstdout)
    kibanaIndices = op.splitlines()
    for index,elem in enumerate(kibanaIndices):
            if "kibana" not in kibanaIndices[index]:
                    print kibanaIndices[index]+"\n"
                    kibanaIndexList.append(kibanaIndices[index])

But I can't sort them in any meaningful way.

2 Answers:

Answer 0: (score: 1)

Is this what you need?

lines = """yellow open   filebeat-2016.10.08   5   1        899            0    913.8kb        913.8kb
yellow open   filebeat-2016.10.12   5   1        902            0    763.9kb        763.9kb
yellow open   filebeat-2016.10.13   5   1        816            0    588.9kb        588.9kb
yellow open   filebeat-2016.10.10   5   1        926            0    684.1kb        684.1kb
yellow open   filebeat-2016.10.11   5   1        876            0    615.2kb        615.2kb
yellow open   filebeat-2016.10.09   5   1        745            0    610.7kb        610.7kb
""".splitlines()
def extract_date(line):
    return line.split()[2]
lines.sort(key=extract_date)
print("\n".join(lines))

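Here extract_date is a function that returns the third column (for example filebeat-2016.10.12). We pass this function as the key argument of sort so that this value is used as the sort key. Because the date is written as zero-padded year.month.day, it sorts correctly as a plain string. You could use a more elaborate key function to extract only the date.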
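As a rough sketch of that last point, the key function could parse the date out of the index name so that lines are compared as real dates rather than as strings. The name extract_date_only and the assumption that every index name ends in -YYYY.MM.DD are illustrative and not part of the original answer.

```python
from datetime import datetime

def extract_date_only(line):
    # The third whitespace-separated column is the index name,
    # e.g. filebeat-2016.10.12
    index_name = line.split()[2]
    # Assumption: the name ends with "-YYYY.MM.DD", so take the
    # part after the last hyphen
    date_text = index_name.rsplit("-", 1)[1]
    return datetime.strptime(date_text, "%Y.%m.%d").date()

sample = [
    "yellow open   filebeat-2016.10.12   5   1        902            0    763.9kb        763.9kb",
    "yellow open   filebeat-2016.10.08   5   1        899            0    913.8kb        913.8kb",
    "yellow open   filebeat-2016.10.10   5   1        926            0    684.1kb        684.1kb",
]

# reverse=True sorts newest first (DESC); drop it for ASC
newest_first = sorted(sample, key=extract_date_only, reverse=True)
print("\n".join(newest_first))
```

Note that a header line or any line whose third column lacks a date would raise an error here, so such lines would need to be filtered out first.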

Answer 1: (score: 0)

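I copied your sample data into a UTF-8 text file, since I don't have access to the server you reference. Using list comprehensions and string methods, you can clean the data and then break it into its component parts. The sorting is done by passing a lambda function as the key argument to the built-in sorted() function: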

# read text data into a list one line at a time,
# removing trailing newline characters; the with block closes the file
with open('kibana_data.txt') as f:
    clean_lines = [line.rstrip("\n") for line in f]

# get first line of file
info = clean_lines[0]

# create list of field names
header = [val.replace(" ", "") 
          for val in clean_lines[1].split()]

# create list of lists for data rows
data = [[val.replace(" ", "") for val in line.split()] 
        for line in clean_lines[2:]]

# sort data rows by date (third row item at index 2)
final = sorted(data, key=lambda row: row[2], reverse=False)
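One way the sorted rows might be used afterwards, sketched here under assumptions: each row is paired with the field names and printed newest first. The inline rows stand in for the file contents, and the name records is hypothetical.

```python
# Field names as they appear in the _cat/indices header line
header = ["health", "status", "index", "pri", "rep",
          "docs.count", "docs.deleted", "store.size", "pri.store.size"]

# Stand-in for the rows parsed from the text file
data = [
    "yellow open filebeat-2016.10.08 5 1 899 0 913.8kb 913.8kb".split(),
    "yellow open filebeat-2016.10.12 5 1 902 0 763.9kb 763.9kb".split(),
    "yellow open filebeat-2016.10.09 5 1 745 0 610.7kb 610.7kb".split(),
]

# Sort descending by the index column, then map field names to values
records = [dict(zip(header, row))
           for row in sorted(data, key=lambda row: row[2], reverse=True)]

for record in records:
    print(record["index"], record["docs.count"])
```

Keeping each row as a dict makes it possible to refer to columns by name rather than by position once the sort is done.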

More on list comprehensions: https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions

More on sorting: https://wiki.python.org/moin/HowTo/Sorting