Question

The text file "input_msg.txt" contains the following records:

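Jan 1 02:32:40 other strings but may or may not unique in all those lines
Jan 1 02:32:40 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Feb 1 03:52:26 other strings but may or may not unique in all those lines
Feb 1 03:52:26 other strings but may or may not unique in all those lines
Jan 1 02:46:40 other strings but may or may not unique in all those lines
Jan 1 02:44:40 other strings but may or may not unique in all those lines
Jan 1 02:40:40 other strings but may or may not unique in all those lines
Feb 10 03:52:26 other strings but may or may not unique in all those lines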

I tried the following program.

def sort_file_based_timestap():    
   f = open(r"D:\Python34\test_msg.txt", "r")    
   xs = f.readlines()     
   xs.sort()  
   print (xs)
   f.close()

This program sorts the lines as plain strings, not by their timestamps.
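To illustrate the problem, here is a small sketch with three made-up lines (not read from the real file). Month abbreviations compare alphabetically, so Feb lands before Jan, and a day written as 10 lands before 2.

```python
# Hypothetical sample lines, not read from the real file.
sample = [
    "Jan 1 02:32:40 first",
    "Feb 10 03:52:26 second",
    "Feb 2 03:52:26 third",
]

# A plain sort compares character by character, so "F" < "J"
# and "Feb 1..." < "Feb 2..." regardless of the calendar.
string_sorted = sorted(sample)
for line in string_sorted:
    print(line)
```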

I need the output to look like this:

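Jan 1 02:32:40 other strings but may or may not unique in all those lines
Jan 1 02:32:40 other strings but may or may not unique in all those lines
Jan 1 02:40:40 other strings but may or may not unique in all those lines
Jan 1 02:44:40 other strings but may or may not unique in all those lines
Jan 1 02:46:40 other strings but may or may not unique in all those lines
Feb 1 03:52:26 other strings but may or may not unique in all those lines
Feb 1 03:52:26 other strings but may or may not unique in all those lines
Feb 10 03:52:26 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:55 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:56 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines
Mar 31 23:31:57 other strings but may or may not unique in all those lines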

Any help would be greatly appreciated!

Answer 1

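The trick is to first annotate each line with a timestamp that Python can compare, and then sort the list of annotated lines.

I've put some sample code below: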

import time
import re

def parse_line(line):
    """
    Parses each line to split line into the timestamp and the rest
    """

    line = line.rstrip()
    m = re.match(r"(\w{3}\s+\d+\s+[0-9:]+)\s+(.*)", line)
    if m:
        timestamp = time.strptime(m.group(1), "%b %d %H:%M:%S")
        return (timestamp, line)


def main():
    lines = []
    # the with-block closes the file even if parsing raises
    with open('input_msg.txt', 'r') as f:
        for line in f:
            parsed = parse_line(line)
            if parsed:
                lines.append(parsed)
    # sort the array based on the first element of each tuple
    # which is the parsed time
    sorted_lines = sorted(lines, key=lambda annotated_line: annotated_line[0])
    for l in sorted_lines:
        print(l[1])

if __name__ == "__main__":
    main()
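As a quick check of the annotate-then-sort idea without touching the file, the sketch below runs it on a few made-up in-memory lines. The helper name `annotate` and the sample data are illustrative assumptions, and `%b` is assumed to run under an English (C) locale.

```python
import time

# Hypothetical in-memory lines standing in for the file contents.
sample = [
    "Mar 31 23:31:55 c",
    "Feb 10 03:52:26 b",
    "Jan 1 02:46:40 a",
    "Feb 1 03:52:26 d",
]

def annotate(line):
    # The first three whitespace-separated fields are month, day and time.
    month, day, clock = line.split()[:3]
    stamp = time.strptime(" ".join((month, day, clock)), "%b %d %H:%M:%S")
    return (stamp, line)

annotated = [annotate(line) for line in sample]
# struct_time values compare field by field (year, month, day, hour, ...).
annotated.sort(key=lambda pair: pair[0])
result = [line for _, line in annotated]
print("\n".join(result))
```

Because the log lines carry no year, every timestamp gets the same default year, so a log that spans New Year would not sort chronologically with this approach.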

Answer 2

Use a (month, day, rest) triple as the sort key, with the month and day properly parsed so that they compare correctly.

import time
def dater(line):
    month, day, rest = line.split(' ', 2)
    return (time.strptime(month, '%b'), int(day), rest)

with open('input_msg.txt') as file:
    for line in sorted(file, key=dater):
        print(line, end='')
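If the real log pads single-digit days with an extra space (as some syslog-style files do), splitting on a single space would produce an empty day field. A hedged variant, with the hypothetical name `padded_dater`, splits on any run of whitespace instead. It assumes every line has at least four fields and that the time is zero-padded HH:MM:SS, so it can be compared as a string.

```python
import time

def padded_dater(line):
    # split() with no separator collapses runs of whitespace,
    # so "Feb  1" and "Feb 1" yield the same fields.
    month, day, clock, rest = line.split(None, 3)
    # tm_mon is the month number (1-12) parsed from the abbreviation.
    return (time.strptime(month, "%b").tm_mon, int(day), clock, rest)

# Hypothetical sample lines, including padded single-digit days.
sample = [
    "Feb 10 03:52:26 x\n",
    "Feb  1 03:52:26 y\n",
    "Jan  1 02:40:40 z\n",
]
ordered = sorted(sample, key=padded_dater)
for line in ordered:
    print(line, end='')
```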

Answer 3

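How about this?

First get the text and use splitlines() to convert it into a list. Each entry of that list is now a string, and sorting those strings directly does not give chronological order. So next, take each string and use split() to convert it into a list of fields. Your log file is now a list of lists, which you can sort with a custom key function.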
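The intermediate values from those two steps can be sketched on a shortened, made-up log (two hypothetical lines only):

```python
# Two hypothetical log lines, enough to show each intermediate value.
log = """Jan 1 02:32:40 some message
Mar 31 23:31:55 another message"""

lines = log.splitlines()
# lines is a list of strings, one per log entry
print(lines)

fields = [line.split() for line in lines]
# fields is a list of lists; index 0 is the month, 1 the day, 2 the time
print(fields[1][:3])
```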

Here is the code that does this:

# log text
log = """Jan 1 02:32:40 other strings but may or may not unique in all those lines
    Jan 1 02:32:40 other strings but may or may not unique in all those lines
    Mar 31 23:31:55 other strings but may or may not unique in all those lines
    Mar 31 23:31:55 other strings but may or may not unique in all those lines
    Mar 31 23:31:55 other strings but may or may not unique in all those lines
    Mar 31 23:31:56 other strings but may or may not unique in all those lines
    Mar 31 23:31:56 other strings but may or may not unique in all those lines
    Mar 31 23:31:56 other strings but may or may not unique in all those lines
    Mar 31 23:31:57 other strings but may or may not unique in all those lines
    Mar 31 23:31:57 other strings but may or may not unique in all those lines
    Mar 31 23:31:57 other strings but may or may not unique in all those lines
    Mar 31 23:31:57 other strings but may or may not unique in all those lines
    Feb 1 03:52:26 other strings but may or may not unique in all those lines
    Feb 1 03:52:26 other strings but may or may not unique in all those lines
    Jan 1 02:46:40 other strings but may or may not unique in all those lines
    Jan 1 02:44:40 other strings but may or may not unique in all those lines
    Jan 1 02:40:40 other strings but may or may not unique in all those lines
    Feb 10 03:52:26 other strings but may or may not unique in all those lines"""

# convert the log into a list of strings
lines = log.splitlines()
# initialize a temp list that will store the log as a "list of lists", which can be sorted easily
temp_list = []
for data in lines:
    temp_list.append(data.split())


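import time

# writing the method which will be fed as a key for sorting
def convert_time(logline):
    # extracting hour, minute and second from each log entry
    h, m, s = map(int, logline[2].split(':'))
    time_in_seconds = h * 3600 + m * 60 + s
    # month number and day go first so entries from different dates
    # are not ordered by time of day alone
    month = time.strptime(logline[0], '%b').tm_mon
    return (month, int(logline[1]), time_in_seconds)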


sorted_log_list = sorted(temp_list, key=convert_time)

# sorted_log_list is a "list of lists"; each inner list represents one log entry.
# Join the fields back together to print each one as a readable log entry.
for fields in sorted_log_list:
    print(" ".join(fields))

Here is a more efficient version of the code above. It does not need temp_list; the key function works directly on the strings produced by splitlines().

# convert the log into a list of strings
lines = log.splitlines()

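import time

# writing the method which will be fed as a key for sorting
def convert_time(logline):
    fields = logline.split()
    # extracting hour, minute and second from each log entry
    h, m, s = map(int, fields[2].split(':'))
    time_in_seconds = h * 3600 + m * 60 + s
    # month number and day go first so entries from different dates
    # are not ordered by time of day alone
    month = time.strptime(fields[0], '%b').tm_mon
    return (month, int(fields[1]), time_in_seconds)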


sorted_log_list = sorted(lines, key=convert_time)

# sorted_log_list is a list of strings here, one per log entry.
# strip() removes the indentation carried over from the log literal.
for line in sorted_log_list:
    print(line.strip())
