How do I use dateutil with a nested list?

Time: 2016-04-04 09:54:27

Tags: python sorting date python-3.x

r = [['21-09-1995', 3], ['22-11-1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

Does anyone know how to use dateutil with a nested list? I tried this without success:

from dateutil.parser import parse
r = sorted(r, key=parse)

Error: 'list' object has no attribute 'read'

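I know there are other ways to sort dates, but what I like about dateutil is that it recognizes dates without being told the format. For example, 21/09/1995 and 21-09-1995 are both recognized as dates.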

Expected output (the unparseable entry may come either first or last):

r = [['test', 4], ['07-01-1988', 6], ['21-09-1995', 3], ['22-11-1995', 2],  ['12-12-2001', 5]]

r = [['07-01-1988', 6], ['21-09-1995', 3], ['22-11-1995', 2],  ['12-12-2001', 5], ['test', 4]]
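A short sketch of why `key=parse` fails in the question: `sorted` calls the key function once per element, and each element here is a whole `[date_string, value]` sublist rather than the date string itself. The error message suggests `parse` then tried to treat that list as a readable stream. The helper name `spy_key` is made up for this illustration, and it sorts on the raw string only to show that plain string order is not chronological.

```python
r = [['21-09-1995', 3], ['22-11-1995', 2], ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

received = []

def spy_key(item):
    # Record exactly what sorted() passes in, then sort on the raw string.
    received.append(item)
    return item[0]

by_string = sorted(r, key=spy_key)

# Every argument is a two-item list, never a bare date string.
print(all(isinstance(item, list) for item in received))

# Plain string order is not date order: 12-12-2001 lands before 21-09-1995.
print(by_string)
```

So the key function has to pick out `item[0]` itself and turn it into something that compares chronologically.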

2 answers:

Answer 0: (score: 1)

This works:

from datetime import datetime
from dateutil.parser import parse

def my_parse(lis):
    try: 
        return parse(lis[0])
    except ValueError:
        return datetime(1, 1, 1)

print(sorted(r, key=my_parse))

Output:

[['test', 4], ['07-01-1988', 6], ['21-09-1995', 3], ['22-11-1995', 2], ['12-12-2001', 5]]

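You need to pass the first entry of each sublist to parse(). The entry test cannot be parsed and raises a ValueError. Catch it and return a datetime object that lies outside the expected range of dates.

Use: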

return datetime(9999, 1, 1)

if you want the entry test to come last in the sorted result.
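As a variation on the sentinel-date idea, the key can return a tuple whose first element decides where unparseable entries go, so no "impossible" date has to be chosen. The sketch below is not the answer's code. It swaps dateutil for the standard library's `datetime.strptime` with an assumed list of day-first formats, and the names `FORMATS`, `try_parse`, `date_key` and `unparseable_last` are hypothetical. Explicit formats also sidestep an ambiguity: dateutil's `parse` reads a string such as 07-01-1988 month-first unless `dayfirst=True` is passed.

```python
from datetime import datetime

# Assumed input formats (day first); extend this tuple to match your data.
FORMATS = ("%d-%m-%Y", "%d/%m/%Y")

def try_parse(text):
    """Return a datetime for the first matching format, or None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

def date_key(item, unparseable_last=False):
    """Sort key for [date_string, value] pairs."""
    parsed = try_parse(item[0])
    if parsed is None:
        # The rank alone decides placement; the date is only a placeholder.
        rank = 1 if unparseable_last else -1
        return (rank, datetime.min)
    return (0, parsed)

r = [['21-09-1995', 3], ['22-11-1995', 2], ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

test_first = sorted(r, key=date_key)
test_last = sorted(r, key=lambda item: date_key(item, unparseable_last=True))
print(test_first)
print(test_last)
```

The trade-off is that the formats must be listed up front, which gives up the format guessing the question liked about dateutil.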

Edit

If you want it to work with a flat list as well as a nested one, you can check whether the entry is a string:

r = ['test',  '21-09-1995 wednesday', '07-01-1988 tuesday'] 

from datetime import datetime
from dateutil.parser import parse

def my_parse(value):
    try: 
        if isinstance(value, str):
            return parse(value)
        else:
            return parse(value[0])
    except ValueError:
        return datetime(1, 1, 1)

print(sorted(r, key=my_parse))

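This assumes that value is either a string or an iterable whose first element is a string.

Answer 1: (score: 1)

You don't need dateutil at all; just reverse the parts of the date so that it starts with the year: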

r = [['21-09-1995', 3], ['22-11-1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

def srt(x):
    try:
        return int("".join(x[0].split("-")[::-1]))
    except ValueError:
        return 0
r.sort(key=srt)

Output:

[['test', 4], ['07-01-1988', 6], ['21-09-1995', 3], ['22-11-1995', 2], ['12-12-2001', 5]]

If you don't mind the text strings being sorted to the end, it is even simpler:

r.sort(key=lambda x: "".join(x[0].split("-")[::-1]))

That gives you:

[['07-01-1988', 6], ['21-09-1995', 3], ['22-11-1995', 2], ['12-12-2001', 5], ['test', 4]]
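To spell out why the reversed-string key sorts chronologically, here is a small worked example. The function name `reversed_date` is introduced only for this illustration. It relies on every date being zero-padded DD-MM-YYYY, so that all reversed strings have the same length.

```python
def reversed_date(text, sep="-"):
    # '07-01-1988' -> ['07', '01', '1988'] -> ['1988', '01', '07'] -> '19880107'
    return "".join(text.split(sep)[::-1])

print(reversed_date("07-01-1988"))  # 19880107
print(reversed_date("test"))        # test (nothing to split, so it is unchanged)

# Equal-length digit strings compare like the numbers they spell, and any
# digit sorts before the letter 't', which is why 'test' ends up last.
print("19880107" < "19950921" < "20011212" < "test")  # True
```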

For mixed formats:

r = [['21-09-1995', 3], ['22/11/1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

import re

reg = re.compile(r"[-/]")
r.sort(key=lambda x: "".join(reg.split(x[0])[::-1]))

Output:

[['07-01-1988', 6], ['21-09-1995', 3], ['22/11/1995', 2], ['12-12-2001', 5], ['test', 4]]

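Even with the regex, you can see there is a big difference in speed (timed with IPython's %%timeit; my_parse is the dateutil-based key from the other answer):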

r = [['21-09-1995', 3], ['22/11/1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]] 
r.sort(key=my_parse)
   ...: 
10000 loops, best of 3: 185 µs per loop

In [5]: %%timeit
r = [['21-09-1995', 3], ['22/11/1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]] 
r.sort(key=lambda x: "".join(reg.split(x[0])[::-1]))
   ...: 
100000 loops, best of 3: 6.56 µs per loop

In [7]: %%timeit
r = [['21-09-1995', 3], ['22/11/1995', 2] , ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]] 
r.sort(key=regex_srt)
...: 
100000 loops, best of 3: 10.3 µs per loop
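Outside IPython, a comparable measurement can be sketched with the standard `timeit` module. This is only an illustration: it times the two string-splitting keys, not dateutil, and because `regex_srt` is not shown in the answer, `int_key` below is a hypothetical stand-in for it. Absolute numbers will differ from machine to machine.

```python
import re
import timeit

SEPARATORS = re.compile(r"[-/]")
NUMBER = 2_000  # sorts per timing run

data = [['21-09-1995', 3], ['22/11/1995', 2], ['07-01-1988', 6], ['test', 4], ['12-12-2001', 5]]

def string_key(item):
    # Reverse the date parts and compare them as one string.
    return "".join(SEPARATORS.split(item[0])[::-1])

def int_key(item):
    # Hypothetical stand-in for regex_srt: same idea, compared as an int.
    try:
        return int(string_key(item))
    except ValueError:
        return 0

def best_time(key, number=NUMBER, repeat=3):
    """Best total time in seconds for `number` calls of sorted(data, key=key)."""
    return min(timeit.repeat(lambda: sorted(data, key=key), number=number, repeat=repeat))

for name, key in [("string_key", string_key), ("int_key", int_key)]:
    per_sort = best_time(key) / NUMBER * 1e6
    print(f"{name}: {per_sort:.2f} µs per sort")
```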

If you have a flat list of strings such as '07-01-1988 tuesday':

import re

# Split on '-', '/' or whitespace so trailing text after the date is separated off.
reg = re.compile(r"[-/\s]")
r = ['test', '02/03/2015 test', '02/09/2016 test', '12/11/2011 test', '22/01/2015 test', '22/01/2010 test', '22/01/2013 test']
def srt(x):
    try:
        # Keep only day, month, year, then reverse them to year, month, day.
        return int("".join(reg.split(x)[:3][::-1]))
    except ValueError:
        return 0
r.sort(key=srt)
print(r)

Output:

['test', '22/01/2010 test', '12/11/2011 test', '22/01/2013 test', '22/01/2015 test', '02/03/2015 test', '02/09/2016 test']
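One caveat with joining the reversed parts into a single number is that it depends on zero-padded days and months: '7/1/1988' would become 198817 and sort before every eight-digit date. A possible variant, not taken from the answer, compares a (year, month, day) tuple of ints instead. The name `ymd_key` is hypothetical, and the sketch still assumes day-first dates at the start of each string.

```python
import re

SEPARATORS = re.compile(r"[-/\s]")

def ymd_key(text):
    """Return (year, month, day) as ints, or (0, 0, 0) if no leading date is found."""
    parts = SEPARATORS.split(text.strip())[:3]
    try:
        # Raises ValueError for non-numeric parts or fewer than three parts.
        day, month, year = (int(p) for p in parts)
    except ValueError:
        return (0, 0, 0)
    return (year, month, day)

dates = ['02/03/2015 test', 'test', '7/1/1988 tuesday', '12/11/1987 test']
ordered = sorted(dates, key=ymd_key)
print(ordered)
```

Here `'7/1/1988 tuesday'` sorts after `'12/11/1987 test'`, as it should, even though its day and month are not padded.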