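Python - Regular expression to find dates in a list

Time: 2015-07-22 14:20:12

Tags: python mysql regex

While looping through a CSV, I am trying to normalize dates to yyyy-mm-dd for a MySQL load. I tried searching for items that contain two forward slashes, but found that this is not specific enough to identify a date. Can this be done with a regular expression that finds items matching a pattern? Any input would be appreciated.

Example input: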

['1','2','01/02/2015','3','4','1-05-2015','5','Anot/her Ex/ample','6']

Example output:

['1','2','2015-01-02','3','4','2015-01-05','5','Anot/her Ex/ample','6']

4 answers:

Answer 0 (score: 2)

re.sub(r"(\d+)/(\d+)/(\d+)",r"\3-\1-\2",test_Str)

This should do it for you.

import re

x= ['1','2','01/02/2015','3','4','1-05-2015','5','Anot/her Ex/ample','6']
# Only slash-separated dates are rewritten; '1-05-2015' is left unchanged.
print([re.sub(r"(\d+)/(\d+)/(\d+)",r"\3-\1-\2",i) for i in x])
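The substitution above only covers slash-separated dates. As a minimal sketch, assuming month-first input, a four-digit year, and the same separator on both sides, the pattern could be widened to cover the hyphenated item as well. The names `DATE_RE` and `normalize` are hypothetical, and the pattern checks shape only, so it does not reject impossible dates.

```python
import re

# Hypothetical pattern: month, separator, day, the same separator again, 4-digit year.
DATE_RE = re.compile(r"(\d{1,2})([/-])(\d{1,2})\2(\d{4})")

def normalize(item):
    """Return item as yyyy-mm-dd if it has the shape m/d/yyyy or m-d-yyyy, else unchanged."""
    m = DATE_RE.fullmatch(item)  # the whole string must match, not just part of it
    if m is None:
        return item
    month, _, day, year = m.groups()
    # Zero-pad month and day to two digits.
    return "{}-{:0>2}-{:0>2}".format(year, month, day)

x = ['1','2','01/02/2015','3','4','1-05-2015','5','Anot/her Ex/ample','6']
result = [normalize(i) for i in x]
print(result)
```

Because `fullmatch` requires the entire string to match, items such as 'Anot/her Ex/ample' pass through untouched.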

Answer 1 (score: 0)

If you know the order of the fields, you can use datetime.strptime:

from datetime import datetime

l =  ['1','2','01/02/2015','3','4','1-05-2015','5','Another Ex/ample','6']

out = []

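for ele in l:
    try:
        # Month first ("%m/%d/%Y"), so '01/02/2015' becomes '2015-01-02' as in the expected output
        out.append(datetime.strptime(ele,"%m/%d/%Y").strftime("%Y-%m-%d"))
    except ValueError:
        # Not a date in this format: keep the item unchanged
        out.append(ele)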

print(out)

I don't see how you expect '1-05-2015' to become '2015-01-05', since you only mention forward slashes for dates.

If you want to test multiple patterns:

out = []

for ele in l:
    # Month-first formats, matching the expected output in the question
    for patt in ["%m/%d/%Y","%m-%d-%Y"]:
        try:
            out.append(datetime.strptime(ele,patt).strftime("%Y-%m-%d"))
            break
        except ValueError:
            # This pattern did not match; try the next one
            continue
    else:
        # No pattern matched: keep the item unchanged
        out.append(ele)

print(out)
['1', '2', '2015-01-02', '3', '4', '2015-01-05', '5', 'Another Ex/ample', '6']

You can also filter on length and only try to parse strings of a plausible length:

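out = []

for ele in l:
    ln = len(ele)
    # Strings shorter than 7 or longer than 10 characters are not parsed as dates
    if not 7 <= ln <= 10:
        out.append(ele)
        continue
    for patt in ["%m/%d/%Y", "%m-%d-%Y"]:
        try:
            out.append(datetime.strptime(ele,patt).strftime("%Y-%m-%d"))
            break
        except ValueError:
            # This pattern did not match; try the next one
            continue
    else:
        # No pattern matched: keep the item unchanged
        out.append(ele)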

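A regex may match more than just dates, so unless you are 100% sure of your data, you should at least try converting what the regex matched into a datetime object before adding it.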
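One way to combine the two ideas is sketched below: a regex picks out candidates by shape, and `datetime.strptime` then decides whether the candidate is a real date. This assumes month-first input with a four-digit year; `CANDIDATE_RE` and `to_mysql_date` are hypothetical names, not part of any answer above.

```python
import re
from datetime import datetime

# Hypothetical shape check: 1-2 digits, separator, 1-2 digits, same separator, 4 digits.
CANDIDATE_RE = re.compile(r"\d{1,2}([/-])\d{1,2}\1\d{4}")

def to_mysql_date(item):
    """Return item as yyyy-mm-dd if it is a valid month-first date, else unchanged."""
    m = CANDIDATE_RE.fullmatch(item)
    if m is None:
        return item  # wrong shape, so not a candidate
    sep = m.group(1)
    try:
        # Build "%m/%d/%Y" or "%m-%d-%Y" from the separator that was matched.
        parsed = datetime.strptime(item, "%m{0}%d{0}%Y".format(sep))
    except ValueError:
        return item  # right shape, but not a real calendar date
    return parsed.strftime("%Y-%m-%d")

l = ['1','2','01/02/2015','3','4','1-05-2015','5','Another Ex/ample','6']
out = [to_mysql_date(ele) for ele in l]
print(out)
```

The regex keeps non-date strings away from `strptime`, and `strptime` catches values such as month 13 or April 31 that the regex alone would accept.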

Answer 2 (score: 0)

I think @PadraicCunningham's solution looks the most robust, and it can easily be extended as follows to cater for other cases:

from datetime import datetime

l =  ['1','2','01/02/2015','3','4','1-05-2015','5','Another Ex/ample','6']

out = []

for ele in l:

    try:
        out.append(datetime.strptime(ele,"%m/%d/%Y").strftime("%Y-%m-%d")) 
        continue
    except ValueError:
        pass

    try:
        out.append(datetime.strptime(ele,"%m-%d-%Y").strftime("%Y-%m-%d")) 
    except ValueError:
        out.append(ele)

print(out)

This now prints:

['1', '2', '2015-01-02', '3', '4', '2015-01-05', '5', 'Another Ex/ample', '6']

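You should also consider the following test cases. With the month-first formats above, none of them is a valid date, so they are left unchanged.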

l = ['40/05/2015', '13/01/2000', '04/31/2001']
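As a small illustration of why these cases are left alone, the sketch below tries each one against the month-first format "%m/%d/%Y" and records the resulting error. The variable names are illustrative only. It also shows that '13/01/2000' would parse under a day-first format, which is why the field order has to be known in advance.

```python
from datetime import datetime

cases = ['40/05/2015', '13/01/2000', '04/31/2001']

# Collect the ValueError message for each case under the month-first format.
errors = {}
for c in cases:
    try:
        datetime.strptime(c, "%m/%d/%Y")
    except ValueError as exc:
        errors[c] = str(exc)

for c in cases:
    print(c, "->", errors.get(c, "parsed"))

# The same string is a valid date if the day comes first.
day_first = datetime.strptime('13/01/2000', "%d/%m/%Y").strftime("%Y-%m-%d")
print(day_first)
```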

Answer 3 (score: 0)

You can try the following code:

element