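Find, reformat and replace multiple dates in a string

Time: 2017-07-17 22:23:59

Tags: python

I have a long string that contains dates, and I want to update the format of all of them.

Below is what I have written so far, with pseudocode for the bit I can't figure out: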

import datetime

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"

def main():
    line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
    print(line)
    # Best way to pull out and replace all of the dates?
    # pseudo:
    for each current_date_format in line as date_in_line
        temp_date = fix_date(date_in_line)
        line.replace(date_in_line, temp_date)
    print(line)

def fix_date(date_string=''):
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

In this case, it should print:

This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000
This is text dated 01/FEB/2017, and there are a few more dates such as 03/JUL/2017 and 09/JUN/2000

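Thanks

1 answer:

Answer 0: (score: 3)

The first suggestion is not a complete solution; skip ahead to the first edit below.

If you want to adapt your own code, there are a few ways to do this. First, split the string into its parts: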

line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
words = line.split()  # by default it splits on whitespace

Now you can work with each piece of the input. Try to parse each piece as a date with your fix_date method, and rebuild the string as you go:

updated_line = ''
for word in words:
    try:
        updated_line += fix_date(word) + ' '
    except ValueError:  # strptime raises ValueError when the word is not a date
        updated_line += word + ' '
updated_line = updated_line[:-1] # gets rid of the extra trailing space
print(updated_line)

Edit: After running this I realized it has a problem with punctuation attached to the dates. I'm taking another pass at it.
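To make the punctuation problem concrete, here is a small check, written as a sketch. The helper name `parses` is made up for illustration and is not part of the original code. A word such as `01/02/2017,` keeps its trailing comma after `split()`, so `strptime` rejects it and the date is left unconverted.

```python
import datetime

current_date_format = "%d/%m/%Y"

def parses(word):
    """Return True if the whole word matches current_date_format."""
    try:
        datetime.datetime.strptime(word, current_date_format)
        return True
    except ValueError:
        # strptime raises ValueError for leftover characters such as ','
        return False

print(parses("01/02/2017"))   # clean date
print(parses("01/02/2017,"))  # date with a trailing comma attached
```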

Here is some code that works:

import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"

def main():
    line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
    print(line)
    line = re.sub(r'\d{2}/\d{2}/\d{4}',fix_date,line)
    print(line)

def fix_date(rem):
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

main()
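Two details may be worth handling on top of this, shown here as a hedged sketch rather than part of the original answer. First, `%b` produces `Feb`, not `FEB`, so matching the upper-case months in the question needs an extra `.upper()`. Second, the pattern also matches digit groups that are not real dates (for example `99/99/2017`), and `strptime` would raise `ValueError` on those. The function name `fix_date_upper` is an assumption made for this example, and the month names assume the default C/English locale.

```python
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"
date_pattern = re.compile(r'\d{2}/\d{2}/\d{4}')

def fix_date_upper(rem):
    """Reformat one regex match; leave it untouched if it is not a valid date."""
    date_string = rem.group()
    try:
        parsed = datetime.datetime.strptime(date_string, current_date_format)
    except ValueError:
        return date_string  # e.g. 99/99/2017 matches the pattern but is not a date
    # %b is locale-dependent; upper() turns 'Feb' into 'FEB'
    return parsed.strftime(new_date_format).upper()

line = "This is text dated 01/02/2017, and there are a few more dates such as 03/07/2017 and 09/06/2000"
print(date_pattern.sub(fix_date_upper, line))
```

Returning the matched text unchanged on `ValueError` keeps one bad value from aborting the whole substitution; whether that is preferable to failing loudly depends on the data.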

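Edit 2: Since the regex approach works on huge strings as well as small ones, if your file is small enough to load all at once, you can do the whole thing in one go: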

import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"

def main():
    with open('my_file.txt','r') as f:
        text = f.read()
    with open('my_fixed_file.txt','w') as f:
        f.write(re.sub(r'\d{2}/\d{2}/\d{4}',fix_date,text))

def fix_date(rem):
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

main()

This can be made more compact by adjusting the file read/write part:

...
with open('my_file.txt','r') as f:
    with open('my_fixed_file.txt','w') as f2:
        f2.write(re.sub(r'\d{2}/\d{2}/\d{4}',fix_date,f.read()))
...
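If the file is too large to load at once, one possible variation is to process it line by line. This is only a sketch: it assumes no date is split across a line break, and the function name `convert_file` is made up for this example.

```python
import datetime
import re

current_date_format = "%d/%m/%Y"
new_date_format = "%d/%b/%Y"
date_pattern = re.compile(r'\d{2}/\d{2}/\d{4}')

def fix_date(rem):
    date_string = rem.group()
    return datetime.datetime.strptime(date_string, current_date_format).strftime(new_date_format)

def convert_file(src_path, dst_path):
    """Copy src_path to dst_path, reformatting dates one line at a time."""
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:  # iterating a file object reads one line at a time
            dst.write(date_pattern.sub(fix_date, line))
```

With the file names used above, this would be called as `convert_file('my_file.txt', 'my_fixed_file.txt')`.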