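I'm trying to rename a batch of files using regex groups, but I can't get it to work (even though regexr.com tells me that what I wrote should be a valid regular expression). The 93,000 files I currently have all look like this: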
Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt
Mrs. HAGAN.2012-12-06.2014sep19_at_182321.txt
Ms. MURRAY.2012-06-18.2014sep19_at_182246.txt
I'd like them to look like this:
20120731McCONNELL2014sep19_at_182325.txt
But every time I run the script below, I get the following error:
Traceback (most recent call last):
  File "changefilenames.py", line 11, in <module>
    date = m.group(2)
AttributeError: 'NoneType' object has no attribute 'group'
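Thank you very much for your help. I apologize if this is a silly question. I'm just getting started with RegEx and Python and can't seem to figure this one out.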
import os
import re
from dateutil.parser import parse
for filename in os.listdir("."):
    if filename.startswith("Mr."):
        m = re.match("Mr.\s(\w*).(\d*-\d*-\d*).(\w*).txt", filename)
        date = m.group(2)
        name = m.group(1)
        timestamp = m.group(3)
        dt = parse(date)
        new_filename = "{dt.year}{dt.month}{dt.day}".format(dt=dt) + name + timestamp + ".txt"
        os.rename(filename, new_filename)
        print new_filename
        print "All done with the Mr"
    if filename.startswith("Mrs."):
        m = re.match("Ms.\s(\w*).(\d*-\d*-\d*).(\w*).txt", filename)
        date = m.group(2)
        name = m.group(1)
        timestamp = m.group(3)
        dt = parse(date)
        new_filename = "{dt.year}{dt.month}{dt.day}".format(dt=dt) + name + timestamp + ".txt"
        os.rename(filename, new_filename)
        print new_filename
        print "All done with the Mrs"
    if filename.startswith("Ms."):
        m = re.match("Mrs.\s(\w*).(\d*-\d*-\d*).(\w*).txt", filename)
        date = m.group(2)
        name = m.group(1)
        timestamp = m.group(3)
        dt = parse(date)
        new_filename = "{dt.year}{dt.month}{dt.day}".format(dt=dt) + name + timestamp + ".txt"
        os.rename(filename, new_filename)
        print new_filename
        print "All done with the Mrs"
EDIT: I changed the script based on the suggestions below, but I still get exactly the same error. Here is the new script:
for filename in os.listdir("."):
    m = re.search("(Mr|Mrs|Ms)\.\s(\w*)\.(\d*\-\d*\-\d*)\.(\w*)\.txt", filename)
    date = m.group(2)
    name = m.group(1)
    timestamp = m.group(3)
    dt = parse(date)
    new_filename = "{dt.year}{dt.month}{dt.day}".format(dt=dt) + name + timestamp + ".txt"
    os.rename(filename, new_filename)
    print new_filename
Answer 0 (score: 0)
You have to use re.search instead of re.match; for details, read search() vs. match():
>>> s="Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt "
>>> import re
>>> m = re.search("Mr.\s(\w*).(\d*-\d*-\d*).(\w*).txt", s)
>>> date = m.group(2)
>>> date
'2012-07-31'
>>> name = m.group(1)
>>> name
'McCONNELL'
>>> timestamp = m.group(3)
>>> timestamp
'2014sep19_at_182325'
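As a side note to the session above, here is a minimal sketch of how re.match and re.search differ, and of the None check that avoids the AttributeError from the traceback. Both functions return None when the pattern does not fit, which is what happens for any file in the directory that does not follow the naming scheme (the script itself, for instance). The sketch is written for Python 3 rather than the Python 2 used in the thread, and the helper name `parse_name` is only illustrative.

```python
import re

# Raw string so \s, \w and \d reach the regex engine unchanged;
# the literal dots are escaped so they do not match arbitrary characters.
PATTERN = re.compile(r"Mr\.\s(\w+)\.(\d{4}-\d{2}-\d{2})\.(\w+)\.txt")


def parse_name(filename):
    """Return (name, date, timestamp), or None if the filename does not fit."""
    m = PATTERN.search(filename)
    if m is None:  # search() and match() both return None on failure
        return None
    return m.group(1), m.group(2), m.group(3)


sample = "Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt"

# match() only tries at position 0; search() scans the whole string.
anchored = PATTERN.match("x" + sample)
scanned = PATTERN.search("x" + sample)
print(anchored is None, scanned is not None)

print(parse_name(sample))
print(parse_name("changefilenames.py"))  # does not fit the pattern
```

For filenames that start with the title, as in the question, match and search behave the same, so the None check is the part that matters when looping over a whole directory.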
Answer 1 (score: 0)
Here is my suggestion for the regular expression.
Group the digits so that they can be retrieved by group later.
(Mr|Mrs|Ms)\.\s(\w*)\.(\d*)\-(\d*)\-(\d*)\.(\w*)\.txt
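A rough sketch of how this pattern might be used to build the new filenames, assuming Python 3. Note that the title is group 1 here, so the surname is group 2, the date parts are groups 3 to 5, and the timestamp is group 6. The function names `new_name` and `rename_all` are made up for illustration, and files that do not match are simply skipped.

```python
import os
import re

# The suggested pattern, compiled as a raw string.
PATTERN = re.compile(r"(Mr|Mrs|Ms)\.\s(\w*)\.(\d*)\-(\d*)\-(\d*)\.(\w*)\.txt")


def new_name(filename):
    """Build the target filename, or return None if the name does not match."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    title, surname, year, month, day, timestamp = m.groups()
    return year + month + day + surname + timestamp + ".txt"


def rename_all(directory="."):
    """Rename every matching file in `directory`; leave everything else alone."""
    for filename in os.listdir(directory):
        target = new_name(filename)
        if target is None:  # e.g. the script itself, or already renamed files
            continue
        os.rename(os.path.join(directory, filename),
                  os.path.join(directory, target))


print(new_name("Mrs. HAGAN.2012-12-06.2014sep19_at_182321.txt"))
```

Taking the digits straight from the match keeps the zero padding, whereas formatting `dt.month` and `dt.day` from a parsed date, as in the question, would turn 2012-07-31 into 2012731.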
Answer 2 (score: 0)
re.sub(r'^Mrs?\. (\w+)\.(\d{4})-(\d{2})-(\d{2})\.(\d{4}\w+\d+_at_\d+)(\.txt)$',r'\2\3\4\1\5\6','Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt')
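The title part `Mrs?\.` in the one-liner above covers Mr. and Mrs. but not Ms., so the Ms. files would come back unchanged. A possible variation, sketched here for Python 3, widens the title to `M(?:rs?|s)\.` and applies the same substitution to the three sample names from the question. The constant names are illustrative only.

```python
import re

# Same idea as the one-liner, with the title widened to Mr./Mrs./Ms.
PATTERN = re.compile(
    r"^M(?:rs?|s)\. (\w+)\.(\d{4})-(\d{2})-(\d{2})\.(\d{4}\w+\d+_at_\d+)(\.txt)$"
)
# Year, month, day, then surname, timestamp and extension.
REPLACEMENT = r"\2\3\4\1\5\6"

names = [
    "Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt",
    "Mrs. HAGAN.2012-12-06.2014sep19_at_182321.txt",
    "Ms. MURRAY.2012-06-18.2014sep19_at_182246.txt",
]

# sub() returns the input unchanged when the pattern does not match.
renamed = [PATTERN.sub(REPLACEMENT, name) for name in names]
for old, new in zip(names, renamed):
    print(old, "->", new)
```

Because a non-matching name is returned as-is, comparing the result with the input is one way to decide whether a file needs renaming at all.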
Answer 3 (score: 0)
I did the transformation like this (disclaimer: I haven't cleaned this up at all):
import re
from pprint import pprint
names = """
Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt
Mrs. HAGAN.2012-12-06.2014sep19_at_182321.txt
Ms. MURRAY.2012-06-18.2014sep19_at_182246.txt
""".strip()
for record in names.splitlines():
    name, part2 = re.split('\.(?=\d)', record, 1)
    date, at_time, fileext = re.split('\.', part2)
    pprint(record)
    pprint(''.join([
        date.replace('-', ''),
        name.translate(None, ' .',),
        at_time,
    ]) + '.' + fileext)
    print('\n')
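The two-argument `name.translate(None, ' .')` call only exists on Python 2 strings, and it keeps the title glued to the surname (MrMcCONNELL). Below is a hedged Python 3 sketch of the same split-based idea that drops the title instead, to match the format asked for in the question. The function name `rename_by_split` is an assumption for illustration, and the code expects names shaped exactly like the three samples.

```python
import re


def rename_by_split(record):
    """Rearrange 'Title Surname.YYYY-MM-DD.timestamp.ext' without a full regex match."""
    # Split once at the first dot that is followed by a digit, i.e. just before the date.
    title_and_name, rest = re.split(r"\.(?=\d)", record, maxsplit=1)
    # The remainder is date, timestamp and extension, separated by dots.
    date, at_time, fileext = rest.split(".")
    # "Mr. McCONNELL" -> keep only the last word, the surname.
    surname = title_and_name.split()[-1]
    return date.replace("-", "") + surname + at_time + "." + fileext


for record in [
    "Mr. McCONNELL.2012-07-31.2014sep19_at_182325.txt",
    "Mrs. HAGAN.2012-12-06.2014sep19_at_182321.txt",
    "Ms. MURRAY.2012-06-18.2014sep19_at_182246.txt",
]:
    print(record, "->", rename_by_split(record))
```

Unlike the regex-match variants, this raises a ValueError on names that do not have the expected number of dot-separated parts, so it would need a try/except or a prior check before being run over a mixed directory.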