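I am trying to use a regular expression in Python to replace a date/time with * characters. The challenge is that the HTML source contains a non-breaking space character (the \xa0 in my pattern below) between the date and the time, and I don't know how to match it in Python.
My HTML source: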
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="nl" lang="nl">
<body leftmargin="15" marginwidth="0" marginheight="0">
<table summary="" width="97%" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td colspan="3">U heeft gezocht met</td>
</tr>
<tr>
<td width="20%">Postcode:</td>
<td colspan="2">9999 ZZ</td>
</tr>
<tr><td>Huisnummer:</td>
<td colspan="2">1</td>
</tr>
<tr>
<td colspan="2"> </td>
<td class="r">20-11-2017 11:51:01</td>
</tr>
</tbody>
</table>
</body>
</html>
My Python regex code:
import os
import re

def _fixhtml(self, filename):
    regex_datetime = r'<div align="right">\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}</div>'
    subst_datetime = '<div align="right">**-**-**** **:**</div>'
    # This is the pattern that fails to match the <td class="r"> line above.
    regex_datetime1 = r'<td class="r">\d{1,2}-\d{1,2}-\d{4}\xa0\s+\d{1,2}:\d{2}:\d{2}</td>'
    subst_datetime1 = '<td class="r">**-**-**** **:**:**</td>'
    out_fname = filename + ".tmp"
    # Rewrite the file line by line into a temporary file, then swap it in.
    with open(filename) as f, open(out_fname, "w") as out:
        for line in f:
            line = re.sub(regex_datetime, subst_datetime, line)
            line = re.sub(regex_datetime1, subst_datetime1, line)
            out.write(line)
    os.remove(filename)
    os.rename(out_fname, filename)
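I have tried several combinations, such as \S and \s+. The last combination I found is supposed to 'capture' the non-breaking space character, but it still does not match.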
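
For reference, here is a minimal sketch of a pattern that might cover this case. It rests on an assumption that the post does not confirm: that the separator between date and time is the literal entity text &nbsp;, a raw U+00A0 character, ordinary whitespace, or some mix of these. Note that \xa0\s+ requires a non-breaking space followed by at least one more whitespace character, which could explain the failed match if only one separator is present. The names DATETIME_RE and mask_datetime are hypothetical and not part of the original code.

```python
import re

# Assumption: date and time are separated by the literal text "&nbsp;",
# a raw U+00A0 character, plain whitespace, or a mix of these.
# For str patterns in Python 3, \s already matches U+00A0.
DATETIME_RE = re.compile(
    r'(<td class="r">)\d{1,2}-\d{1,2}-\d{4}(?:&nbsp;|\s)+\d{1,2}:\d{2}:\d{2}(</td>)'
)

def mask_datetime(line):
    """Replace the timestamp inside <td class="r"> with asterisks."""
    # \1 and \2 put the opening and closing tags back unchanged.
    return DATETIME_RE.sub(r'\1**-**-**** **:**:**\2', line)

if __name__ == "__main__":
    print(mask_datetime('<td class="r">20-11-2017&nbsp;11:51:01</td>'))
    print(mask_datetime('<td class="r">20-11-2017\xa011:51:01</td>'))
```

To find out which separator the file really contains, printing repr(line) for the affected line should show either &nbsp; or \xa0 explicitly.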