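I am looking for Python code to highlight every date value of the form "MM-DD-YYYY" in a Word document. I am using the docx library (python-docx) to do this. Below is my code, but it highlights the complete line instead of only the date.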
for p in doc.paragraphs:
    date1 = re.findall(r"[0-9]{2}-[0-9]{2}-(?!0000)[0-9]{4}", p.text)
    for run in p.runs:
        if date1:
            run.font.highlight_color = WD_COLOR_INDEX.YELLOW
Answer 0 (score: 2)
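In your script, search each run's text rather than the paragraph's text, and highlight only the runs that match. For this to mark nothing but the date, the date has to sit in a run of its own.
Consider this example: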
from docx import Document
from docx.shared import Inches
from docx.enum.text import WD_COLOR_INDEX
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' 05-03-2018 ')
p.add_run('italic.').italic = True
document.save('demo.docx')
Then:
import re

doc = Document('demo.docx')
for p in doc.paragraphs:
    for run in p.runs:
        # Search the text of this run only, not the whole paragraph
        date1 = re.findall(r"[0-9]{2}-[0-9]{2}-(?!0000)[0-9]{4}", run.text)
        if date1:
            run.font.highlight_color = WD_COLOR_INDEX.YELLOW
doc.save('demo.docx')
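One caveat with searching run by run: a date is only found if it lies entirely inside one run, and Word does not guarantee that. The sketch below uses plain strings in place of run.text to show the difference between searching each run and searching the joined text. The function names and the sample run texts are made up for illustration, and it assumes the paragraph text is simply the run texts concatenated.

```python
import re

DATE_PATTERN = re.compile(r"[0-9]{2}-[0-9]{2}-(?!0000)[0-9]{4}")

def runs_with_own_date(run_texts):
    """Indices of runs whose own text contains a complete date."""
    return [i for i, text in enumerate(run_texts) if DATE_PATTERN.search(text)]

def dates_in_joined_text(run_texts):
    """Dates found once the run texts are joined into one string."""
    return DATE_PATTERN.findall("".join(run_texts))

# Hypothetical run texts: the second date is split across two runs
run_texts = ["Due ", "05-03-2018", " and 06-", "04-2018."]
print(runs_with_own_date(run_texts))    # [1]
print(dates_in_joined_text(run_texts))  # ['05-03-2018', '06-04-2018']
```

If the two results disagree for a paragraph, the per-run loop will miss at least one date there.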
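Update: Maybe this can help you. It searches the paragraph text for the pattern. If the pattern is found, it saves the runs to a list, clears the paragraph text, and then rebuilds the runs. A run that contains a match is rebuilt as three runs (the text before the date, the date itself, and the text after it) so that only the date is highlighted.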
import re
from docx import Document
from docx.enum.text import WD_COLOR_INDEX

# Build a sample document in which the date shares a run with other text
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold text. ').bold = True
p.add_run('Current date: 05-03-2018 ')
p.add_run('italic.').italic = True
document.save('demo.docx')

doc = Document('demo.docx')
pattern = r"[0-9]{2}-[0-9]{2}-(?!0000)[0-9]{4}"
for p in doc.paragraphs:
    if re.search(pattern, p.text):
        # Keep references to the old runs, then empty the paragraph
        runs = list(p.runs)
        p.text = ''
        for run in runs:
            match = re.search(pattern, run.text)
            if not match:
                # No date in this run: copy its text and bold/italic settings
                newrun = p.add_run(run.text)
                if run.bold:
                    newrun.bold = True
                if run.italic:
                    newrun.italic = True
            else:
                # Split into before / date / after and highlight the date only.
                # Only the first date in each run is handled here.
                start, end = match.span()
                p.add_run(run.text[:start])
                colored = p.add_run(run.text[start:end])
                colored.font.highlight_color = WD_COLOR_INDEX.YELLOW
                p.add_run(run.text[end:])
doc.save('demo.docx')
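If a single run can hold more than one date, the before / date / after split generalizes to a list of segments. Here is a minimal sketch of that step on plain strings, using re.finditer. The name split_on_dates is my own and not part of python-docx.

```python
import re

DATE_PATTERN = re.compile(r"[0-9]{2}-[0-9]{2}-(?!0000)[0-9]{4}")

def split_on_dates(text):
    """Split text into ordered (segment, is_date) pairs, skipping empty segments."""
    segments = []
    pos = 0
    for match in DATE_PATTERN.finditer(text):
        start, end = match.span()
        if start > pos:
            # Plain text between the previous date (or the start) and this date
            segments.append((text[pos:start], False))
        segments.append((text[start:end], True))
        pos = end
    if pos < len(text):
        # Trailing text after the last date
        segments.append((text[pos:], False))
    return segments

print(split_on_dates("From 05-03-2018 to 06-04-2018 inclusive"))
# [('From ', False), ('05-03-2018', True), (' to ', False), ('06-04-2018', True), (' inclusive', False)]
```

In the rebuild loop, each pair could become one p.add_run(segment) call, with highlight_color set only when is_date is true. Copying the original run's bold and italic settings onto every new run would keep the formatting of runs that contain a date.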