Question

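The purpose of the script is to split a passage of text into sentences and then generate a random number to decide whether each sentence gets highlighted. When I run the code below, it cuts out all the individual sentences and pastes them at the end of the document. I want the sentences to be replaced in place, not appended at the end.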

from docx import Document
import re
from nltk import tokenize
import Funtion
import random
from docx.enum.text import WD_COLOR_INDEX

doc = Document('raw.docx')

rawdata = (Funtion.gettext('raw.docx'))

sen = tokenize.sent_tokenize(rawdata)

senlen = len(sen)

p = doc.add_paragraph()

for indsen in sen:
    rng = random.randint(1,11)
    if rng == 8:
        p.add_run(indsen).font.highlight_color = WD_COLOR_INDEX.YELLOW



doc.save('TTTCH.docx')

Answer 1

The sentences are added at the end because you call

p = doc.add_paragraph()

which appends a new paragraph to the end of the document, and then you call

p.add_run()

which adds a run to the end of that newly created paragraph.

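Instead, you need to access the paragraphs that already exist in the document rather than creating your own. As documented at https://python-docx.readthedocs.io/en/latest/api/document.html#docx.document.Document.paragraphs, you can access them like this: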

for paragraph in doc.paragraphs:
    pass  # process the paragraph in place here

I think you will want to use the information at https://python-docx.readthedocs.io/en/latest/api/text.html#paragraph-objects

In particular:

for paragraph in doc.paragraphs:
    text_being_read = paragraph.text
    # process text
    paragraph.clear()
    paragraph.text = "New stuff"
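Putting this together with the highlighting from the question, a minimal sketch could look like the following. It is only one possible approach, under some assumptions: `split_sentences`, `pick_sentences` and `highlight_in_place` are hypothetical helper names, the regex splitter is a naive stand-in for `tokenize.sent_tokenize` (it keeps the whitespace after each sentence so the text can be rebuilt exactly), and the 1-in-11 chance mirrors `random.randint(1,11) == 8` from the question.

```python
import random
import re


def split_sentences(text):
    """Naive splitter: cut after runs of . ! ? and keep the trailing whitespace.

    Stand-in for nltk's sent_tokenize; it will mis-split abbreviations
    and decimals such as "3.5".
    """
    return re.findall(r".+?(?:[.!?]+\s*|$)", text, flags=re.S)


def pick_sentences(sentences, chance, rng=random):
    """Return one True/False flag per sentence; True means 'highlight it'."""
    return [rng.random() < chance for _ in sentences]


def highlight_in_place(src_path, dst_path, chance=1 / 11):
    """Rewrite every existing paragraph as one run per sentence."""
    # Imported here so the two helpers above can be tried without python-docx.
    from docx import Document
    from docx.enum.text import WD_COLOR_INDEX

    doc = Document(src_path)
    for paragraph in doc.paragraphs:
        sentences = split_sentences(paragraph.text)
        flags = pick_sentences(sentences, chance)
        # clear() removes the runs but keeps the paragraph where it is.
        paragraph.clear()
        for sentence, flag in zip(sentences, flags):
            run = paragraph.add_run(sentence)
            if flag:
                run.font.highlight_color = WD_COLOR_INDEX.YELLOW
    doc.save(dst_path)
```

Calling `highlight_in_place('raw.docx', 'TTTCH.docx')` would then rewrite each paragraph where it stands instead of appending a new one. Note that `paragraph.clear()` also discards any run-level formatting (bold, italics and so on) that the original text had. If you switch back to `sent_tokenize`, keep in mind that it drops the whitespace between sentences, so you would need to add the spaces back yourself.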

After running it, does the program add a large block of highlighted text at the end?
