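python-docx: how to apply space_after between paragraphs of the same style

Time: 2019-07-12 14:16:53

Tags: python python-docx

When using python-docx, I noticed that if I apply the same style to each paragraph in sequence, and that style has space_after = Pt(12), Word does not honor the space_after setting. In the Paragraph options I noticed that "Don't add space between paragraphs of the same style" is checked. Is there any way to work around this so that the space_after setting is applied?

I have resorted to a newline at the end of the text, but I don't always want a full line break. Sometimes I may want a partial line or a specific size.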

2 answers:

Answer 0 (score: 2)

I built a document containing several paragraphs of the same style ("Normal") with various space_after values.


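Select all the paragraphs, then check "Don't add space between paragraphs of the same style".

Now the paragraphs are displayed without the extra space between them.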


Save and close the document in Word, then inspect it with docx:

>>> from docx import Document
>>> document = Document(r'c:\debug\doc1.docx')
>>> for p in document.paragraphs:
...     print(p.paragraph_format.space_after)
...
635000
None
None
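The 635000 printed above is a length in English Metric Units (EMU), the unit python-docx uses for its length values, while the w:after attribute in the paragraph XML is stored in twentieths of a point. The following is a small sketch of the conversion, with hypothetical helper names (emu_to_pt, twentieths_to_pt) that are not part of python-docx; it shows that both numbers describe the same 50 pt spacing.

```python
# python-docx reports lengths in English Metric Units (EMU);
# WordprocessingML stores w:spacing values in twentieths of a point.
EMU_PER_PT = 12700        # 914400 EMU per inch / 72 points per inch
TWENTIETHS_PER_PT = 20


def emu_to_pt(emu):
    """Convert an EMU length (as printed by python-docx) to points."""
    return emu / EMU_PER_PT


def twentieths_to_pt(value):
    """Convert a w:spacing attribute value (1/20 pt) to points."""
    return value / TWENTIETHS_PER_PT


print(emu_to_pt(635000))       # 50.0
print(twentieths_to_pt(1000))  # 50.0
```

In python-docx itself, length values also expose a .pt property, so p.paragraph_format.space_after.pt should give the same figure when space_after is not None.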

So clearly space_after is preserved, but it is not observed in the document because it is overridden by the checkbox option. That option is stored as the <w:contextualSpacing/> element of the paragraph's <w:pPr> (I noticed this by inspecting the \word\document.xml part of the docx):

<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" w:rsidR="00E00961" w:rsidRDefault="00174F18">
  <w:pPr>
    <w:spacing w:after="1000"/>
    <w:contextualSpacing/>
  </w:pPr>
  <w:bookmarkStart w:id="0" w:name="_GoBack"/>
  <w:r>
    <w:t>hello, world!</w:t>
  </w:r>
</w:p>

You can remove these elements from the underlying XML like so:

for p in document.paragraphs:
    p_element = p._element
    # xpath() returns a list; it is empty for paragraphs without <w:contextualSpacing/>
    for cspacing in p_element.xpath('w:pPr/w:contextualSpacing'):
        cspacing.getparent().remove(cspacing)

After removing these elements, open the document again and observe that the toggle has been turned off and that the paragraphs honor their space_after settings.

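One way to check the result without opening Word is to scan word/document.xml for paragraphs that still carry the element. The following is a minimal sketch using only the standard library; contextual_spacing_flags and flags_for_docx are hypothetical helper names and SAMPLE is made-up XML. It only looks at direct paragraph formatting (not at styles) and treats any <w:contextualSpacing> element as "on", without inspecting a w:val attribute.

```python
import xml.etree.ElementTree as ET
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def contextual_spacing_flags(document_xml):
    """Return one bool per <w:p>: True if its <w:pPr> has <w:contextualSpacing>."""
    root = ET.fromstring(document_xml)
    return [
        p.find("w:pPr/w:contextualSpacing", NS) is not None
        for p in root.iter("{%s}p" % W_NS)
    ]


def flags_for_docx(docx_path):
    """Read word/document.xml from a .docx archive and report the flags."""
    with zipfile.ZipFile(docx_path) as archive:
        return contextual_spacing_flags(archive.read("word/document.xml"))


# Made-up two-paragraph body, only for illustration.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    '<w:p><w:pPr><w:spacing w:after="1000"/><w:contextualSpacing/></w:pPr></w:p>'
    '<w:p><w:pPr><w:spacing w:after="240"/></w:pPr></w:p>'
    '</w:body></w:document>' % W_NS
)

print(contextual_spacing_flags(SAMPLE))  # [True, False]
```

After the removal loop and a save, flags_for_docx on the saved file would be expected to return only False values for paragraphs whose setting came from direct formatting.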

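Answer 1 (score: 1)

I tried David Zemens's answer on an ordered list that uses the "List Number" style, but "Don't add space between paragraphs of the same style" was still checked in my docx file.

Inspecting the underlying XML files, I found a w:contextualSpacing entry in the "List Number" section of word/styles.xml.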
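To find out which styles carry the setting before rewriting anything, word/styles.xml can be parsed and each style definition checked. This is a sketch with the standard library only; styles_with_contextual_spacing is a hypothetical helper name and SAMPLE_STYLES is made-up XML, far smaller than a real styles part.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def styles_with_contextual_spacing(styles_xml):
    """Return the w:name of every style whose <w:pPr> has <w:contextualSpacing>."""
    root = ET.fromstring(styles_xml)
    names = []
    for style in root.iter("{%s}style" % W_NS):
        if style.find("w:pPr/w:contextualSpacing", NS) is None:
            continue
        name = style.find("w:name", NS)
        if name is not None:
            names.append(name.get("{%s}val" % W_NS))
    return names


# Made-up styles part with two paragraph styles, only for illustration.
SAMPLE_STYLES = (
    '<w:styles xmlns:w="%s">'
    '<w:style w:type="paragraph" w:styleId="Normal"><w:name w:val="Normal"/></w:style>'
    '<w:style w:type="paragraph" w:styleId="ListNumber"><w:name w:val="List Number"/>'
    '<w:pPr><w:contextualSpacing/></w:pPr></w:style>'
    '</w:styles>' % W_NS
)

print(styles_with_contextual_spacing(SAMPLE_STYLES))  # ['List Number']
```

For a real file, the bytes returned by ZipFile(docx_path).read('word/styles.xml') can be passed in place of SAMPLE_STYLES.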

My workaround is to rewrite the docx file (a zip archive) without the contextualSpacing entry:

from io import BytesIO
import re
from zipfile import ZipFile

def remove_contextual_spacing_from_style(docx_path, style_name):
    mod_docx = BytesIO()
    style_byte = style_name.encode('utf-8')
    # load docx file as zip archive
    with ZipFile(docx_path, 'r') as old_docx, ZipFile(mod_docx, 'w') as new_docx:
        # iterate through underlying xml files
        for xml_item in old_docx.infolist():
            with old_docx.open(xml_item) as old_xml:
                content = old_xml.read()
                # search style_name section in word/styles.xml
                if xml_item.filename == 'word/styles.xml':
                    # escape the style name and let '.' span newlines
                    cspace = re.search(b'"' + re.escape(style_byte) + rb'".*?</w:pPr>', content, re.DOTALL)
                    # remove contextualSpacing entry
                    if cspace:
                        cspace = cspace.group(0)
                        wo_cspace = cspace.replace(b'<w:contextualSpacing/>', b'')
                        # plain bytes replacement; the matched XML is not a regex pattern
                        content = content.replace(cspace, wo_cspace)
                # store xml file in modified docx archive
                new_docx.writestr(xml_item, content)
    # overwrite old docx file with modified docx
    with open(docx_path, 'wb') as new_docx:
        new_docx.write(mod_docx.getbuffer())

# example usage with style "List Number"
remove_contextual_spacing_from_style('path_to.docx', 'List Number')

Sources used:

Under "Use case #3, attempt #3": https://medium.com/dev-bits/ultimate-guide-for-working-with-i-o-streams-and-zip-archives-in-python-3-6f3cf96dca50

https://techoverflow.net/2020/11/11/how-to-modify-file-inside-a-zip-file-using-python/