Question

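I want to create a script (my first) that takes a Word document, replaces the space after certain characters with a non-breaking space (\u00A0), and saves the changed document.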

tl;dr Q: Why does 'p' in the for loop evaluate to 'space' + character + 'space' instead of ending in a non-breaking space?

#Replace the space after an unwanted expression with a non-breaking space, reading from a Word document and saving to a new Word document.
import docx, re, sys

#get document name from command line
#if len(sys.argv) > 1:
    #name = ' '.join(sys.argv[1:])
#doc = docx.Document(name + '.docx')
doc = docx.Document('Kajla.docx')  #used this particular file for testing, will delete this line afterwards

#regex of unwanted expressions
regex = re.compile(r'''
(\s)                  #space
([aivkszuAIVKSZU])    #unwanted char
\s                    #space
''', re.VERBOSE)

#goes through each paragraph, replaces a match and saves it
for paragraph in range(len(doc.paragraphs)):
    #keeps the space and the unwanted character but replaces the last space
    p = regex.sub(r'\1\2'+'\u00A0', doc.paragraphs[paragraph].text)
    #doc.paragraphs[paragraph].text = p #commented out since it wasn't working
    #print(p)      #for testing, will delete

#saves document as a copy
#doc.save(name + '2.docx')
doc.save('Kajla2.docx')
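
One possible explanation, offered as a sketch rather than a confirmed diagnosis: a non-breaking space looks identical to an ordinary space when printed, so print(p) cannot show whether the substitution worked. The snippet below uses the same pattern as above on a plain string and inspects the result with repr(). The helper name add_nbsp and the sample sentence are made up for illustration and are not part of the original script.

```python
import re

NBSP = '\u00A0'

# Same pattern as in the question: whitespace, one listed character, whitespace.
regex = re.compile(r'(\s)([aivkszuAIVKSZU])\s')

def add_nbsp(text):
    """Keep the leading space and the character, replace the trailing space with NBSP."""
    return regex.sub(r'\1\2' + NBSP, text)

sample = 'Byl jsem v lese a na louce.'   # hypothetical test sentence
result = add_nbsp(sample)

print(result)          # looks unchanged, NBSP renders like a normal space
print(repr(result))    # the escaped form \xa0 makes the NBSP visible
print(NBSP in result)  # True if at least one replacement happened
```

If repr() shows \xa0 in the right places, the regex is doing its job and the remaining step is writing p back to the paragraph. In python-docx, assigning to paragraph.text replaces the paragraph's runs with a single run, so character-level formatting such as bold or italics inside that paragraph is lost. Note also that because the trailing space is consumed by each match, two listed characters in a row (for example ' a i ') would only have the first one handled.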

Replacing with \u00A0 using .sub() in a Word document

0 answers: