Question

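This is a two-part question:

Part 1

I want to collapse runs of multiple spaces into one and keep only a single paragraph break between paragraphs.

Current code: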

import re
# Read inputfile
with open('input.txt', 'r') as file :
  inputfile = file.read()

# Replace extra spaces with a single space.
#outputfile = re.sub('\s+', ' ', inputfile).strip()
outputfile = ' '.join(inputfile.split(None))

# Write outputfile
with open('output.txt', 'w') as file:
  file.write(outputfile)

Part 2:

After removing the extra spaces, I search for and replace pattern errors.

For example, ' [ ' to ' [':

Pattern1 = re.sub(' [ ', ' [', inputfile)

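This throws an error:

raise error, v # invalid expression
error: unexpected end of regular expression

This, however, works (for example, joining the words before and after a hyphen):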

Pattern1 = re.sub(' - ', '-', inputfile)

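Once the spacing issue is resolved, I have a lot of punctuation fixes like these to handle.

I don't want one pattern to look at the output of a previous pattern and shift things further.

Is there a better way to trim the spaces around punctuation to just the right amount?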

Answer 1

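For the first part, you can split the text on blocks of newlines, compress the whitespace in each line, and join the lines back together with newlines, like so: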

import re
text = "\n".join(re.sub(r"\s+", " ", line) for line in re.split("\n+", text))
print(text)
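As a minimal sketch of how that one-liner behaves on real input, the snippet below wraps it in a function. The name `collapse_whitespace` and the sample string are made up for illustration, and stripping each line's ends is an addition that the code above does not do.

```python
import re

def collapse_whitespace(text):
    # Split on runs of newlines, so several consecutive breaks become one.
    lines = re.split(r"\n+", text.strip())
    # Within each line, squeeze any run of whitespace to a single space,
    # then trim the ends (the trimming is an assumption, not in the answer).
    return "\n".join(re.sub(r"\s+", " ", line).strip() for line in lines)

sample = "First   paragraph  here.\n\n\nSecond \t paragraph."
result = collapse_whitespace(sample)
print(result)
```

To use it with the files from the question, read `input.txt` into a string, pass it through the function, and write the result to `output.txt`.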

For the second part, you need to escape [, because it is a regex metacharacter (used to define character classes), like so:

import re
text = re.sub(r"\[ ", "[", text)
text = re.sub(" ]", "]", text)
print(text)

Note that you don't need to escape ], because it has no matching [ and is therefore not special in this context.
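If you would rather not work out by hand which characters need a backslash, `re.escape` can do it for a literal string. The following is a small sketch (the sample sentence is invented) that first shows the unescaped pattern from the question failing to compile, then the escaped version working.

```python
import re

# Unescaped, "[" opens a character class that is never closed,
# so compiling the pattern raises re.error.
try:
    re.compile(" [ ")
    compiled = True
except re.error:
    compiled = False

# re.escape() backslash-escapes the metacharacters in a literal string.
pattern = re.escape(" [ ")
fixed = re.sub(pattern, " [", "see [ 1 ] and [ 2 ]")
print(compiled, fixed)
```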


Alternatively, for the second part, text = text.replace("[ ", "[").replace(" ]", "]") works, because you don't even need regex here.
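On the concern about one pattern acting on the output of a previous one: a possible approach, not part of the answer above, is to combine all the literal fixes into a single alternation and replace them in one pass. The table `FIXES` and the function name `fix_punctuation` are hypothetical, so adjust the entries to your own punctuation rules.

```python
import re

# Hypothetical fix-up table: literal text to find -> replacement.
FIXES = {
    "[ ": "[",
    " ]": "]",
    " - ": "-",
    " ,": ",",
}

# One alternation of escaped literals; longer keys come first so that
# they are tried before shorter ones at the same position.
PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(FIXES, key=len, reverse=True))
)

def fix_punctuation(text):
    # Each match is replaced once and scanning resumes after the match,
    # so a replacement is never re-examined by another rule.
    return PATTERN.sub(lambda m: FIXES[m.group(0)], text)

print(fix_punctuation("a [ well - known ] case , really"))
# a [well-known] case, really
```

Because `re.sub` walks the input once from left to right, the order of the rules no longer changes the result the way a chain of separate `re.sub` or `replace` calls can.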

Spacing and pattern replacement
