Question

我有以下字符串：

It reported the proportion of the edits made from America was 51% for the Wikipedia, and 25% for the simple Wikipedia.[142] The Wikimedia Foundation hopes to increase the number in the Global South to 37% by 2015.[143]

我正在尝试将.[xxx]之类的所有字符替换为.[xxx] \n；

x是数字

我正在从不同的茎溢出答案中寻求帮助；其中之一是：

Python insert a line break in a string after character "X"

Regex: match fullstop and one word in python

import re
str = "It reported the proportion of the edits made from America was 51% 
for the Wikipedia, and 25% for the simple Wikipedia.[142] The Wikimedia 
Foundation hopes to increase the number in the Global South to 37% by 
2015.[143] "
x = re.sub("\.\[[0-9]{2,5}\]\s", "\.\[[0-9]{2,5}\]\s\n",str)
print(x)

我期望以下输出：

It reported the proportion of the edits made from America was 51% for the Wikipedia, and 25% for the simple Wikipedia.[142]                          
The Wikimedia Foundation hopes to increase the number in the Global South to 37% by 2015.[143]”

但是我得到了：

It reported the proportion of the edits made from America was 51% for the Wikipedia, and 25% for the simple Wikipedia\\.\[[0-9]{2,5}\]\s   The Wikimedia Foundation hopes to increase the number in the Global South to 37% by 2015\\.\[[0-9]{2,5}\]\s

Answer 1

您可能想在re.sub中使用捕获组和反向引用。您也不需要转义替换字符串（ regex101 ）：

import re
s = '''It reported the proportion of the edits made from America was 51% for the Wikipedia, and 25% for the simple Wikipedia.[142] The Wikimedia Foundation hopes to increase the number in the Global South to 37% by 2015.[143] '''
x = re.sub(r'\.\[([0-9]{2,5})\]\s', r'.[\1] \n', s)
print(x)

打印：

It reported the proportion of the edits made from America was 51% for the Wikipedia, and 25% for the simple Wikipedia.[142] 
The Wikimedia Foundation hopes to increase the number in the Global South to 37% by 2015.[143]

Answer 2

您可以使用

(\.\[[^][]*\])\s*

并将其替换为\1\n，请参见a demo on regex101.com。

这是

(
    \.\[   # ".[" literally
    [^][]* # neither "[" nor "]" 0+ times
    \]     # "]" literally
)\s*       # consume whitespaces, eventually

Answer 3

使用findall（）识别匹配模式的列表。然后，您可以将其替换为原始字符串+'\ n'

如何在python的字符串中的每个字符后添加换行符，例如“。[xxx]”

3 个答案: