Question

I'm trying to fix a problem in my code: after calling re.split(regex_pattern, str), the substrings in the resulting list have an extra backslash added. The problem looks like this:

In [63]: str = r'/dir/hello\/hell/dir2/hello\end'

In [64]: regex_pattern = '(hello)'

In [65]: a = re.split(regex_pattern, str)

In [66]: a
Out[66]: ['/dir/', 'hello', '\\/hell/dir2/', 'hello', '\\end']

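As you can see, Out[66] shows the list with two substrings containing '\\' instead of two substrings containing '\'. I know this has something to do with how Python interprets backslashes, but I couldn't work out why it happens.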

I also tried making my str variable a raw string, and adding extra '\' characters (up to four, '\\\\') wherever one appears in my str variable, i.e.

In [63]: str = r'/dir/hello\\/hell/dir2/hello\\end'

This still gives the same output.

I'm using Python 2.7 on Ubuntu. Sorry if this is a duplicate, but I couldn't find a question whose answer applied to my case.

Answer 1

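This has nothing to do with re.split. A \ normally introduces an escape sequence, so to write a literal \ in an ordinary (non-raw) string literal you have to double it. The interactive prompt echoes the repr of a value, which shows each literal backslash doubled in the same way.

Consider your original string: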

In [15]: s = r'/dir/hello\/hell/dir2/hello\end'

In [16]: s
Out[16]: '/dir/hello\\/hell/dir2/hello\\end'

In [17]: len(s)
Out[17]: 31

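The extra \ characters are not counted by len, because they are not part of the string. They appear only in the displayed form, to make clear that the \ is a literal backslash and not the start of some other escape sequence; \\ is itself an escape sequence that stands for a single backslash.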
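To see the difference between the displayed form and the actual contents, here is a minimal sketch using the string from the question. The names parts and backslash_counts are illustrative only and do not come from the original post.

```python
import re

s = r'/dir/hello\/hell/dir2/hello\end'
parts = re.split('(hello)', s)

# repr() gives the escaped form that the interactive prompt echoes:
# each real backslash is displayed as two characters.
print(repr(parts[2]))

# print() writes the actual characters, so only one backslash appears.
print(parts[2])

# Counting confirms that re.split added nothing: each affected piece
# contains exactly one backslash, and the pieces rejoin to the original.
backslash_counts = [p.count('\\') for p in parts]
print(backslash_counts)
print(''.join(parts) == s)
```

If the goal is just to inspect the pieces, printing each element individually instead of echoing the whole list avoids the doubled display.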
