Question

Consider the string s:

s = ';hello@;earth@;hello@;mars@'

I want a pattern pat such that

re.split(pat, s)

returns

[';hello@', ';earth@', ';hello@', ';mars@']

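I want the ; and @ to stay in the resulting strings, and I know I want to split between them.

I thought I could use a lookbehind and a lookahead: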

re.split('(?<=@)(?=;)', s)

However, it raised an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-392-27c8b02c2477> in <module>()
----> 1 re.split('(?<=@)(?=;)', s)

//anaconda/envs/3.6/lib/python3.6/re.py in split(pattern, string, maxsplit, flags)
    210     and the remainder of the string is returned as the final element
    211     of the list."""
--> 212     return _compile(pattern, flags).split(string, maxsplit)
    213 
    214 def findall(pattern, string, flags=0):

ValueError: split() requires a non-empty pattern match.

Answer 1

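The error message is quite eloquent: re.split() requires a non-empty pattern match.

Note that in the Python version shown in the traceback (3.6), split never splits a string on an empty pattern match. Python 3.7 and later do accept patterns that can match an empty string, so the lookaround pattern works there.

On older versions, you can match the pieces instead: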

re.findall(r';\w+@', s)

or

re.findall(r';[^@]+@', s)

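See the regex demo.

re.findall finds all non-overlapping occurrences of the pattern.

The ;[^@]+@ pattern matches a ;, then one or more characters other than @, then an @, so both ; and @ are included in the returned items.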
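
As a quick check of the two patterns above, here is a minimal sketch. The second input string t is a hypothetical example, not from the question, added only to show where the two patterns differ.

```python
import re

s = ';hello@;earth@;hello@;mars@'

# \w+ only allows word characters between ';' and '@'
by_word = re.findall(r';\w+@', s)

# [^@]+ allows any characters except '@' between the delimiters
by_negated = re.findall(r';[^@]+@', s)

print(by_word)     # [';hello@', ';earth@', ';hello@', ';mars@']
print(by_negated)  # [';hello@', ';earth@', ';hello@', ';mars@']

# Hypothetical input containing a space inside one item
t = ';hello world@;mars@'
with_space_word = re.findall(r';\w+@', t)        # the space breaks \w+
with_space_negated = re.findall(r';[^@]+@', t)   # the space is accepted

print(with_space_word)     # [';mars@']
print(with_space_negated)  # [';hello world@', ';mars@']
```

If the items may contain characters other than letters, digits, and underscores, the negated character class is probably the safer choice.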

Answer 2

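In the Python version used in the question (3.6), the re module does not allow splitting on an empty match. You can use the third-party regex module with this pattern to do so: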

regex.split(r'(?V1)(?<=@)(?=;)', s)

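The (?V1) modifier switches the regex module to its version 1 behavior, in which split can split at zero-width matches.

To get the same result with the re module, you can use re.findall with this pattern: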

re.findall(r'(?:;|^)[^@]*@*', s)
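
The two standard-library approaches can be combined in one small helper. This is only a sketch: the name split_keep_delims is hypothetical, and it assumes that re.split either accepts the zero-width pattern (Python 3.7 and later) or raises the ValueError shown in the question, in which case it falls back to re.findall.

```python
import re

s = ';hello@;earth@;hello@;mars@'

def split_keep_delims(text):
    """Split between '@' and ';' while keeping both characters."""
    try:
        # Zero-width split point: preceded by '@' and followed by ';'
        parts = re.split(r'(?<=@)(?=;)', text)
    except ValueError:
        # This re version rejects empty matches; match the pieces instead
        parts = re.findall(r'(?:;|^)[^@]*@*', text)
    return parts

result = split_keep_delims(s)
print(result)  # [';hello@', ';earth@', ';hello@', ';mars@']
```

Either branch yields the same list for the string in the question, so the helper behaves the same regardless of which path runs.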

What pattern do I need to split between characters?
