Question

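I know how to split a string on multiple delimiters using re, as in this question: Split Strings with Multiple Delimiters?. But I would like to know how to split a string using the delimiters in the order given in a list, where each delimiter splits only once.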

multiple_sep_split("hello^goo^dbye:cat@dog", ['^',':','@'])
>>> ['hello', 'goo^dbye', 'cat', 'dog']  #(note the extra caret)
multiple_sep_split("my_cat:my_dog:my:bird_my_python",[':',':','_'])
>>> ['my_cat','my_dog','my:bird','my_python']

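One approach might be to match the text between the delimiters rather than the delimiters themselves, and return those pieces as groups, but is there another way?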

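# '^' is escaped so it is a literal caret, and the groups are non-greedy
# so each delimiter matches at its first occurrence
text_re = re.compile(r'(.+?)\^(.+?):(.+?)@(.+)') # get each group from here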
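
For illustration, here is a minimal sketch of that group-based idea generalized to an arbitrary delimiter list. The name `regex_sep_split` is hypothetical, and the sketch assumes every delimiter occurs in the string in the given order; when the pattern does not match, it simply returns the whole string as a single piece.

```python
import re

def regex_sep_split(s, seps):
    # Build "(.*?)SEP1(.*?)SEP2...(.*)". re.escape keeps characters such
    # as '^' literal, and the non-greedy groups prefer the earliest
    # occurrence of each delimiter.
    pattern = "(.*?)" + "(.*?)".join(re.escape(sep) for sep in seps) + "(.*)"
    match = re.fullmatch(pattern, s, flags=re.DOTALL)
    # Assumption: an unmatched input is returned unsplit.
    return list(match.groups()) if match else [s]

print(regex_sep_split("hello^goo^dbye:cat@dog", ['^', ':', '@']))
# ['hello', 'goo^dbye', 'cat', 'dog']
```

Because the regex engine backtracks, this can pick a later occurrence of a delimiter if that is the only way to match the remaining ones, so it is not guaranteed to behave identically to a left-to-right scan on every input.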

Answer 1

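If I understand what you're asking, you just need a series of string partition operations: partition on the first delimiter, then on the second, and so on.

Here is a recursive approach (without re):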

def splits(s,seps):
    l,_,r = s.partition(seps[0])
    if len(seps) == 1:
        return [l,r]
    return [l] + splits(r,seps[1:])

Demo:

a = 'hello^goo^dbye:cat@dog'

splits(a,['^',':','@'])
Out[7]: ['hello', 'goo^dbye', 'cat', 'dog']
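
The same chain of partition calls can also be written as a loop, which avoids recursion and accepts an empty delimiter list. This is only a sketch; the name `splits_iter` is hypothetical and not part of the answer above.

```python
def splits_iter(s, seps):
    parts = []
    rest = s
    for sep in seps:
        # partition splits at the first occurrence of sep only; if sep is
        # missing, left is the whole remainder and rest becomes ''.
        left, _, rest = rest.partition(sep)
        parts.append(left)
    # Whatever follows the last delimiter is the final piece.
    parts.append(rest)
    return parts

print(splits_iter("my_cat:my_dog:my:bird_my_python", [':', ':', '_']))
# ['my_cat', 'my_dog', 'my:bird', 'my_python']
```

Since `str.partition` consumes the whole separator, multi-character delimiters work without extra index arithmetic.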

Answer 2

I believe your question is severely underspecified, but this at least gives the results you want for the examples you gave:

def split_at_most_once_each_and_in_order(s, seps):
    result = []
    start = 0
    for sep in seps:
        i = s.find(sep, start)
        if i >= 0:
            result.append(s[start: i])
            start = i + len(sep)  # skip the whole separator, not just one character
    if start < len(s):
        result.append(s[start:])
    return result

print(split_at_most_once_each_and_in_order(
    "hello^goo^dbye:cat@dog", "^:@"))

This prints ['hello', 'goo^dbye', 'cat', 'dog']. If you absolutely want something "clever", keep looking ;-)
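
One place where the question is underspecified is what should happen when a delimiter does not occur in the string. As a sketch of one possible choice, the hypothetical `split_skipping_missing` below skips absent delimiters, much like the `find`-based loop above, but always keeps the trailing remainder even when it is empty.

```python
def split_skipping_missing(s, seps):
    parts = []
    rest = s
    for sep in seps:
        head, found, tail = rest.partition(sep)
        # 'found' is the empty string when sep is absent; in that case
        # (an assumption of this sketch) the delimiter is ignored.
        if found:
            parts.append(head)
            rest = tail
    parts.append(rest)
    return parts

print(split_skipping_missing("a:b@c", ['^', ':', '@']))
# ['a', 'b', 'c']
```

Other policies, such as raising an error or emitting an empty piece for each missing delimiter, are equally defensible; which one is right depends on the data.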

Is it possible to split a string on multiple delimiters, in order?
