Given a string:
x = 'foo test1 test1 foo test2 foo'
I want to partition the string by foo so that I get the following:
['foo', 'test1 test1 foo', 'test2 foo'] (preferred)
or
[['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo']] (not preferred, but workable)
I tried itertools.groupby:
In [1209]: [list(v) for _, v in itertools.groupby(x.split(), lambda k: k != 'foo')]
Out[1209]: [['foo'], ['test1', 'test1'], ['foo'], ['test2'], ['foo']]
But it doesn't quite give me what I'm looking for. I know I can use a loop and do this:
In [1210]: l = [[]]
      ...: for v in x.split():
      ...:     l[-1].append(v)
      ...:     if v == 'foo':
      ...:         l.append([])
      ...:
In [1211]: l
Out[1211]: [['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo'], []]
But leaving an empty list at the end isn't efficient. Is there a simpler way?
I want to keep the delimiter.
Answer 0 (score: 3)
Perhaps not the prettiest approach, but short and clear:
[part + 'foo' for part in x.split('foo')][:-1]
Output:
['foo', ' test1 test1 foo', ' test2 foo']
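The pieces above keep the leading space left over from the original string. A minimal sketch of the same split-and-reattach idea that strips it, assuming the string ends with the delimiter (otherwise the slice discards the trailing remainder); the name split_keep is only for illustration:

```python
def split_keep(text, sep="foo"):
    # Split on sep and drop the final piece (empty when text ends with sep),
    # then re-attach sep to each piece and strip surrounding whitespace.
    return [(part + sep).strip() for part in text.split(sep)[:-1]]

print(split_keep('foo test1 test1 foo test2 foo'))
# ['foo', 'test1 test1 foo', 'test2 foo']
```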
Answer 1 (score: 3)
You can use str.partition in your case:
def find_foo(x):
    result = []
    while x:
        before, _, x = x.partition("foo")
        result.append(before + "foo")
    return result
>>> find_foo('foo test1 test1 foo test2 foo')
['foo', ' test1 test1 foo', ' test2 foo']
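Note that the loop above always appends "foo", so for an input that does not end with the delimiter, such as 'foo bar', the last piece comes out as ' barfoo'. A minimal sketch that re-attaches whatever separator str.partition actually found and strips the whitespace; the name partition_keep is hypothetical:

```python
def partition_keep(text, sep="foo"):
    result = []
    while text:
        # found is sep when a match exists, otherwise the empty string
        before, found, text = text.partition(sep)
        piece = (before + found).strip()
        if piece:  # skip pieces that were only whitespace
            result.append(piece)
    return result

print(partition_keep('foo test1 test1 foo test2 foo'))
# ['foo', 'test1 test1 foo', 'test2 foo']
print(partition_keep('foo bar'))
# ['foo', 'bar']
```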
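Answer 2 (score: 1)
Have you considered scanning the string and passing a start position to the search? That is usually faster than chopping the string up as you go. This might work for you: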
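x = 'foo test1 test1 foo test2 foo'

def findall(target, s):
    lt = len(target)
    ls = len(s)
    pos = 0
    result = []
    while pos < ls:
        found = s.find(target, pos)
        # find returns -1 when there is no further match:
        # keep the remainder and stop instead of looping forever.
        fpos = found + lt if found != -1 else ls
        result.append(s[pos:fpos])
        pos = fpos
    return result

print(findall("foo", x))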
Answer 3 (score: 1)
You can use a positive lookbehind, (?<=...), in a regular expression, like this:
In [514]: import re
In [515]: string = 'foo test1 test1 foo test2 foo'
In [516]: re.split(r'(?<=foo)\s', string)
Out[516]: ['foo', 'test1 test1 foo', 'test2 foo']
and
In [517]: [x.split() for x in re.split(r'(?<=foo)\s', string)]
Out[517]: [['foo'], ['test1', 'test1', 'foo'], ['test2', 'foo']]
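One caveat: the lookbehind also matches "foo" at the end of a longer word, so 'barfoo x' would be split too. A rough sketch that only splits after the whole word, assuming the delimiter is a standalone word separated by whitespace; regex_split_keep is a made-up name, and re.escape is used so the word is treated literally:

```python
import re

def regex_split_keep(text, word="foo"):
    # \b before the word rejects matches like "barfoo";
    # \s+ consumes the whitespace that follows the delimiter.
    pattern = r"(?<=\b" + re.escape(word) + r")\s+"
    return re.split(pattern, text)

print(regex_split_keep('foo test1 test1 foo test2 foo'))
# ['foo', 'test1 test1 foo', 'test2 foo']
```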
Answer 4 (score: 0)
Try this:
x = 'foo test1 test1 foo test2 foo'
word = 'foo'
out = []
while word in x:
    pos = x.index(word)
    l = len(word)
    # Keep everything up to and including the delimiter, then continue with the rest.
    out.append(x[:pos + l])
    x = x[pos + l:]
print(out)
Output:
['foo', ' test1 test1 foo', ' test2 foo']