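I want to split a string whenever certain characters occur (for example `.`, `,`, `!`). I wrote a split function, and it does split the string, but it also removes those characters. When I call the function on: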
text = 'The first line leads off, With a gap before the next. Then the poem ends.'
I get:
['The first line leads off', ' With a gap before the next', ' Then the poem ends']
What do I need to change so that the characters are not removed? I would like to get this instead:
['The first line leads off,', ' With a gap before the next.', ' Then the poem ends.']
def split_on_separators(original, separators):
    word_list = [original]
    new_list = []
    for given in separators:
        for word in word_list:
            new_list.extend(word.split(given))
        word_list = new_list
        new_list = list()
    return word_list
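Thanks.
Answer 0 (score: 1)
Alternatively, you can skip writing your own function and use re.split and zip. When the pattern contains a capturing group, re.split keeps each separator in the result list as the element following the text it terminated. The text chunks and the separators can then be joined back together by slicing the list with two different start offsets (step 2) and zipping the two slices.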
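import re
mypoem = 'The first line leads off, With a gap before the next. Then the poem ends.'
# Raw string so that \. reaches the regex engine unchanged.
# junk alternates text and separator: [text, sep, text, sep, ..., text]
junk = re.split(r"(,|\.)", mypoem)
poem_split = [i1 + i2 for i1, i2 in zip(junk[0::2], junk[1::2])]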
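One thing to watch with the zip approach: zip stops at the shorter slice, so any text after the last separator is silently dropped when the string does not end with a separator. Here is a minimal sketch of one way around that, assuming Python 3; the function name `split_keep_separators` is made up for illustration, and the pattern is assumed to contain exactly one capturing group.

```python
import re
from itertools import zip_longest

def split_keep_separators(text, pattern=r"(,|\.)"):
    # With one capturing group, re.split returns
    # [chunk, sep, chunk, sep, ..., chunk].
    parts = re.split(pattern, text)
    # zip_longest pads the missing final separator with "" so a
    # trailing chunk without a separator is kept instead of dropped.
    pieces = [chunk + sep
              for chunk, sep in zip_longest(parts[0::2], parts[1::2], fillvalue="")]
    # Drop the empty piece produced when the text ends with a separator.
    return [piece for piece in pieces if piece]

mypoem = 'The first line leads off, With a gap before the next. Then the poem ends.'
print(split_keep_separators(mypoem))
print(split_keep_separators('no separator at the end, see'))
```

The second call keeps ' see', which the plain zip version would lose.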
Answer 1 (score: 0)
def splitOnChars(text, chars):
    answer = []
    start = 0
    for i, char in enumerate(text):
        if char in chars:
            # Slice up to and including the separator so it is kept.
            answer.append(text[start:i+1])
            start = i + 1
    # Keep whatever follows the last separator, if anything.
    if start < len(text):
        answer.append(text[start:])
    return answer
Output:
In [41]: text = 'The first line leads off, With a gap before the next. Then the poem ends.'
In [42]: chars = ',.!'
In [43]: splitOnChars(text, chars)
Out[43]:
['The first line leads off,',
 ' With a gap before the next.',
 ' Then the poem ends.']
Answer 2 (score: 0)
Just use a regular expression:
import re
text = 'The first line leads off, With a gap before the next. Then the poem ends.'
# A run of non-separator characters followed by an optional separator.
print(re.findall(r'[^,.!]+[,.!]?', text))
# ['The first line leads off,', ' With a gap before the next.', ' Then the poem ends.']
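If the separators need to be passed in, as in the question's `split_on_separators(original, separators)` signature, the character class can be built at run time. This is only a sketch, assuming Python 3 and that `separators` is a string of single characters; re.escape is used so that characters such as `.`, `]` or `-` are treated literally inside the class.

```python
import re

def split_on_separators(original, separators):
    # An empty class "[]" is not a valid regex, so handle that case first.
    if not separators:
        return [original]
    cls = re.escape(separators)
    # Either: non-separators followed by one separator (kept in the match),
    # or: a trailing run of non-separators with no separator after it.
    pattern = "[^{0}]*[{0}]|[^{0}]+".format(cls)
    return re.findall(pattern, original)

text = 'The first line leads off, With a gap before the next. Then the poem ends.'
print(split_on_separators(text, ',.!'))
```

Because the pattern has no capturing groups, re.findall returns the whole matches, so each piece comes back with its separator still attached.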