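I want to split a piece of text after a given number of characters, not counting spaces and paragraph breaks (newlines).
So far, I know you can do this to split a string after a total number of characters: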
cutOff = 10
splitString = oldString[0:cutOff]
But how can I do this so that whitespace is not included in the character count?
Answer 0 (score: 1)
You can use a while loop.
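oldString = "Hello world"
cutOff = 10
i = 0
# Scan up to the cut-off; each space or newline found pushes the cut-off one position further.
# The second condition stops the scan once the cut-off reaches the end of the string.
while i < cutOff and cutOff < len(oldString):
    if oldString[i] in [' ', '\n']:
        cutOff += 1
    i += 1
splitString = oldString[:cutOff]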
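The same idea can be wrapped in a function that returns both halves of the string. This is only a minimal sketch: the name split_ignoring_whitespace is hypothetical, and it assumes that every whitespace character (as reported by str.isspace(), so tabs too) should be skipped, which is broader than the space and newline check above.

```python
def split_ignoring_whitespace(text, count):
    """Split text after `count` non-whitespace characters (hypothetical helper)."""
    if count <= 0:
        return "", text
    seen = 0
    for index, char in enumerate(text):
        # Assumption: anything str.isspace() reports as whitespace is not counted.
        if not char.isspace():
            seen += 1
            if seen == count:
                # Cut right after the count-th non-whitespace character.
                return text[:index + 1], text[index + 1:]
    # The text has fewer than `count` non-whitespace characters.
    return text, ""


head, tail = split_ignoring_whitespace("Hello world", 7)
print(repr(head), repr(tail))
```

If the text has fewer non-whitespace characters than requested, the whole string comes back as the first part and the second part is empty.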
Answer 1 (score: 1)
You can use a regular expression. This returns a two-element tuple containing the input string broken at the desired position:
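import re

data = """Now is the time
for all good men
to come"""

def break_at_ignoring_whitespace(text, break_at):
    # Match break_at repetitions of "optional whitespace, then one word character";
    # re.S lets "." in the trailing group span newlines.
    # Note: \w matches only letters, digits and underscore, so the match fails
    # if punctuation appears before the break point.
    m = re.match(r"((\s*\w){%d})(.*)" % break_at, text, re.S)
    # Fall back to the whole string when no match is found.
    return (m.group(1), m.group(3)) if m else (text, '')

r = break_at_ignoring_whitespace(data, 14)
print(">>" + r[0] + "<<")
print(">>" + r[1] + "<<")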
Result:
>>Now is the time
fo<<
>>r all good men
to come<<
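One caveat: the pattern above counts word characters with \w, so text containing punctuation before the break point will not match and comes back unsplit. If punctuation should count as a character too, a possible variant is to use \S instead. This is a sketch rather than part of the original answer, and the name break_at_non_whitespace is made up for illustration.

```python
import re


def break_at_non_whitespace(text, break_at):
    """Split after `break_at` non-whitespace characters (hypothetical variant)."""
    # (?:\s*\S){n} consumes optional whitespace followed by one non-whitespace
    # character, n times; re.S lets "." in the second group span newlines.
    pattern = r"((?:\s*\S){%d})(.*)" % break_at
    m = re.match(pattern, text, re.S)
    # If the text has too few non-whitespace characters, return it unsplit.
    return (m.group(1), m.group(2)) if m else (text, "")


head, tail = break_at_non_whitespace("Hello, world!", 7)
print(">>" + head + "<<")
print(">>" + tail + "<<")
```

The inner group is non-capturing here, so the remainder of the string is group 2 rather than group 3.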