Question

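Is there a simple way in Python to strip a string and also get the start and end indices of the stripped text?

Example: given the string '  hello world!   ', I want the stripped string 'hello world!' as well as the start index 2 and the end index 14.

'  hello world!   '.strip() returns only the stripped string.

I could write a function: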

def strip(s):
    '''
    Take a string as input.
    Return the stripped string as well as the start index and end index.
    Example: '  hello world!   '  --> ('hello world!', 2, 14)
    The function isn't computationally efficient as it does more than one pass on the string.
    '''
    # The parameter is named s so that it does not shadow the built-in str type.
    str_stripped = s.strip()
    index_start = s.find(str_stripped)
    index_end = index_start + len(str_stripped)
    return str_stripped, index_start, index_end

def main():
    s = '  hello world!   '  # named s to avoid shadowing the built-in str
    str_stripped, index_start, index_end = strip(s)
    print('index_start: {0}\tindex_end: {1}'.format(index_start, index_end))

if __name__ == "__main__":
    main()

But I am wondering whether Python or a popular library provides a built-in way to do this.

Answer 1

One option (probably not the most straightforward one) is to use a regular expression:

>>> import re
>>> s = '  hello world!   '
>>> match = re.search(r"^\s*(\S.*?)\s*$", s)
>>> match.group(1), match.start(1), match.end(1)
('hello world!', 2, 14)

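Where, in the pattern ^\s*(\S.*?)\s*$:

^ is the start of the string
\s* is zero or more whitespace characters
(\S.*?) is a capturing group that matches one non-whitespace character followed by any characters, any number of times, in a non-greedy fashion
$ is the end of the string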
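
As a rough sketch, the pattern above could be wrapped in a helper. The name strip_with_span_re, the re.DOTALL flag (so that . also matches newlines), and the ('', 0, 0) result for input with no non-whitespace characters are assumptions made for illustration, not part of the answer. Without such a guard, re.search returns None for that input and match.group(1) raises AttributeError.

```python
import re

# Pattern from the answer; re.DOTALL is an added assumption so that
# multi-line input is handled as well.
_STRIP_RE = re.compile(r"^\s*(\S.*?)\s*$", re.DOTALL)


def strip_with_span_re(s):
    """Return (stripped, start, end) using the regex approach (hypothetical helper)."""
    match = _STRIP_RE.search(s)
    if match is None:
        # No non-whitespace character at all; the fallback value is a chosen convention.
        return "", 0, 0
    return match.group(1), match.start(1), match.end(1)


print(strip_with_span_re("  hello world!   "))
```

The fallback mirrors what the find-based function in the question returns for an all-whitespace string; a different convention may suit other callers better.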

Answer 2

The most efficient way is to call lstrip and rstrip separately. For example:

s = '  hello world!   '
s2 = s.lstrip()
s3 = s2.rstrip()
ix = len(s) - len(s2)
ix2 = len(s3) + ix

This gives:

>>> s3
'hello world!'
>>> ix
2
>>> ix2
14
>>>
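
The same steps could be packaged as a small function. This is only a sketch: the name strip_with_span and its return order are assumptions chosen to match the function in the question.

```python
def strip_with_span(s):
    """Return (stripped, start, end) such that s[start:end] == stripped (hypothetical helper)."""
    left_stripped = s.lstrip()
    # Characters removed from the left give the start index.
    start = len(s) - len(left_stripped)
    stripped = left_stripped.rstrip()
    end = start + len(stripped)
    return stripped, start, end


text = "  hello world!   "
stripped, start, end = strip_with_span(text)
print(stripped, start, end)
```

For an all-whitespace string this variant reports start and end both equal to len(s), whereas the find-based function in the question reports 0 for both; in either case s[start:end] is the empty string.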

Answer 3

In fact, you already have the methods needed to accomplish this task: strip, find, and len are all you need.
