I have a string:
"""Hello. It's good to meet you.
My name is Bob."""
I'm trying to find the best way to split it into a list, breaking on periods and newlines:
["Hello", "It's good to meet you", "My name is Bob"]
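I'm fairly sure I should use a regular expression but, having no experience with them, I'm struggling to work out how.
Answer 0 (score: 20)
You don't need a regular expression.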
>>> txt = """Hello. It's good to meet you.
... My name is Bob."""
>>> txt.split('.')
['Hello', " It's good to meet you", '\nMy name is Bob', '']
>>> [x for x in map(str.strip, txt.split('.')) if x]
['Hello', "It's good to meet you", 'My name is Bob']
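The example above only splits on periods, which works here because every line happens to end with one. A minimal sketch of the same idea that also treats a bare newline as a separator, still without regex (the function name `split_sentences` is made up for illustration):

```python
def split_sentences(text):
    # Treat newlines as separators too, so a line without a
    # trailing period still ends a chunk.
    normalized = text.replace("\n", ".")
    # Strip surrounding whitespace from each chunk and drop empty ones.
    return [chunk.strip() for chunk in normalized.split(".") if chunk.strip()]


txt = """Hello. It's good to meet you.
My name is Bob."""
print(split_sentences(txt))
print(split_sentences("No period here\nSecond line."))
```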
Answer 1 (score: 2)
For your example, it is enough to split on a period, optionally followed by whitespace (and to ignore empty results):
>>> s = """Hello. It's good to meet you.
... My name is Bob."""
>>> import re
>>> re.split(r"\.\s*", s)
['Hello', "It's good to meet you", 'My name is Bob', '']
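The trailing empty string comes from the final period. One way to ignore empty results, sketched here with a list comprehension (the variable name `parts` is only illustrative):

```python
import re

s = """Hello. It's good to meet you.
My name is Bob."""

# Split on a period followed by optional whitespace,
# then keep only the non-empty pieces.
parts = [part for part in re.split(r"\.\s*", s) if part]
print(parts)
```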
In real life, you would have to deal with Mr. Orange, Dr. Greene and George W. Bush, though...
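A rough sketch of how such cases might be handled, assuming a small hand-written abbreviation list and treating any single capital letter before a period as an initial. The names `ABBREVIATIONS` and `split_sentences` are hypothetical, the list is deliberately incomplete, and a sentence that really does end in a single capital letter (for example "I got an A.") would not be split:

```python
import re

# Hypothetical, deliberately incomplete list of abbreviations.
ABBREVIATIONS = ("Mr", "Mrs", "Dr")


def split_sentences(text):
    # One fixed-width negative lookbehind per abbreviation, so the
    # period in "Mr." or "Dr." is not treated as a sentence end.
    lookbehinds = "".join(rf"(?<!\b{re.escape(a)})" for a in ABBREVIATIONS)
    # Also skip a period that follows a lone capital letter ("W.").
    pattern = re.compile(lookbehinds + r"(?<!\b[A-Z])\.\s*")
    # Drop the empty string left over after the final period.
    return [part for part in pattern.split(text) if part]


print(split_sentences("Mr. Orange met Dr. Greene. George W. Bush was there."))
```

For anything beyond toy input, a dedicated sentence tokenizer is likely a better fit than growing this pattern.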
Answer 2 (score: 1)
>>> s = """Hello. It's good to meet you.
... My name is Bob."""
>>> import re
>>> p = re.compile(r'[^\s\.][^\.\n]+')
>>> p.findall(s)
['Hello', "It's good to meet you", 'My name is Bob']
>>> s = "Hello. #It's good to meet you # .'"
>>> p.findall(s)
['Hello', "#It's good to meet you # "]
Answer 3 (score: 0)
You can use this split:
re.split(r"(?<!^)\s*[.\n]+\s*(?!$)", s)
Answer 4 (score: 0)
Mine:
re.findall(r'(?=\S)[^.\n]+(?<=\S)', su)
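To make the pattern easier to follow, here is a commented, runnable sketch. The sample value given to `su` is an assumption, chosen to show that the two lookarounds trim whitespace around each piece:

```python
import re

# `su` is the input string; this sample value is an assumption.
su = "  Hello .  It's good to meet you.\nMy name is Bob.  "

# (?=\S)   the match must start on a non-whitespace character
# [^.\n]+  take everything up to the next period or newline
# (?<=\S)  the match must end on a non-whitespace character
parts = re.findall(r"(?=\S)[^.\n]+(?<=\S)", su)
print(parts)
```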