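I am new to Python regular expressions. I want to split a string on a set of delimiters, but the delimiters themselves should also be returned as elements. For example: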
Input - "The man.in the. high castle."
Output - ["The", "man", ".", "in", "the", ".", "high", "castle", "."]
or
Input - "The man-in the. high castle!"
Output - ["The", "man", "-", "in", "the", ".", "high", "castle", "!"]
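Note that I am not only trying to split the sentence on whitespace and delimiters, I also want the delimiters returned. This is somewhat like nltk.word_tokenize, but not the same, because word_tokenize does not split tokens such as "man.in".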
My set of delimiters is ".", "?", "!", "," and whitespace. Thanks in advance.
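
To make the desired behaviour concrete, here is a minimal sketch using `re.findall` from the standard library. The function name `split_keep_delimiters` is hypothetical, and including "-" in the delimiter set is an assumption based on the second example, since it is not in the list above. Whitespace separates tokens but is not returned, matching the outputs shown.

```python
import re

# Assumed punctuation delimiters: . ? ! , plus "-" (taken from the second example).
# Either match a run of non-delimiter, non-whitespace characters (a word),
# or match a single punctuation delimiter as its own token.
TOKEN_PATTERN = re.compile(r"[^\s.?!,\-]+|[.?!,\-]")

def split_keep_delimiters(text):
    """Return words and punctuation delimiters in order; whitespace is dropped."""
    return TOKEN_PATTERN.findall(text)

print(split_keep_delimiters("The man.in the. high castle."))
print(split_keep_delimiters("The man-in the. high castle!"))
```

Consecutive punctuation marks each become their own element with this pattern; whether that is the right behaviour for something like "..." is left open.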