Question

I have a sentence in a text file that I want to display in Python, with a new line starting after every full stop (period).

For example, my paragraph is:

"Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he
paid a lot for it. Did he mind? John Smith, Esq. thinks he didn't.
Nevertheless, this isn't true... Well, with a probability of .9 it
isn't."

But I want it displayed as follows:

"Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he
paid a lot for it. 
Did he mind? John Smith, Esq. thinks he didn't. 
Nevertheless, this isn't true... 
Well, with a probability of .9 it isn’t."

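This is made harder by the other periods that appear inside sentences, for example in the website address, "Dr.", "Esq.", ".9", and of course the first two dots of the ellipsis.

I don't know how to handle these other periods in the text file. Can anyone help? Thank you.

"Your task is to write a program that, given the name of a text file, is able to write its content with each sentence on a separate line." <- the task set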

Answer 1

This does the job for your example:

import re

text = "Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he "\
       "paid a lot for it. Did he mind? John Smith, Esq. thinks he didn't. "\
       "Nevertheless, this isn't true... Well, with a probability of .9 it "\
       "isn't."

# Break after a period that is followed by spaces and an uppercase letter,
# unless the period belongs to "Dr." or "Esq." (negative lookbehinds).
pat = r'(?<!Dr)(?<!Esq)\. +(?=[A-Z])'
print(re.sub(pat, '.\n', text))

Result

Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he paid a lot for it.
Did he mind? John Smith, Esq. thinks he didn't.
Nevertheless, this isn't true...
Well, with a probability of .9 it isn't.

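However, for something as complex as human writing, no regex can be guaranteed never to fail.
Note, for example, that I had to use a negative lookbehind assertion to exclude the "Dr." case (I did the same for "Esq.", although it is not actually a problem in your text, because the "thinks" that follows it does not start with an uppercase letter).
I think it is impossible to put every such case into a regex pattern in advance; some unforeseen case will always turn up one day.

Still, this code does a good part of the desired job. Not too bad, in my estimation.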
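
Since the task asks for a program that takes the name of a text file, here is a minimal sketch of how the same pattern could be wrapped for that. The function names `split_sentences` and `print_sentences` are made up for illustration, and I am assuming the file is UTF-8 and that its hard-wrapped lines should be joined before splitting:

```python
import re

# Same pattern as above: a period not preceded by "Dr" or "Esq",
# followed by one or more spaces and an uppercase letter.
SENTENCE_END = re.compile(r'(?<!Dr)(?<!Esq)\. +(?=[A-Z])')

def split_sentences(text):
    """Return text with a newline after each detected sentence-ending period."""
    # Collapse the file's own line breaks and repeated whitespace into single
    # spaces, so that a sentence ending at a wrapped line is still detected.
    flat = ' '.join(text.split())
    return SENTENCE_END.sub('.\n', flat)

def print_sentences(filename):
    """Read the file called filename (assumed UTF-8) and print the result."""
    with open(filename, encoding='utf-8') as f:
        print(split_sentences(f.read()))
```

You would then call something like `print_sentences("paragraph.txt")`, where the file name is just a placeholder. Like the original snippet, this only breaks after periods, so "Did he mind? John Smith, ..." stays on one line, as in the output you asked for.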

Answer 2

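You could add a line break if and only if the dot is followed by a space and an uppercase letter. It won't handle every case, but combined with a dictionary of exceptions such as "Dr.", you can do quite well, though not perfectly.

Update: By dictionary I mean both a Python dictionary and a word list like this one. I did not find any downloadable file containing the most common abbreviations, so I am afraid you will have to make one yourself.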
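
A rough sketch of that idea, assuming a small hand-made set of abbreviations (the entries in `ABBREVIATIONS` and the function name `split_on_periods` are only placeholders, not a complete or standard list):

```python
import re

# Hypothetical, hand-made exception list; extend it for your own text.
ABBREVIATIONS = {"Dr", "Mr", "Mrs", "Ms", "Prof", "Esq"}

# Candidate break: an optional word, a period, spaces, then an uppercase letter.
# Group 1 captures the word directly before the period ("" if there is none).
CANDIDATE = re.compile(r'(\w*)\. +(?=[A-Z])')

def split_on_periods(text, abbreviations=ABBREVIATIONS):
    """Insert a newline after each period unless it ends a known abbreviation."""
    def replace(match):
        if match.group(1) in abbreviations:
            return match.group(0)          # known abbreviation: leave untouched
        return match.group(1) + '.\n'      # otherwise end the line here
    return CANDIDATE.sub(replace, text)

print(split_on_periods("Mr. Smith met Dr. Jones. They talked."))
```

Keeping the exceptions in a set rather than inside the regex means you can grow the list without touching the pattern. It will still miss abbreviations that are not in the set.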
