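I have a text file containing many paragraphs, and I want to split it into sentences, i.e. split after every period "." or "?", and wrap each sentence in double quotes, like this:
This is a sentence. This is an excited sentence! And do you think this is a question? so what then.
"This is a sentence."
"This is an excited sentence!"
"And do you think this is a question?"
"so what then."
and save all the sentences in a text file.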
def splitParagraphIntoSentences(paragraph):
    import re
    sentenceEnders = re.compile('[.!?]')
    sentenceList = sentenceEnders.split(paragraph)
    return sentenceList
if __name__ == '__main__':
    p = """This is a sentence. This is an excited sentence! And do you think this is a question? so what to do then because many people will say this ok. and then what ?"""
    sentences = splitParagraphIntoSentences(p)
    for s in sentences:
        sentence=(s.strip())
        file = open("another.txt", "w")
        file.write(sentence)
        file.close()
It doesn't work, and I don't know how to wrap each sentence in double quotes. Any help?
Answer 0 (score: 1)
If I have understood your requirements correctly, try changing your code to the following:
import re

def splitParagraphIntoSentences(paragraph):
    ''' break a paragraph into sentences
        and return a list '''
    sentenceEnders = re.compile('[.!?]')
    sentenceList = sentenceEnders.split(paragraph)
    return sentenceList

if __name__ == '__main__':
    p = "This is a sentence. This is an excited sentence! And do you think this is a question? so what to do then because many people will say this ok. and then what ?"
    sentences = splitParagraphIntoSentences(p)
    # Open the file once, outside the loop, so earlier sentences are not overwritten
    with open('another.txt', "w") as file:
        for s in sentences:
            if s.strip():  # Skip the empty pieces left over by the split
                file.write('"' + s.strip() + '"\n')  # Add a newline after each sentence
In your case you will need to read the file first instead of using p, since yours is (I guess) just a simplification.
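Note that splitting on '[.!?]' throws away the punctuation, while the expected output in the question keeps it. Below is a minimal sketch of one way to read the text from a file and keep the closing punctuation. The names `split_keep_punctuation`, `quote_sentences` and `SENTENCE_RE` are made up for this illustration and are not from the original code, and the pattern is deliberately naive: it will also split at abbreviations such as "e.g." and at decimals such as "3.14".

```python
import re

# Hypothetical pattern: a run of non-terminator characters followed by one or
# more of . ! ?, or a trailing run of text that has no terminator at all.
SENTENCE_RE = re.compile(r'[^.!?]+[.!?]+|[^.!?]+$')

def split_keep_punctuation(text):
    """Return the sentences in text with their closing punctuation attached."""
    sentences = []
    for match in SENTENCE_RE.finditer(text):
        # Collapse line breaks and repeated spaces inside the sentence.
        s = ' '.join(match.group().split())
        # Remove a stray space before the closing punctuation ("what ?" -> "what?").
        s = re.sub(r'\s+([.!?]+)$', r'\1', s)
        if s:
            sentences.append(s)
    return sentences

def quote_sentences(in_path, out_path):
    """Read in_path and write one double-quoted sentence per line to out_path."""
    with open(in_path, encoding="utf-8") as src:
        text = src.read()
    # Open the output once so every sentence ends up in the file.
    with open(out_path, "w", encoding="utf-8") as dst:
        for s in split_keep_punctuation(text):
            dst.write('"' + s + '"\n')
```

Calling `quote_sentences("input.txt", "another.txt")` would then do the whole job in one step, where "input.txt" is only a placeholder for your real file name.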