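I am writing a Telegram bot to help me learn German.
Rather than translating a whole paragraph at once, I want to go through it sentence by sentence, with each sentence followed immediately by its translation, so that I can stay with the text and learn instead of scrolling up and down.
I am a newbie, mind you.
I would like to know whether something exists that splits text into sentences.
The text I want to split into sentences might look like this: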
This is a sentence.
This is another. And here one another, same line, starting with space.
this sentence starts with lowercase letter.
Here is a site you may know: google.com.
I would like to get an array containing something like this (I am writing one array element per line here):
This is a sentence.
This is another.
And here one another, same line, starting with space.
this sentence starts with lowercase letter.
Here is a site you may know: google.com.
Thanks in advance.
Answer 0: (score: 0)
Using nltk (having installed it correctly) handles this situation well, i.e.:
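# sent_tokenize needs the Punkt model data, e.g. nltk.download('punkt')
# (newer NLTK releases use 'punkt_tab' instead).
from nltk.tokenize import sent_tokenize
string = "This is a sentence. This is another. And here one another, same line, starting with space. this sentence starts with lowercase letter. Here is a site you may know: google.com."
# Split the text into a list of sentence strings.
sent_tokenize_list = sent_tokenize(string)
print(sent_tokenize_list)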
# ['This is a sentence.', 'This is another.', 'And here one another, same line, starting with space.', 'this sentence starts with lowercase letter.', 'Here is a site you may know: google.com.']
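
If installing nltk is not an option, a rough fallback is a plain regular expression, and the same list of sentences can then be paired with translations for the bot. The sketch below is only an illustration under stated assumptions: `split_sentences` and `interleave` are names made up here, and `translate` is a hypothetical callable standing in for whatever translation service the bot ends up using. The regex is a swapped-in, much simpler technique than nltk's Punkt model.

```python
import re
from typing import Callable, List

# Split after '.', '!' or '?' when followed by whitespace (spaces or newlines).
# "google.com." survives because its inner dot is not followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Naive splitter: no abbreviation handling, unlike nltk's Punkt model."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def interleave(text: str, translate: Callable[[str], str]) -> List[str]:
    """Return each sentence followed directly by its translation.

    `translate` is a hypothetical placeholder (an assumption, not a real API).
    """
    lines: List[str] = []
    for sentence in split_sentences(text):
        lines.append(sentence)
        lines.append(translate(sentence))
    return lines


sample = (
    "This is a sentence.\n"
    "This is another. And here one another, same line, starting with space.\n"
    "this sentence starts with lowercase letter.\n"
    "Here is a site you may know: google.com."
)

# Dummy translator, just to show the alternating output.
for line in interleave(sample, lambda s: "[translation of] " + s):
    print(line)
```

Note that this regex would wrongly split after abbreviations such as "Dr." or "e.g."; for real German text, `sent_tokenize` with its `language` argument is likely the safer choice.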