Read a file and, after each period, move the next sentence to a new line

Time: 2015-04-20 10:21:48

Tags: ruby

Input file:

  

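Ruby is a dynamic, reflective, object-oriented, general-purpose programming language. It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan. According to its authors, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.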

Output file:

  

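Ruby is a dynamic, reflective, object-oriented, general-purpose programming language.
It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan.
According to its authors, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.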

1 answer:

Answer 0: (score: 0)

You can use the Stanford Natural Language Parser:

require "stanfordparser"

input = 'Ruby is a dynamic, reflective, object-oriented, general-purpose programming language. It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan. According to its authors, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.'
preproc = StanfordParser::DocumentPreprocessor.new
puts preproc.getSentencesFromString(input)
# Ruby is a dynamic, reflective, object-oriented, general-purpose programming language. 
# It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan. 
# According to its authors, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.

Or use a regular expression:

# The outer group is non-capturing (?:...): String#split also returns whatever capturing
# groups match, which here would be empty strings that puts prints as blank lines.
puts input.split(/(?:(?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]"))\s+(?="?[A-Z])/)
# Ruby is a dynamic, reflective, object-oriented, general-purpose programming language.
# It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan.
# According to its authors, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.

Note: I am assuming there is whitespace between `.` and `According` in the input string.
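Since the question asks about files rather than a string literal, here is a minimal sketch of how the regular-expression approach might be wired up to read one file and write another. The method names `split_sentences` and `reflow_file` and the file names in the usage note are hypothetical, chosen only for illustration. The sketch also assumes that line breaks inside the input file are just wrapping and can be collapsed to single spaces before splitting.

```ruby
# Whitespace that follows sentence-ending punctuation (optionally followed by
# a closing double quote) and precedes a capital letter. The outer group is
# non-capturing so that String#split returns only the sentences.
SENTENCE_BOUNDARY = /(?:(?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]"))\s+(?="?[A-Z])/

# Hypothetical helper: returns an array with one sentence per element.
def split_sentences(text)
  # Assumption: existing line breaks are only wrapping, so collapse every
  # run of whitespace to a single space before looking for boundaries.
  normalized = text.gsub(/\s+/, " ").strip
  normalized.split(SENTENCE_BOUNDARY)
end

# Hypothetical helper: reads in_path and writes one sentence per line to out_path.
def reflow_file(in_path, out_path)
  sentences = split_sentences(File.read(in_path))
  File.write(out_path, sentences.join("\n") + "\n")
end
```

It could then be called as `reflow_file("input.txt", "output.txt")`. This heuristic has no notion of abbreviations: text such as "Mr. Smith" would be split after "Mr.", so a real sentence tokenizer is the safer choice for general prose.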