Question

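My task is to find the words that occur in every sentence.

Given a string, we want to split it into sentences and then determine which words, if any, appear in all of them.

Here is my solution: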

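# encoding: utf-8

# Read the whole input file into one string.
text = ''
File.foreach("lab2.in") do |line|
  text += line
end

# Drop newlines and commas, then split into sentences.
hash = Hash.new
text = text.gsub(/[\n,]/, '').split(/[!.?]/)

# For each word, record the numbers of the sentences it appears in.
number = 0
text.each do |sen|
  number += 1
  words = sen.split(/ /)
  words.each do |word|
    if hash[word]
      hash[word] += "#{number}"
    else
      hash[word] = "#{number}"
    end
  end
end

# Build "123...n"; a word found in every sentence matches it
# once squeeze has collapsed repeated digits.
flag = false
needle = ''
count = text.length
for i in 1..count
  needle += "#{i}"
end
hash.each do |word|
  if word[1].squeeze == needle
    puts "this word is \"#{word[0]}\""
    flag = true
  end
end
if !flag
  puts "There is no such word"
end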

How can this task be solved better? I'm interested in Ruby library methods; the simple approach, a character-by-character loop, I already know.

For example, given input such as:

lorem ipsum dolor and another lorem! sit amet lorem? and another lorem.

the output would be:

this word is "lorem"

Answer 1

You could do it like this (I have modified your example slightly):

str = "a lorem ipsum lorem dolor sit amet. a tut toje est lorem! a i tuta toje lorem?"

str.split(/[.!?]/).map(&:split).reduce(:&)
  #=> ["a", "lorem"]

Step by step, we have:

d = str.split(/[.!?]/)
  #=> ["a lorem ipsum lorem dolor sit amet",
  #    " a tut toje est lorem",
  #    " a i tuta toje lorem"] 
e = d.map(&:split)
  #=> [["a", "lorem", "ipsum", "lorem", "dolor", "sit", "amet"],
  #    ["a", "tut", "toje", "est", "lorem"],
  #    ["a", "i", "tuta", "toje", "lorem"]] 
e.reduce(:&)
  #=> ["a", "lorem"]
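
The last step relies on Array#&, which returns the set intersection of two arrays: duplicates are dropped and the order follows the left-hand operand. reduce(:&) simply folds that operator across all the sentences. A small sketch with made-up arrays (the variable names are only for illustration):

```ruby
# Array#& keeps the elements present in both arrays, without duplicates,
# in the order they appear in the left-hand array.
first  = %w[a lorem ipsum lorem]
second = %w[lorem a tut]
third  = %w[a i lorem]

pair = first & second                      # intersection of two arrays
all  = [first, second, third].reduce(:&)   # intersection of all three

p pair
p all
```

Both calls should print ["a", "lorem"], since "lorem" is listed only once even though it appears twice in the first array.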

To make it case-insensitive, change str.split... to str.downcase.split....
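
Putting the pieces together for the input from the question, here is a rough sketch rather than a definitive implementation. It assumes a word is a run of ASCII letters, digits or underscores (so commas are ignored, but "don't" would be split in two), and that empty fragments between consecutive terminators should be skipped. The method name common_words is made up for illustration.

```ruby
# Hypothetical helper: returns the words that occur in every sentence of text.
def common_words(text)
  text.downcase
      .split(/[.!?]/)               # break into sentences
      .map { |s| s.scan(/\w+/) }    # words only; drops commas and spaces
      .reject(&:empty?)             # skip blank fragments, e.g. after "!!"
      .reduce(:&) || []             # intersect; [] when there are no sentences
end

text = "lorem ipsum dolor and another lorem! sit amet lorem? and another lorem."
words = common_words(text)

if words.empty?
  puts "There is no such word"
else
  words.each { |w| puts "this word is \"#{w}\"" }
end
```

To take the input from a file as in the question, File.read("lab2.in") could supply the text argument.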
