"wordscount" returns letters instead of words?

Time: 2014-02-05 04:57:50

Tags: ruby word-count

I have been trying to work out why wordscount returns letters instead of words, but I can't find the reason.

Example test cases:

count_words("A man, a plan, a canal -- Panama")
# => {'a' => 3, 'man' => 1, 'canal' => 1, 'panama' => 1, 'plan' => 1}

count_words "Doo bee doo bee doo"
# => {'doo' => 3, 'bee' => 2}

Here is the code:

class WordCount

  def count_words(string)
    changed = string.downcase.gsub(/[^a-zA-Z]/,"")
    words = changed.split("")
    counts = Hash.new(0)
    words.each {|x| counts [x] += 1;}
    return counts
  end

end


test = WordCount.new
a = test.count_words("A man, a plan, a canal -- Panama")
b = test.count_words "Doo bee doo bee doo"
puts a
puts b

3 answers:

Answer 0: (score: 1)

If you want to count actual words (so that, for example, '--' is not counted as a word):

class WordCount
  def count_words(string)
    words = string.scan(/\w+/).group_by(&:downcase)
    Hash[*words.flat_map { |w,a| [w,a.size] }]
  end
end

test = WordCount.new
a = test.count_words "A man, a plan, a canal -- Panama"
b = test.count_words "Doo bee doo bee doo"
puts a # => {"a"=>3, "man"=>1, "plan"=>1, "canal"=>1, "panama"=>1}
puts b # => {"doo"=>3, "bee"=>2}
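
One detail worth knowing about the pattern above: in Ruby, \w matches letters, digits, and the underscore, so scan(/\w+/) treats tokens such as numbers as words. The snippet below is only an illustrative sketch; the sample string and variable names are made up for this example and are not from the question.

```ruby
# Hypothetical sample input, chosen to contain a digit and an underscore.
sample = "route_66 has 2 lanes"

# \w+ keeps digits and underscores inside a "word".
with_w = sample.scan(/\w+/)
# with_w is ["route_66", "has", "2", "lanes"]

# A letters-only class splits on everything else.
letters_only = sample.scan(/[a-zA-Z]+/)
# letters_only is ["route", "has", "lanes"]
```

Which pattern fits depends on whether digits should count as part of a word; for the two test cases in the question both give the same result.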

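Answer 1: (score: 1)

  • gsub(/[^a-zA-Z]/,"") removes every non-letter character, including the spaces between words.
  • split("") splits the string into individual characters.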
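
The two points above can be checked step by step. This is a minimal sketch using a shortened input; the variable names are chosen only for illustration.

```ruby
input = "Doo bee doo"

# Step 1: the character class [^a-zA-Z] also matches spaces,
# so removing those characters runs the words together.
stripped = input.downcase.gsub(/[^a-zA-Z]/, "")
# stripped is "doobeedoo"

# Step 2: an empty-string separator splits into single characters.
letters = stripped.split("")
# letters is ["d", "o", "o", "b", "e", "e", "d", "o", "o"]

# Keeping whitespace (\s) in the class and splitting on whitespace
# (split with no argument) yields words instead.
words = input.downcase.gsub(/[^a-z\s]/, "").split
# words is ["doo", "bee", "doo"]
```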

Answer 2: (score: 0)

I have simplified your method, and it now counts words:

def count_words(string)
   words = string.downcase.gsub(/[^a-zA-Z\s]/,"").split( /\s+/ )
   words.reduce({}) {| h,x | h[x] ||= 0; h[x] += 1;h }
end

count_words("A man, a plan, a canal -- Panama")
# => {"a"=>3, "man"=>1, "plan"=>1, "canal"=>1, "panama"=>1}

Note: do not put a space before the square bracket [ (as in counts [x] in your code).